Earth Science Data System Projects Using Semantic Web
Mirador - An Earth Science Search Tool at the GES DISC
Mirador is a search tool for data at the Goddard Earth Sciences Data and Information Services Center (GES DISC). Its hierarchical navigation interface (the Projects tab) is modeled in an ontology and built with semantic web technologies such as Jena and SPARQL. Although this could have been implemented with traditional technologies, the benefit of the semantic web approach should become apparent with the addition of a Discipline/Parameter tab in the coming months, which will in turn enable integration with the GES DISC's parameter-oriented analysis tool, Giovanni.
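As a rough illustration of how a navigation hierarchy can be derived from ontology-style statements, the sketch below stores a toy hierarchy as subject/predicate/object triples and walks it to produce an indented tree. It is not Mirador's implementation: the `hasChild` predicate, the node names, and the helper functions are all hypothetical, and a Jena-based system would pose the equivalent lookup as a SPARQL query rather than a Python list comprehension.

```python
# Toy triples (subject, predicate, object). All names are hypothetical
# placeholders, not actual GES DISC projects or datasets.
TRIPLES = [
    ("Projects", "hasChild", "ProjectA"),
    ("Projects", "hasChild", "ProjectB"),
    ("ProjectA", "hasChild", "DatasetA1"),
    ("ProjectA", "hasChild", "DatasetA2"),
    ("ProjectB", "hasChild", "DatasetB1"),
]


def children(node, triples=TRIPLES):
    """Return the sorted direct children of a node in the hierarchy."""
    return sorted(o for s, p, o in triples if s == node and p == "hasChild")


def render_tree(node, triples=TRIPLES, depth=0):
    """Return the hierarchy below `node` as a list of indented lines."""
    lines = ["  " * depth + node]
    for child in children(node, triples):
        lines.extend(render_tree(child, triples, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree("Projects")))
```

Because the hierarchy lives in the data rather than in the interface code, adding a second navigation tab would mostly mean adding new statements and a new starting node.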
Multi-sensor Data Synergy Advisor (MDSA)
The MDSA is a NASA-funded project to develop a service that advises users on the comparability of parameters from two different data products. This service will rely on an ontological comparison of many factors, such as instrument characteristics, processing history, data quality and space/time resolution.
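The factor-by-factor comparison described above might look something like the following minimal sketch. It is only an assumption about the general shape of such a comparison: the factor names, the product descriptions, and the `compare_products` function are hypothetical, and a real advisor would reason over an ontology rather than compare dictionary values for equality.

```python
# Hypothetical factor names, loosely following the factors listed above.
FACTORS = (
    "instrument",
    "processing_version",
    "quality_flag_scheme",
    "spatial_resolution_km",
    "temporal_resolution",
)


def compare_products(product_a, product_b, factors=FACTORS):
    """Report, per factor, whether two product descriptions agree."""
    report = {}
    for factor in factors:
        a, b = product_a.get(factor), product_b.get(factor)
        if a is None or b is None:
            # One description lacks this factor, so nothing can be concluded.
            report[factor] = "unknown"
        elif a == b:
            report[factor] = "match"
        else:
            report[factor] = f"differs ({a} vs {b})"
    return report


# Made-up product descriptions, for illustration only.
PRODUCT_A = {
    "instrument": "sensor_A",
    "processing_version": "v5",
    "spatial_resolution_km": 10,
    "temporal_resolution": "daily",
}
PRODUCT_B = {
    "instrument": "sensor_B",
    "processing_version": "v5",
    "quality_flag_scheme": "0-3",
    "spatial_resolution_km": 25,
    "temporal_resolution": "daily",
}

if __name__ == "__main__":
    for factor, verdict in compare_products(PRODUCT_A, PRODUCT_B).items():
        print(f"{factor}: {verdict}")
```

Reporting "unknown" separately from "differs" keeps missing metadata from being mistaken for a genuine incompatibility.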
Semantic Web for Earth and Environmental Terminology (SWEET)
SWEET is an upper-level domain ontology set for Earth System Science. It includes almost 100 ontologies spanning concepts of science, data, and applications.
Noesis
Noesis is a meta search engine and resource aggregator designed specifically for Atmospheric Science. Noesis uses ontologies to guide users in refining their search queries, which produces better search results and reduces the user’s burden of experimenting with different search strings. Noesis also serves as an educational tool, since it allows users to browse and traverse the concepts in the ontology. It gives users a single site for finding resources in the Atmospheric Science domain, including web pages, publications, datasets, educational materials and books.
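Ontology-guided query refinement of this kind can be sketched as a lookup that offers broader or narrower terms for whatever the user typed. The example below is a simplified assumption, not Noesis code: the tiny ontology, its relation names, and the `refine_query` function are invented for illustration.

```python
# A made-up fragment of a concept hierarchy. The terms and relation names
# are illustrative only and are not taken from the Noesis ontology.
ONTOLOGY = {
    "precipitation": {
        "broader": ["atmospheric phenomenon"],
        "narrower": ["rain", "snow", "hail"],
    },
    "cloud": {
        "broader": ["atmospheric phenomenon"],
        "narrower": ["cumulus", "stratus", "cirrus"],
    },
}


def refine_query(query, relation):
    """Suggest related search terms for a query via the given relation.

    Returns an empty list when the query term is not in the ontology.
    """
    entry = ONTOLOGY.get(query.strip().lower(), {})
    return list(entry.get(relation, []))


if __name__ == "__main__":
    print(refine_query("Precipitation", "narrower"))
    print(refine_query("cloud", "broader"))
```

Offering these suggestions up front spares the user from guessing which search strings the underlying resources happen to use.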
Semantically-Enabled Science Data Integration (SESDI)
Smart Assistant for Mining (SAM)
Scientific data mining is a powerful means of automated knowledge extraction from the ever-increasing volumes of science observations and model output data available. NASA’s Second Data Mining Workshop found that maturing data mining techniques show “potential for significantly expanding the scientific understanding of NASA’s Earth science data.” However, tools of this type have generally been difficult for domain scientists and students to fully exploit without a long learning period. Even data mining specialists may not be familiar with the full range of components in a mining toolkit, so potentially useful mining strategies may be overlooked.
To facilitate exploitation of these promising techniques by the increasingly IT-sophisticated NASA Earth science community, the University of Alabama in Huntsville leads a collaborative team proposing to leverage Semantic Web technologies to build a Smart Assistant for Mining (SAM) and to deploy it for use at two data centers. This project will reuse an existing toolkit of data mining web services designed specifically for the analysis of NASA data in a web-based, service-oriented architecture. It will also leverage and extend an initial ontology describing data mining services, with links to other ontologies describing the Earth science problem domain and relevant data sets. The new SAM user interface tool, which integrates semantic reasoning into a traditional workflow composer, will help users discover available data and services, compose mining workflows, and invoke those workflows to perform the desired analysis.
ACCESS-NEWS
Data Quality Screening Service (DQSS)
NASA provides a wide variety of Earth-observing satellite data products to a diverse community. These data are annotated with quality information in a variety of ways, with the result that many users struggle to understand how to properly account for quality when dealing with satellite data. To address this issue, a Data Quality Screening Service (DQSS) is being implemented for a number of datasets. The DQSS will enable users to obtain data files in which low-quality pixels have been filtered out, based either on quality criteria recommended by the science team or on the user’s particular quality criteria. The objective is to increase proper utilization of this critical quality data in science data analysis of satellite data products. Semantic Web technology is used to match a data variable with the proper quality screening algorithm. In addition, the combination of ontology and inference rules will provide to users a logically organized view of the effect of and rationale for the various schemes of quality screening.
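The screening step described above can be illustrated with a small sketch. It assumes, purely for illustration, that each pixel carries an integer quality level where higher means better, and that a lookup table stands in for the ontology that matches a data variable to its screening rule. The variable names, thresholds, and the `screen` function are hypothetical and do not reflect the actual DQSS interface.

```python
# Stand-in for the ontology lookup: maps a data variable to the quality
# variable and the recommended minimum quality level used to screen it.
# All names and thresholds here are hypothetical.
SCREENING_RULES = {
    "aerosol_optical_depth": {"quality_variable": "aod_quality", "min_quality": 2},
}


def screen(variable, values, quality, min_quality=None):
    """Mask pixels whose quality level falls below the threshold.

    Uses the recommended threshold for `variable` unless the caller
    supplies `min_quality`. Masked pixels are returned as None.
    Assumes a higher quality level means a better retrieval.
    """
    if len(values) != len(quality):
        raise ValueError("values and quality must have the same length")
    rule = SCREENING_RULES[variable]
    threshold = rule["min_quality"] if min_quality is None else min_quality
    return [v if q >= threshold else None for v, q in zip(values, quality)]


if __name__ == "__main__":
    values = [0.10, 0.20, 0.30]
    quality = [3, 1, 2]
    print(screen("aerosol_optical_depth", values, quality))
    print(screen("aerosol_optical_depth", values, quality, min_quality=3))
```

The optional `min_quality` argument mirrors the two modes mentioned above: the default applies the recommended criterion, while an explicit value lets a user impose a stricter or looser one.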
AeroStat
AeroStat is an online environment for the direct statistical intercomparison of global aerosol parameters in which the provenance and data quality can be readily accessed by scientists. It will provide convenient access to:
- Satellite and ground-based aerosol data
- Data product quality information and provenance
- Calibration/validation data
In addition, AeroStat will provide multi-sensor and model intercomparison, cross-sensor bias adjustment and data merging. All of this will be embedded in a collaborative environment for aerosol researchers to share ideas, workflows and results.
Semantic web technology will be used in several places:
- underpinning a rich faceted interface, focused around the factors that can produce biases in aerosol measurements
- describing data product quality in a consistent manner across datasets
- supporting the provenance of the data and categorizing differences between products (see the Multi-Sensor Data Synergy Advisor).
