Earth Science is a strongly data-driven discipline, and large data sets are often developed by researchers and government agencies. The complexity of the fundamental scientific questions being addressed requires integrative and innovative approaches employing these data sets if we are to gain understanding. Although a number of databases exist, none of them is truly complete, as error free as is practical, easily accessible, and simple to use. The ultimate goal of the Earth Science community is a fully integrated data system populated with high-quality, freely available data, as well as a robust set of software to analyze and interpret the data. This system would feature rich and deep databases and convenient access. These capabilities are needed to attack a variety of basic and applied Earth Science problems.
The development of the capability to construct, organize, and verify an Earth Science data system is a natural, and indeed essential, step for the Earth Sciences to move forward so that we can understand the Earth as a system as well as meet societal needs. Most Earth Science problems are inherently 4-D (x, y, z, t) in nature, involving the subsurface and variation with time. Thus, their solution requires data analysis that is far more complex than that provided by traditional geographic information systems (GIS). The extent, complexity, and sometimes primitive form of existing data sets and databases, as well as the need to optimize the collection of new data, dictate that only a large, cooperative, well-coordinated, and sustained effort will allow the community to attain its scientific goals. With a strong emphasis on ease of access and use, the resulting data system would be a very powerful scientific tool for revealing new relationships in space and time and would be an important resource for students, teachers, the public at large, governmental agencies, and industry.
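To make the 4-D point concrete, the following is a minimal sketch, not part of the ISC plan, of how a query over (x, y, z, t) differs from a map-only (x, y) query of the kind a traditional GIS layer supports. The `Observation` record, its field names and units, and the functions `query_map_only` and `query_4d` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical record: one measurement located in space AND time.
# Field names and units are assumptions made for this sketch only.
@dataclass(frozen=True)
class Observation:
    x: float      # longitude, degrees
    y: float      # latitude, degrees
    z: float      # depth below the surface, km (positive down)
    t: float      # age, years before present
    value: float  # the measured quantity

Range = Tuple[float, float]  # inclusive (low, high) bounds


def in_range(v: float, bounds: Range) -> bool:
    """Return True if v lies within the inclusive bounds."""
    low, high = bounds
    return low <= v <= high


def query_map_only(obs: Iterable[Observation], x: Range, y: Range) -> List[Observation]:
    """Map-view query: selects on horizontal position only."""
    return [o for o in obs if in_range(o.x, x) and in_range(o.y, y)]


def query_4d(obs: Iterable[Observation], x: Range, y: Range,
             z: Range, t: Range) -> List[Observation]:
    """4-D query: also restricts depth and time."""
    return [o for o in query_map_only(obs, x, y)
            if in_range(o.z, z) and in_range(o.t, t)]


# Invented sample values, for demonstration only.
sample = [
    Observation(-106.5, 35.1, 2.0, 1.0e6, 3.2),
    Observation(-106.4, 35.2, 15.0, 1.0e6, 4.1),   # same map area, deeper
    Observation(-106.4, 35.2, 3.0, 5.0e7, 2.7),    # same map area, older
    Observation(-100.0, 40.0, 2.5, 1.0e6, 3.0),    # different map area
]

area = dict(x=(-107.0, -106.0), y=(35.0, 36.0))
print(len(query_map_only(sample, **area)))
print(len(query_4d(sample, **area, z=(0.0, 5.0), t=(0.0, 2.0e6))))
```

The map-only query cannot tell a shallow, young sample from a deep or old one at the same location; the 4-D query can. A real system would use spatial and temporal indexing rather than a linear scan.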
Fundamental new discoveries will require the availability of databases that encompass a variety of temporal and spatial scales. Because of the need to integrate heterogeneous data sets and the tools to analyze them, the Geoinformatics program provides the focus for community participation in a national experiment to enhance and retain the pre-eminent role of the United States in Earth Sciences research worldwide. It will also be the catalyst for the creation of a global database.
The Interim Steering Committee (ISC) has both identified the procedural details for community participation and recommended the most exciting research frontiers for the near future that require construction and utilization of databases. However, the most important Earth Science problems to be attacked using this data system and software are probably not yet known, because the creative energies of people getting together to explore relationships among the data and test ideas will lead to unanticipated insights. The two recommendations are described separately, and the benefits to the entire Earth Sciences community are presented in the summary section.
Initial Organization Structure
The ISC recommends the establishment of a consortium of academic institutions through:
1. invitations, sent by mail to all Earth Sciences institutions, to participate
2. announcements in national journals and news magazines
3. notification of all Earth Science societies
In order to take the first step in this process, an initial group would
be formed and would propose to design and develop selected nodes and the
core of the first comprehensive Earth information system for research and
education covering scales from global to local, spatial to temporal. This
system will ultimately contain not only multidisciplinary data sets, but
also data manipulation, analysis, visualization, plotting tools and modeling
codes to exploit the digital data, all accessible online in real time via
the World Wide Web. It will be built to handle not only 3-D spatial variation
but also changes over time. There are countless data sets that could be
developed into nodes on this system, but the anticipated funding levels and
prudence dictate that the initial implementation be modest. The details of
this plan will be discussed at a workshop scheduled for the fall of 2000, but
the initial system will be limited to about six nodes and a central node that
provides coordination, technical support, and facilities for needs such as
backups.
The nodes could be based on type of data, topic, or region.
Two centuries of observational and analytical data are available to
construct databases. As it is unlikely that all the data
can be verified and digitally cataloged, the ISC recommends the creation
of databases utilizing a progressive growth model based on near term research
needs. A representation of our vision, which provides for full community
participation and identifies data sources and the expert working groups
responsible for formulating quality control methods and creating attributes
for all disciplinary data, is shown below. The structure of the database
will be constructed by experts in Earth Sciences who have significant
expertise in both GIS and database management techniques. Additional help
will be requested as needed from the computer science community.
(1) defining criteria for quality control within subdisciplines,
(2) locating databases available in the various subdisciplines,
(3) cataloging available software for data reduction or modeling,
(4) providing the attributes of data to be entered into the databases,
(5) promoting the utilization of geospatial data.
2. All unpublished (nonproprietary) data meeting standards of quality (i.e., data that would be published if submitted to a national journal) as defined by the expert working groups.
3. Data available from other agencies and programs
Well-crafted initial projects are critical to the success of the data system and ultimately to the formation of the consortium. A fundamental objective of the initiative must be the implementation of a visible change within the community by adding a geospatial component to the geologic culture. This initiative must be perceived as a significant contribution to the community at large. If the proposed data and information system is not regarded as an exciting and useful tool, members of the community will not expend the resources (money and time) required to access and ultimately contribute to the data system. The initiative requires several exciting, well-integrated, and easily accessible examples of data system construction to establish the infrastructure as an indispensable community utility. To achieve this goal, the initial projects must address fundamental earth processes and make possible significant contributions to scientific understanding. It is not necessary to collect new data for this to be successful; rather, the emphasis should be on mining existing data resources for the development and integration of data sets in a spatially and temporally referenced framework. In developing initial projects, this initiative must be sensitive to existing data infrastructure (IRIS, UNAVCO, NASA, USGS, NOAA/NGDC) and the anticipated needs of EarthScope.
SOME SUGGESTED INITIAL RESEARCH DATABASES AND STRUCTURAL ASPECTS FOR
GEOINFORMATICS INITIATIVE
This requires development and maintenance of a well-designed front end for the variety of programs needed to extract, interface, and model data available from the data system (e.g., the GPS community has a good model to review: Scripts data structure for access to raw information; UNAVCO working groups to make velocities available to the non-GPS community). The development of a toolbox is a vital consideration: using existing data sets to construct and verify databases is a major task and, without the needed software, virtually impossible. Thus these tools are an absolute necessity for the success of the data system. In an environment characterized by access to rapidly evolving data sets developed to address specific problems (curiosity-driven research), modification, addition of information, and reorientation of the structure to address a new motive for data set development will require an evolving system of software applications.
In the long term, the goal of this effort is to build the initial organization into a consortium overseeing a comprehensive, effective national program in support of Geoinformatics. This effort will take a number of years to mature and will require considerable thought and deliberation. The funding required is substantial and will probably require interagency cooperation.
The benefits of this program would include the continued scientific leadership of the United States in Earth Sciences, as well as the opportunity to construct a global database that would uniquely characterize our planet.