Introducing the Climate and Hydrology Web Harvester System
The LTER Network faces significant challenges in strengthening existing cross-site integrative research. Climatic and hydrological data are critical to these efforts, and LTER Network investigators and U.S.D.A. Forest Service (USFS) experimental watersheds are committed to populating and updating these basic data sets. The cross-site ClimDB and HydroDB projects facilitate synthetic research among this network of sites through production databases widely used in intersite comparisons, modeling studies, and land management-related studies.
Supplemental NSF funding, coupled with USFS Forest Health Monitoring funding, has enabled major improvements in system functionality and significant increases in site participation over the past year. The database now holds more than 5 million daily records from 33 sites, including all LTER sites and many USFS Experiment Stations. Air temperature, precipitation, and streamflow are the most consistently harvested measurement variables. LTER faces the continuing challenge of keeping these variables current while also extending this consistency to other meteorological measurements and accompanying metadata. Two significant projects directed toward meeting this challenge are described below.
Public access to the database has also been greatly improved, and public use of this resource is steadily growing. The download and graphics interface now allows complete access to the combined climate and hydrology databases and supports a flexible graphical display system for comparing all sites and variables (see the example in Figure 1). Since February 2003, visitors have graphed, downloaded, or displayed over 1300 data sets. The ClimHyDB web pages average over 400 visitors per month (http://www.fsl.orst.edu/climhy/).
USGS Data Harvesting Service for HydroDB
In January 2002, Wade Sheldon (GCE LTER) developed an automated system for harvesting streamflow data from any real-time USGS gauging station and processing them for submission to HydroDB, the LTER All-site hydrological database at Andrews LTER. The system was then generalized in collaboration with Suzanne Remillard and Don Henshaw (AND LTER) and offered as a service to the broader LTER community in June 2003.
In this system, recent provisional data are harvested on a weekly basis from one or more stations requested by each participating site. The data are converted to units compatible with HydroDB and undergo several levels of quality control analysis and flagging to identify questionable values. Values flagged as invalid (e.g., negative discharge) are removed from data sets prior to submission to HydroDB. In addition, any updates to provisional data by USGS are automatically synchronized with the database each week, and provisional values are overwritten with finalized data as soon as they are released.
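The weekly cycle described above can be illustrated with a minimal Python sketch. It is not the harvester's actual code: the function names, the flag code "Q", the spike check, and the choice of liters per second as the target unit are all assumptions made for illustration. The only facts taken from the text are that data are converted, questionable values are flagged, invalid values such as negative discharge are removed, and finalized data replace provisional data.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

# 1 cubic foot = 28.316846592 liters (exact, from 0.3048 m per foot).
LITERS_PER_CUBIC_FOOT = 28.316846592


@dataclass
class Reading:
    date: str             # ISO date of the daily value
    discharge_lps: float  # liters per second (assumed target unit)
    flag: str             # "" = passed checks, "Q" = questionable (hypothetical code)
    provisional: bool     # True until a finalized value is released


def cfs_to_lps(cfs: float) -> float:
    """Convert cubic feet per second to liters per second."""
    return cfs * LITERS_PER_CUBIC_FOOT


def process(raw: Iterable[Tuple[str, float, bool]],
            spike_factor: float = 10.0) -> List[Reading]:
    """Convert units, drop invalid values, and flag questionable ones.

    The spike check (a value more than spike_factor times the previous
    valid value) is a hypothetical stand-in for the real quality checks.
    """
    readings: List[Reading] = []
    previous: Optional[float] = None
    for date, cfs, provisional in raw:
        lps = cfs_to_lps(cfs)
        if lps < 0:
            continue  # invalid (negative discharge): removed before submission
        flag = ""
        if previous is not None and previous > 0 and lps > spike_factor * previous:
            flag = "Q"
        readings.append(Reading(date, lps, flag, provisional))
        previous = lps
    return readings


def synchronize(stored: Dict[str, Reading],
                harvested: Iterable[Reading]) -> Dict[str, Reading]:
    """Merge a new harvest into stored data, keyed by date.

    Updated provisional values replace older provisional values, and
    finalized values replace provisional ones. As an added assumption,
    a provisional value never replaces a finalized one.
    """
    merged = dict(stored)
    for reading in harvested:
        current = merged.get(reading.date)
        if current is not None and not current.provisional and reading.provisional:
            continue
        merged[reading.date] = reading
    return merged


# Week 1: three provisional values, one of them invalid and one a spike.
week1 = process([
    ("2003-06-01", 10.0, True),
    ("2003-06-02", -1.0, True),
    ("2003-06-03", 500.0, True),
])
db = synchronize({}, week1)

# Week 2: a finalized value for the first date replaces the provisional one.
week2 = process([("2003-06-01", 12.0, False)])
db = synchronize(db, week2)
```

Keying the stored data by date keeps the weekly synchronization a simple overwrite, so repeated harvests of the same station are idempotent.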
This harvesting service provides several important benefits to the LTER and broader scientific community. USGS has made great strides in providing timely access to national monitoring data via the WWW, but the vast size of this monitoring network (over 5500 streamflow stations alone) makes finding data relevant to LTER sites a significant task. In addition, the data are not provided in standard metric units, and provisional data are often not subjected to any quality checks prior to web posting. Regularly harvesting, transforming, and quality-checking data from stations near or within LTER sites, and providing access through a single web interface, greatly enhances the usability of these data and facilitates synthesis. The service also demonstrates how metadata-based data processing technology (see http://gcelter.marsci.uga.edu/lter/research/tools/usgs_harvester.htm), data format standards, and web-based communications protocols can ease the application of information technology developed at sites to network-level problems, providing a significant research benefit at almost no added cost.
Scientists at the San Diego Supercomputer Center (SDSC) and LTER information managers have been collaborating since February 2002 to develop a web services implementation of ClimDB (Network News, Fall 2002, p. 3-4).