Information managers discuss approaches to data discovery during China visit

Issue: Network News Spring 2012, Vol. 25 No. 2

The International Long Term Ecological Research (ILTER) Network, of which the US LTER is a founder, plans to establish an Information Management System (IMS) to facilitate data exchange across all 40 member networks. To address some of the multilingual issues that will arise, a workshop entitled “Semantic Approaches to Discovery of Multilingual ILTER Data” was held June 19-23 at the East China Normal University in Shanghai, China. The workshop brought together information managers from China, Israel, the UK, Korea, Taiwan, Japan, and the US, which was represented by Kristin Vanderbilt (SEV), John Porter (AND), Margaret O’Brien (SBC), and Don Henshaw (AND).

Additional informatics expertise was contributed by the Global Biodiversity Information Facility (GBIF). The workshop was hosted by the Chinese Ecosystem Research Network (CERN)/National Ecosystem Research Network of China (CNERN), whose hospitality and logistical support were greatly appreciated by participants.

Each participating country is building an Information Management System to document its data in its local language, and the workshop took a major step toward establishing a multilingual service that will query across languages. For instance, a user in Taiwan will soon be able to enter a query in Chinese and discover data in English, Korean, Spanish, and so on.

Several approaches were presented and discussed, ranging from multilingual thesauri to ontological systems. Participants concurred that the “EnvThes” thesaurus, under development by EnvEurope, would be a good starting point for a multilingual thesaurus. EnvThes contains all 627 ecology terms from the LTER’s Controlled Vocabulary, as well as 500-plus terms from other widely used biological thesauri. During the workshop, EnvThes was translated from English into Korean, Japanese, traditional Chinese, and simplified Chinese, and definitions for the EnvThes terms were added in each of these languages.

Products from the workshop include a prototype web-service-based tool that uses translated terms to query across metadata databases (Metacats) throughout the ILTER. Another expected product is a manuscript outlining the next steps for the ILTER’s Information Management System and how to integrate it with an observational ontology framework that can capture subtleties of meaning and the relationships between concepts.
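
The idea behind such a tool, expanding a search term into its translations and then searching each language's metadata, can be illustrated with a minimal Python sketch. This is not the workshop's prototype: the concept IDs, labels, catalog records, and the function names expand_term and search_all are all hypothetical, and small in-memory lists stand in for the Metacat databases.

```python
# Hypothetical mini-thesaurus: concept ID -> label in each language.
# IDs, labels, and records are invented for illustration; they are not
# taken from EnvThes or from any real metadata catalog.
THESAURUS = {
    "c001": {"en": "precipitation", "zh-Hans": "降水", "ja": "降水量", "ko": "강수"},
    "c002": {"en": "soil moisture", "zh-Hans": "土壤水分", "ja": "土壌水分", "ko": "토양 수분"},
}

# Stand-ins for per-language metadata catalogs (language code -> dataset titles).
CATALOGS = {
    "en": ["Daily precipitation at the meteorological station", "Soil moisture profiles"],
    "ja": ["降水量の長期観測データ"],
    "ko": ["강수 화학 자료"],
}


def expand_term(term, thesaurus):
    """Return the per-language labels of the first concept with a label equal to term."""
    needle = term.casefold()
    for labels in thesaurus.values():
        if any(label.casefold() == needle for label in labels.values()):
            return dict(labels)
    return {}  # term is not in the thesaurus


def search_all(term, thesaurus, catalogs):
    """Search every catalog using the translation of term in that catalog's language."""
    labels = expand_term(term, thesaurus)
    hits = []
    for lang, titles in catalogs.items():
        label = labels.get(lang)
        if label is None:
            continue  # no translation available for this catalog's language
        for title in titles:
            if label.casefold() in title.casefold():
                hits.append((lang, title))
    return hits


if __name__ == "__main__":
    # A query entered in simplified Chinese finds English, Japanese, and Korean records.
    for lang, title in search_all("降水", THESAURUS, CATALOGS):
        print(lang, title)
```

The sketch matches on exact labels and plain substrings only; a real service would also need to handle synonyms, word segmentation, and the catalogs' actual query interfaces.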

At the conclusion of the workshop, participants agreed that it had clearly demonstrated the advantages of an open data-sharing policy to member networks where such a culture is nascent. For example, the multilingual feature of the service should facilitate data sharing and pave the way for LTER scientists to mine data resources from throughout the ILTER Network. The workshop also demonstrated how shared systems and software can promote and support international collaboration.