Strategies for Building Scientific Cyberinfrastructure

Network News Spring 2005, Vol. 18 No. 1

Continuing an Ethnographic Approach begun in 2002...

Print version published as "Strategies for Building Scientific Cyberinfrastructure" in Spring 2005 Network Newsletter.

In 2004 the National Science Foundation's cross-cutting Human and Social Dynamics priority area awarded Geoffrey Bowker and Karen Baker a three-year grant to study the dynamics of building scientific cyberinfrastructure -- the connective elements needed to hold together communities of scientists working in multiple locations who want to share data and knowledge (Atkins Report, 2003).

Work on the grant project, which is titled "Interoperability Strategies for Scientific Cyberinfrastructure: A Comparative Study," began in September 2004 with the arrival at UCSD of Florence Millerand, a postdoctoral researcher trained in the study of uses of information and communication technologies with professional experience in Human-Computer Interface design and usability testing, and of David Ribes, a graduate student in science studies working with distributed communities that incorporate technical infrastructure. The project's goal is to understand how community, organizational and technical resources are simultaneously mobilized to support data integration. Data integration, ensuring that data can work together, enables the movement of data across and within disciplinary boundaries.

A lot is at stake in the development of these new cyberinfrastructures. Although they are often viewed as mere "technical" development efforts, we approach cyberinfrastructures as sociotechnical: as much about the interplay of the conceptual and organizational with the technical as about technology alone. Cyberinfrastructure has human dimensions, including changes in how organizations recognize and support scientific work, how the ramifications of science are tended, and how individual careers are viewed.

The Comparative Study builds on work begun in 2002 with an earlier NSF grant, "Designing an Infrastructure for Heterogeneity of Ecosystem Data, Collaborators, and Organization" (see Network News Vol. 14 No. 2, Fall 2001), which opened up discussion of "data ecology", supported postdoctoral researcher Helena Karasti, and initiated a cross-domain dialogue among LTER ecologists, information managers, and science studies researchers (Baker and Bowker, 2001).

The Biodiversity and Ecosystem Informatics (BDEI) cross-agency call that supported this work recognized data, metadata and technology as an integral part of human dimensions by supporting incubation grants to consider impacts on metadata and infrastructure of the heterogeneity of data, collaborators, and organizations (Maier et al, 2001).

Building from the salient features of infrastructure (Star and Bowker, 2002) and collaboration (Twining, 1999; Olson and Olson, 2000; Finholt, 2004), we are studying community characteristics in general and coordination mechanisms in particular. Communities incorporate, just as information systems instantiate, both tacit and explicit methods, values, epistemologies, and ontologies. We are using a qualitative research approach, building from grounded theory using ethnographic and theoretical sampling methods (Strauss, 1987).

Our initial work focuses on three projects with technical infrastructures:

  • GEON, a distributed geosciences project based at the San Diego Supercomputer Center that is using ontologies as a shared community approach (Keller, 2003);
  • Ocean Informatics, an oceanographic team at Scripps Institution of Oceanography that is developing a conceptual framework as a community building strategy to initiate dialogue on informatics as a design environment (Baker et al, 2005); 
  • The LTER community, a distributed network that is working to establish very long term baselines of ecological data (Hobbie et al, 2003).

The projects will proceed in partnership with both community interfaces and an interdisciplinary advisory panel. We seek an expanded vocabulary and perspective for understanding interoperability of data and communities as well as for roles of technology and participants. We work to understand the misconceptions that the technical is objective and the enacted is social and that data and information systems are socially or culturally neutral.

Human dimension efforts within the LTER network recognize humans, in their individual capacities and as community members, as participants in designing their environment as well as in perceiving it (Kinzig et al, 2001; Redman 1999, 2004; LTER white paper).

LTER sites can draw from many potentially synergistic fields within social science, ranging from political and economic ecology to history and anthropology, as well as sociology and sociotechnical informatics. For instance, local LTER sites focus on built communities (e.g., urban or education communities); on anthropological studies (e.g., historical land use); or on data ecologies (e.g., information systems). We are working today within a context of multiple perspectives transitioning to include realist approaches to science; for our work with data, technology, and communities, we draw specifically upon science studies, information science, and sociotechnical and organizational informatics (e.g., Zimmerman, 2003; Bowker, 2001).

The LTER Network environment engages research scientists in the process of designing environments to facilitate the digital preservation of data for the long term. This community is well suited for a cross-case study focusing on cyberinfrastructure and interoperability strategies because the Network has worked for some time to facilitate cross-discipline collaboration and is planning an infrastructure to support distributed information systems.

Data is being collected, analyzed, preserved, and accessed in new ways through the creation of information systems. Distributed collective practice is becoming the norm. Such activities help shape our science as we participate in the creation and development of a new infrastructure for scientific knowledge. With a need for coordinated databases, we focus on strategies that enable interoperability.

Karen S. Baker (UCSD, PAL/CCE LTER)
Geoffrey C. Bowker (Santa Clara University)
Florence Millerand (UCSD)
David Ribes (UCSD)


Baker KS, SJ Jackson, and JR Wanetick, 2005. Strategies Supporting Heterogeneous Data and Interdisciplinary Collaboration: Towards an Ocean Informatics Environment. In Proceedings of the Hawaii International Conference on System Sciences (HICSS), January 2005, Big Island, Hawaii. IEEE, New Brunswick, NJ.

Baker KS, and GC Bowker, 2001. Designing an Infrastructure for Heterogeneity of Ecosystem Data, Collaborators, and Organizations. LTER Network News Fall 2001.

Bowker, GC, 2001. Biodiversity, Datadiversity. Social Studies of Science, 30(5), 643-684.

Finholt TA, 2004. Collaboratories. EB Cronin (ed) Annual Review of Information Science and Technology.

Hobbie, JE, SR Carpenter, NB Grimm, JR Gosz, and TR Seastedt, 2003. The US Long Term Ecological Research Program. BioScience 53(2), 21-32.

Karasti H and KS Baker, 2004. Infrastructuring for the long-term: ecological information management. In Proceedings of the Hawaii International Conference on System Sciences (HICSS), 5-8 January 2004, Big Island, Hawaii. IEEE, New Brunswick, NJ.

Keller, RB, 2003. GEON (GEOscience Network) -- A First Step In Creating Cyberinfrastructure for The Geosciences. Electronic Seismologist, July/August 2003.

Kinzig, AP, S Carpenter, M Dove, G Heal, S Levin, J Lubchenco, SH Schneider, and D Starrett, 2000. Nature and society: an imperative for integrated environmental research. Executive Summary of a workshop sponsored by NSF, Developing a Research Agenda for Linking Biogeophysical and Socioeconomic Systems, Tempe, AZ, 5-8 June.

Maier, D, E Landis, J Cushing, A Frondorf, A Silberschatz, M Frame, and JL Schnase, 2001. Research Directions in Biodiversity and Ecosystem Informatics, pp 30. Workshop held at NASA Goddard Space Flight Center, June 22-23, 2000, Greenbelt, Maryland.

Olson GM and JS Olson, 2000. Distance Matters. Human-Computer Interaction 15, 139-178.

Redman, CL, JM Grove, and LH Kuby, 2004. Integrating Social Science into the Long-Term Ecological Research (LTER) Network: Social Dimensions of Ecological Change and Ecological Dimensions of Social Change. Ecosystems 7, 161-171.

Redman, CL, 1999. Human Dimensions of Ecosystem Studies. Ecosystems 2, 296-298.

Star SL and GC Bowker, 2002. How to Infrastructure. In The Handbook of New Media, Lievrouw and Livingstone (eds), SAGE Publications, London, p151-162.

Strauss A, 1987. Qualitative Analysis for Social Scientists. Cambridge University Press, Cambridge.

Twining, J, 1999. A Naturalistic Inquiry into the Collaboratory: In Search of Understanding for Prospective Participants. PhD Thesis, Texas Woman's University, Denton, Texas. December 1999.

Zimmerman A, 2003. Data Sharing and Secondary Use of Scientific Data: Experiences of Ecologists. PhD Thesis, The University of Michigan, Ann Arbor. pp 272.

For more information and a fuller list of references, see the online version at