Wednesday, October 27, 2010

BiSciCol Tracker: Ready, set, go!

BiSciCol (Biological Science Collections) Tracker is a recently funded NSF project (September 2010) with the goal of building an infrastructure designed to tag and track scientific collections and all of their derivatives.

Scientific collections created and used in basic research are an integral part of our scientific infrastructure. Individual specimens in these collections serve as the anchor for an expanding array of information that grows and changes with time about the specimen and the group that the specimen represents. Unfortunately, as we all know, specimens and subsamples are scattered geographically across institutions. Taxonomic, genomic, geospatial, and other information about the specimens is also scattered across independent computer systems and on paper, and is very difficult to access or synthesize. Current data sharing systems such as DiGIR are one-way channels and do not allow for quick and easy two-way linking of information or updates as new knowledge is gained.

The BiSciCol team will take the next steps to address a challenge facing the entire biological collections community: linking and tracking scientific collection objects (specimens, sequences, images, etc.) and their digital metadata across multiple institutional collections with heterogeneous information management systems. In current distributed data systems (e.g., GBIF, MaNIS, HerpNET, ORNIS), information is passed one-way from data providers to users. No mechanism exists to tag or annotate collection objects, to link information to other collection objects or data records, or to link it back to the original collections. Our deliverables are to 1) develop a tracking and annotation system based on globally unique identifiers (GUIDs) and ontological relationships; 2) deploy this system and others in a Virtual Information Appliance (VIA) as a Virtual Machine (VM); and 3) document and implement a set of use cases and practices, based on characteristic physical and digital workflows in the community.

The need to provide access to validated biodiversity information has been documented in a number of workshops, reports, etc., but as yet there is no single implementation that would support collections and research information management using the proposed approach. BiSciCol is designed on the simple premise that changes to data objects are trackable with GUIDs, and that semantic relationships are assignable and discoverable among physical and data objects, for example when a specimen is imaged or sampled for DNA extraction. Ultimately, this project enables discovery, accessibility, and networking of collections, in order to advance semantic interoperability for collection information systems. 
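To make that premise a little more concrete, here is a minimal Python sketch of the idea, not the BiSciCol design itself. Everything in it is an assumption made for illustration: the `CollectionObject` and `Tracker` names, the use of random UUIDs as GUIDs, and the relationship labels (`depicts`, `sampledFrom`, `derivedFrom`) are all hypothetical. It shows a specimen being imaged and sampled for DNA, with each derivative getting its own identifier and a typed link back to its source.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class CollectionObject:
    """A physical or digital object (specimen, tissue sample, image, sequence)."""
    kind: str
    label: str
    # Assumption: a random UUID stands in for whatever GUID scheme is adopted.
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))


class Tracker:
    """Toy in-memory registry of objects and the typed links between them."""

    def __init__(self):
        self.objects = {}  # GUID -> CollectionObject
        self.links = []    # (subject GUID, relationship, target GUID)

    def register(self, kind, label):
        obj = CollectionObject(kind, label)
        self.objects[obj.guid] = obj
        return obj

    def relate(self, subject, relationship, target):
        # Record that `subject` stands in `relationship` to `target`.
        self.links.append((subject.guid, relationship, target.guid))

    def derivatives(self, source):
        """Return every object that links back to `source`, directly or indirectly."""
        found, frontier, seen = [], [source.guid], {source.guid}
        while frontier:
            current = frontier.pop()
            for subject_guid, _relationship, target_guid in self.links:
                if target_guid == current and subject_guid not in seen:
                    seen.add(subject_guid)
                    found.append(self.objects[subject_guid])
                    frontier.append(subject_guid)
        return found


tracker = Tracker()
specimen = tracker.register("specimen", "Voucher 001")
image = tracker.register("image", "Dorsal photograph")
tissue = tracker.register("tissue", "Leg tissue subsample")
sequence = tracker.register("sequence", "DNA sequence from tissue")

# Hypothetical relationship labels; a real system would draw these from an ontology.
tracker.relate(image, "depicts", specimen)
tracker.relate(tissue, "sampledFrom", specimen)
tracker.relate(sequence, "derivedFrom", tissue)

for obj in tracker.derivatives(specimen):
    print(obj.kind, obj.guid)
```

Starting from the specimen, the lookup finds the image, the tissue subsample, and the sequence derived from that subsample, which is the kind of two-way discovery that one-way data channels do not offer.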

Our deliverables are designed to benefit the entire biological collections community by taking initial steps to implement core information infrastructure based on established challenges in the community. Collections data are critical to land management decisions, maintenance of biodiversity, and analysis of the causes and consequences of climate change. Finally, we will actively engage user communities through training workshops, summer student internships, and community BioBlitz enhancements.

Who we are: The BiSciCol collaborative represents a broadly trained team of biologists, collections curators, and information and technology specialists. Our team includes 6 institutions (University of Florida; The Smithsonian; University of California, Berkeley; University of Colorado, Boulder; University of Arizona; and The Bishop Museum, Hawaii) and 15 investigators (Nico Cellinese, Jonathan Coddington, Neil Davies, John Deck, Rob Guralnick, Bryan Heidorn, Steve Manchester, Chris Meyer, Tom Orrell, Gustav Paulay, Rich Pyle, Kate Rachwal, George Roderick, Russell Watkins, Rob Whitton, and Norris Williams).

Needless to say, we are anxious to start!


  1. "Taxonomic, genomic, geospatial, and other information about the specimens are also scattered across independent computer systems and on paper, and are very difficult to access or synthesize."

    "information" is a singular noun, which takes a singular verb -- ... information ... is also scattered ... and is very difficult ...

  2. Thank you! If that is all that you found, I consider myself happy :-)