BiSciCol (Biological Science Collections) Tracker is an NSF-funded collaborative project that aims to build infrastructure for tagging and tracking scientific collections and all of their derivatives.
Scientific collections created and used in basic research are an integral part of our scientific infrastructure. Individual specimens in these collections serve as the anchor for an expanding array of information, about both the specimen and the group it represents, that grows and changes with time. Unfortunately, as we all know, specimens and subsamples are scattered geographically across institutions. Taxonomic, genomic, geospatial, and other information about the specimens is also scattered across independent computer systems and on paper, and is very difficult to access or synthesize. Current data sharing systems such as DiGIR are one-way channels and do not allow for quick and easy two-way linking of information or updates as new knowledge is gained.
The BiSciCol team will take the appropriate next steps to address a challenge facing the entire biological collections community: linking and tracking scientific collection objects (specimens, sequences, images, etc.) and their digital metadata across multiple institutional collections with heterogeneous information management systems. In current distributed data systems (e.g., GBIF, MANIS, HerpNET, ORNIS), information is passed one-way from data providers to users. No mechanism exists to tag or annotate collection objects and link information to other collection objects or data records and back to the original collections. Our deliverables are to 1) develop a tracking and annotation system based on globally unique identifiers (GUIDs) and ontological relationships; 2) deploy this system and others in a Virtual Information Appliance (VIA) as a Virtual Machine (VM); and 3) document and implement a set of use cases and practices, based on characteristic physical and digital workflows in the community.
The need to provide access to validated biodiversity information has been documented in a number of workshops, reports, etc., but as yet there is no single implementation that would support collections and research information management using the proposed approach. BiSciCol is designed on the simple premise that changes to data objects are trackable with GUIDs, and that semantic relationships are assignable and discoverable among physical and data objects, for example when a specimen is imaged or sampled for DNA extraction. Ultimately, this project enables discovery, accessibility, and networking of collections, in order to advance semantic interoperability for collection information systems.
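The premise above can be illustrated with a minimal sketch. It is not the BiSciCol implementation: the class names, the relationship labels (`depicts`, `sampledFrom`, `derivedFrom`), and the in-memory storage are all assumptions chosen for illustration. Each physical or digital object receives a GUID when it is registered, relationships are stored as subject-predicate-object triples, and the derivatives of a specimen are discovered by following those triples.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CollectionObject:
    """A physical or digital object; the GUID is assigned at creation (hypothetical model)."""
    kind: str
    label: str
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))


class Tracker:
    """Illustrative in-memory store of GUID-tagged objects and their relationships."""

    def __init__(self):
        self.objects = {}    # guid -> CollectionObject
        self.relations = []  # (subject_guid, predicate, object_guid) triples

    def register(self, kind, label):
        obj = CollectionObject(kind, label)
        self.objects[obj.guid] = obj
        return obj

    def relate(self, subject, predicate, target):
        # Predicate names are placeholders, not terms from a real ontology.
        self.relations.append((subject.guid, predicate, target.guid))

    def derivatives(self, source):
        """Return every object linked, directly or indirectly, back to `source`."""
        found, seen, frontier = [], {source.guid}, [source.guid]
        while frontier:
            current = frontier.pop()
            for subj, _pred, obj in self.relations:
                if obj == current and subj not in seen:
                    seen.add(subj)
                    found.append(self.objects[subj])
                    frontier.append(subj)
        return found


# Example: a specimen is imaged and sampled, and the tissue is sequenced.
tracker = Tracker()
specimen = tracker.register("specimen", "voucher specimen")
image = tracker.register("image", "dorsal photo")
tissue = tracker.register("tissue", "liver tissue")
sequence = tracker.register("sequence", "COI sequence")

tracker.relate(image, "depicts", specimen)
tracker.relate(tissue, "sampledFrom", specimen)
tracker.relate(sequence, "derivedFrom", tissue)

for item in tracker.derivatives(specimen):
    print(item.kind, item.label, item.guid)
```

Because every link is recorded against a GUID rather than a local catalog number, the sequence can be traced back to the specimen even if the tissue and the sequence record are held by different institutions, which is the two-way linking that current one-way systems lack.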
Our deliverables are designed to benefit the entire biological collections community by taking initial steps to implement core information infrastructure that addresses established challenges in the community. Collections data are critical to land management decisions, maintenance of biodiversity, and analysis of the causes and consequences of climate change. Finally, we will actively engage user communities through training workshops, summer student internships, and community BioBlitz enhancements.
The original proposal submitted in July 2009 to the NSF Division of Biological Infrastructure (DBI-BRC) can be downloaded here. We are very grateful to the National Science Foundation for their support!