Semantics in Support of Biodiversity Knowledge Discovery: An Introduction to the Biological Collections Ontology and Related Ontologies

Ramona Walls (University of Arizona), John Deck (University of California, Berkeley), Robert Guralnick (Museum of Boulder), Steve Baskauf (Vanderbilt University), Reed S. Beaman (Florida Museum of Natural History), Stanley Blum (California Academy of Sciences), Shawn Bowers (Gonzaga University), Pier Luigi Buttigieg (Alfred-Wegener-Institut Helmholtz-Zentrum für Polar- und Meeresforschung), Neil Davies (Gump South Pacific Research Station), Dag Endresen, María A. Gandolfo (Cornell University), Robert Hanner (University of Guelph), Alyssa Janning (University of Arizona), Leonard Krishtalka (University of Kansas), Andréa Matsunaga (University of Florida), Peter Midford (University of Kansas), Norman Morrison (University of Manchester), Éamonn Ó Tuama (Global Biodiversity Information Facility), Mark Schildhauer (National Center for Ecological Analysis and Synthesis), Barry Smith (University at Buffalo, State University of New York), Brian J. Stucky (University of Colorado Boulder), Andrea Thomer (University of Illinois Urbana-Champaign), John Wieczorek (Museum of Vertebrate Zoology), Jamie Whitacre (Smithsonian Institution), John Wooley (University of California San Diego)
PLoS ONE
March 3, 2014
Cited by 150 | Open Access

Abstract

The study of biodiversity spans many disciplines and includes data pertaining to species distributions and abundances, genetic sequences, trait measurements, and ecological niches, complemented by information on collection and measurement protocols. A review of the current landscape of metadata standards and ontologies in biodiversity science suggests that existing standards such as the Darwin Core terminology are inadequate for describing biodiversity data in a semantically meaningful and computationally useful way. Existing ontologies, such as the Gene Ontology and others in the Open Biological and Biomedical Ontologies (OBO) Foundry library, provide a semantic structure but lack many of the necessary terms to describe biodiversity data in all its dimensions. In this paper, we describe the motivation for and ongoing development of a new Biological Collections Ontology, the Environment Ontology, and the Population and Community Ontology. These ontologies share the aim of improving data aggregation and integration across the biodiversity domain and can be used to describe physical samples and sampling processes (for example, collection, extraction, and preservation techniques), as well as biodiversity observations that involve no physical sampling. Together they encompass studies of: 1) individual organisms, including voucher specimens from ecological studies and museum specimens, 2) bulk or environmental samples (e.g., gut contents, soil, water) that include DNA, other molecules, and potentially many organisms, especially microbes, and 3) survey-based ecological observations. We discuss how these ontologies can be applied to biodiversity use cases that span genetic, organismal, and ecosystem levels of organization. We argue that if adopted as a standard and rigorously applied and enriched by the biodiversity community, these ontologies would significantly reduce barriers to data discovery, integration, and exchange among biodiversity resources and researchers.

