Specimen and sample metadata standards for biodiversity genomics: a proposal from the Darwin Tree of Life project

Mara Lawniczak (Wellcome Sanger Institute), Robert Davey (Earlham Institute), Jeena Rajan (European Bioinformatics Institute), Lyndall Pereira da Conceicoa (Wellcome Sanger Institute), Estelle Kilias (University of Oxford), Peter M. Hollingsworth (Royal Botanic Garden Edinburgh), Ian Barnes (Natural History Museum), Heather Allen (Natural History Museum), Mark Blaxter (Wellcome Sanger Institute), Josephine Burgin (European Bioinformatics Institute), Gavin R. Broad (Natural History Museum), Liam M Crowley (University of Oxford), Ester Gaya (Royal Botanic Gardens, Kew), Nancy Holroyd (Wellcome Sanger Institute), Owen T. Lewis (University of Oxford), Seanna McTaggart (Earlham Institute), Nova Mieszkowska (Marine Biological Association of the United Kingdom), Alice Minotto (Earlham Institute), Felix Shaw (Earlham Institute), Thomas A. Richards (University of Oxford), Laura Sivess (Natural History Museum)
Wellcome Open Research
July 12, 2022
Cited by 337; Open Access

Abstract

The vision of the Earth BioGenome Project (https://www.earthbiogenome.org/) [1] is to complete reference genomes for all of the planet’s ~2M described eukaryotic species in the coming decade. To contribute to this global endeavour, the Darwin Tree of Life Project (DToL; https://protect-us.mimecast.com/s/JGLTC82o95fXARy0XI1hqWb?domain=darwintreeoflife.org/) [2] was launched in 2019 with the aim of generating complete genomes for the ~70k described eukaryotic species that can be found in Britain and Ireland. One of the early tasks of the DToL project was to determine, define, and standardise the important metadata that must accompany every sample contributing to this ambitious project. This ensures high-quality contextual information is available for the associated data, enabling a richer set of information upon which to search and filter datasets as well as enabling interoperability between datasets used for downstream analysis. Here we describe some of the key factors we considered in the process of determining, defining, and documenting the metadata required for DToL project samples. The manifest and Standard Operating Procedure that are referred to throughout this paper are likely to be useful for other projects, and we encourage re-use while maintaining the standards and rules set out here.

