S. Sobhany

Rasko Leinonen, R.A. Akhtar, Ewan Birney et al.|Nucleic Acids Research|2010

Cited by 705|Open Access

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary nucleotide-sequence repository. The ENA consists of three main databases: the Sequence Read Archive (SRA), the Trace Archive and EMBL-Bank. The objective of ENA is to support and promote the use of nucleotide sequencing as an experimental research platform by providing data submission, archive, search and download services. In this article, we outline these services and describe major changes and improvements introduced during 2010. These include extended EMBL-Bank and SRA-data submission services, extended ENA Browser functionality, support for submitting data to the European Genome-phenome Archive (EGA) through SRA, and the launch of a new sequence similarity search service.

Petabyte-scale innovations at the European Nucleotide Archive

Guy Cochrane, R.A. Akhtar, James Bonfield et al.|Nucleic Acids Research|2008

Cited by 104|Open Access

Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence repositories, which provide the feed for these data sets into the worldwide computational infrastructure, are challenged by the impact of these data volumes. The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/embl), comprising the EMBL Nucleotide Sequence Database and the Ensembl Trace Archive, has identified challenges in the storage, movement, analysis, interpretation and visualization of petabyte-scale data sets. We present here our new repository for next-generation sequence data and a brief summary of the contents of the ENA, and provide details of major developments to submission pipelines, high-throughput rule-based validation infrastructure and data integration approaches.

Priorities for nucleotide trace, sequence and annotation data capture at the Ensembl Trace Archive and the EMBL Nucleotide Sequence Database

Guy Cochrane, R.A. Akhtar, P. Aldebert et al.|Nucleic Acids Research|2007

Cited by 69|Open Access

The Ensembl Trace Archive (http://trace.ensembl.org/) and the EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), known together as the European Nucleotide Archive, continue to see growth in data volume and diversity. Selected major developments of 2007 are presented briefly, along with data submission and retrieval information. In the face of increasing requirements for nucleotide trace, sequence and annotation data archiving, data capture priority decisions have been taken at the European Nucleotide Archive. Priorities are discussed in terms of how reliably information can be captured, the long-term benefits of its capture and the ease with which it can be captured.

SOAP-based services provided by the European Bioinformatics Institute

S. Pillai, V. Silventoinen, Katariina Kallio et al.|Nucleic Acids Research|2005

Cited by 59|Open Access

SOAP (Simple Object Access Protocol) (http://www.w3.org/TR/soap) based Web Services technology (http://www.w3.org/ws) has gained much attention as an open standard enabling interoperability among applications across heterogeneous architectures and different networks. The European Bioinformatics Institute (EBI) is using this technology to provide robust data retrieval and data analysis mechanisms to the scientific community and to enhance utilization of the biological resources it already provides [N. Harte, V. Silventoinen, E. Quevillon, S. Robinson, K. Kallio, X. Fustero, P. Patel, P. Jokinen and R. Lopez (2004) Nucleic Acids Res., 32, 3-9]. These services are available free to all users from http://www.ebi.ac.uk/Tools/webservices.

The EBI macromolecular structure database (E-MSD) and structural genomics

K. Henrick, Harry Boutselakis, Dimitris Dimitropoulos et al.|Acta Crystallographica Section A Foundations of Crystallography|2002

Cited by 0|Open Access

High-throughput structural genomics projects are now underway. These projects will collect comprehensive data on protein structure. The e-msd has contributed to the detailed data representation model(s) and to the data exchange formats and mechanisms between each step in a structure determination. This model incorporates the means for making the information reliable, accurate and up to date, and includes indicators of reliability. The e-msd recognizes that data must be uniform to be used efficiently for searches within the database, and has ensured uniformity in that the meta-description of the data is consistent across all entries. The e-msd not only allows for data harvesting and archival of clean, cross-referenced structural data; search mechanisms are also being developed to allow research workers to, for example, automatically annotate structure motifs and binding site properties.


Top publications by citations