AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences

Mihály Váradi (European Bioinformatics Institute), Damian Bertoni (European Bioinformatics Institute), Paulyna Magaña (European Bioinformatics Institute), Urmila Paramval (European Bioinformatics Institute), Ivanna Pidruchna (European Bioinformatics Institute), Malarvizhi Radhakrishnan (European Bioinformatics Institute), Maxim Tsenkov (European Bioinformatics Institute), Sreenath Nair (European Bioinformatics Institute), Milot Mirdita (Seoul National University), Jingi Yeo (Seoul National University), Oleg Kovalevskiy (Google DeepMind (United Kingdom)), Kathryn Tunyasuvunakool (Google DeepMind (United Kingdom)), Agata Laydon (Google DeepMind (United Kingdom)), Augustin Žídek (Google DeepMind (United Kingdom)), Hamish Tomlinson (Google DeepMind (United Kingdom)), Dhavanthi Hariharan (Google DeepMind (United Kingdom)), Josh Abrahamson (Google DeepMind (United Kingdom)), Tim Green (Google DeepMind (United Kingdom)), John Jumper (Google DeepMind (United Kingdom)), Ewan Birney (European Bioinformatics Institute), Martin Steinegger (Seoul National University), Demis Hassabis (Google DeepMind (United Kingdom)), Sameer Velankar (European Bioinformatics Institute)
Nucleic Acids Research
November 2, 2023
Cited by 1,879
Open Access

Abstract

The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) has significantly impacted structural biology by amassing over 214 million predicted protein structures, expanding from the initial 300k structures released in 2021. Enabled by the groundbreaking AlphaFold2 artificial intelligence (AI) system, the predictions archived in AlphaFold DB have been integrated into primary data resources such as PDB, UniProt, Ensembl, InterPro and MobiDB. Our manuscript details subsequent enhancements in data archiving, covering successive releases encompassing model organisms, global health proteomes, Swiss-Prot integration, and a host of curated protein datasets. We detail the data access mechanisms of AlphaFold DB, from direct file access via FTP to advanced queries using Google Cloud Public Datasets and the programmatic access endpoints of the database. We also discuss the improvements and services added since its initial release, including enhancements to the Predicted Aligned Error viewer, customisation options for the 3D viewer, and improvements in the search engine of AlphaFold DB.
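
As an illustration of the programmatic access mentioned in the abstract, the following is a minimal Python sketch of how a client might look up the prediction record for a single UniProt accession. The abstract gives no endpoint paths, so the `/api/prediction/<accession>` route, the JSON response, the accession check, and the helper names `prediction_url` and `fetch_prediction` are all assumptions made for illustration; the database's own API documentation is the authority on the actual interface.

```python
import json
import re
import urllib.request

# Assumption: base path of the programmatic endpoints (not given in the abstract).
API_BASE = "https://alphafold.ebi.ac.uk/api"

# Loose sanity check only: UniProt accessions are 6 or 10 alphanumeric characters.
_ACCESSION = re.compile(r"^[A-Z0-9]{6}(?:[A-Z0-9]{4})?$")


def prediction_url(uniprot_accession: str) -> str:
    """Build the (assumed) prediction-lookup URL for one UniProt accession."""
    acc = uniprot_accession.strip().upper()
    if not _ACCESSION.match(acc):
        raise ValueError(f"not a plausible UniProt accession: {uniprot_accession!r}")
    return f"{API_BASE}/prediction/{acc}"


def fetch_prediction(uniprot_accession: str, timeout: float = 30.0):
    """Request the record over HTTPS and decode it, assuming a JSON response."""
    url = prediction_url(uniprot_accession)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_prediction("P69905")` would issue a single HTTPS request. For bulk retrieval, the FTP area or the Google Cloud Public Datasets named in the abstract are likely the better fit than per-entry requests.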

