Protein complex prediction with AlphaFold-Multimer

Richard Evans (Google DeepMind (United Kingdom)), M. E. O’Neill (Google DeepMind (United Kingdom)), Alexander Pritzel (Google DeepMind (United Kingdom)), N. V. Antropova (Google DeepMind (United Kingdom)), Andrew Senior (Google DeepMind (United Kingdom)), Tim Green (Google DeepMind (United Kingdom)), Augustin Žídek (Google DeepMind (United Kingdom)), Russ Bates (Google DeepMind (United Kingdom)), Sam Blackwell (Google DeepMind (United Kingdom)), Jason Yim (Google DeepMind (United Kingdom)), Olaf Ronneberger (Google DeepMind (United Kingdom)), Sebastian W. Bodenstein (Google DeepMind (United Kingdom)), Michał Zieliński (Google DeepMind (United Kingdom)), Alex Bridgland (Google DeepMind (United Kingdom)), Anna Potapenko (Google DeepMind (United Kingdom)), Andrew Cowie (Google DeepMind (United Kingdom)), Kathryn Tunyasuvunakool (Google DeepMind (United Kingdom)), Rishub Jain (Google DeepMind (United Kingdom)), Ellen Clancy (Google DeepMind (United Kingdom)), Pushmeet Kohli (Google DeepMind (United Kingdom)), John Jumper (Google DeepMind (United Kingdom)), Demis Hassabis (Google DeepMind (United Kingdom))
bioRxiv (Cold Spring Harbor Laboratory)
October 4, 2021
Cited by 4,018

Abstract

While the vast majority of well-structured single protein chains can now be predicted to high accuracy thanks to the recent AlphaFold [1] model, the prediction of multi-chain protein complexes remains a challenge in many cases. In this work, we demonstrate that an AlphaFold model trained specifically for multimeric inputs of known stoichiometry, which we call AlphaFold-Multimer, significantly increases the accuracy of predicted multimeric interfaces over input-adapted single-chain AlphaFold while maintaining high intra-chain accuracy. On a benchmark dataset of 17 heterodimer proteins without templates (introduced in [2]), we achieve at least medium accuracy (DockQ [3] ≥ 0.49) on 13 targets and high accuracy (DockQ ≥ 0.8) on 7 targets, compared to 9 targets of at least medium accuracy and 4 of high accuracy for the previous state-of-the-art system (an AlphaFold-based system from [2]). We also predict structures for a large dataset of 4,446 recent protein complexes, from which we score all non-redundant interfaces with low template identity. For heteromeric interfaces we successfully predict the interface (DockQ ≥ 0.23) in 70% of cases and produce high-accuracy predictions (DockQ ≥ 0.8) in 26% of cases, improvements of +27 and +14 percentage points, respectively, over the flexible linker modification of AlphaFold [4]. For homomeric interfaces we successfully predict the interface in 72% of cases and produce high-accuracy predictions in 36% of cases, improvements of +8 and +7 percentage points, respectively.
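
The DockQ cutoffs quoted above (0.23, 0.49, 0.8) partition interface predictions into quality bands, and the reported percentages are the fractions of interfaces at or above a given cutoff. The following is a minimal Python sketch of that bookkeeping, not the authors' evaluation code. The function names `dockq_band` and `success_rates` and the band label "incorrect" for scores below 0.23 are assumptions introduced here for illustration, and the example scores are made up rather than taken from the paper.

```python
from typing import Dict, Sequence

# Cutoffs as quoted in the abstract: successful (>= 0.23),
# at least medium (>= 0.49), high (>= 0.8).
ACCEPTABLE, MEDIUM, HIGH = 0.23, 0.49, 0.80


def dockq_band(score: float) -> str:
    """Map a DockQ score in [0, 1] to a quality band (hypothetical helper)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"DockQ score must lie in [0, 1], got {score}")
    if score >= HIGH:
        return "high"
    if score >= MEDIUM:
        return "medium"
    if score >= ACCEPTABLE:
        return "acceptable"
    return "incorrect"  # assumed label for scores below the success cutoff


def success_rates(scores: Sequence[float]) -> Dict[str, float]:
    """Fraction of interfaces at or above each cutoff (hypothetical helper)."""
    if not scores:
        raise ValueError("need at least one DockQ score")
    n = len(scores)
    return {
        "successful": sum(s >= ACCEPTABLE for s in scores) / n,
        "at_least_medium": sum(s >= MEDIUM for s in scores) / n,
        "high": sum(s >= HIGH for s in scores) / n,
    }


# Illustrative scores only; these are not results from the paper.
example_scores = [0.10, 0.25, 0.50, 0.85]
example_bands = [dockq_band(s) for s in example_scores]
example_rates = success_rates(example_scores)
```

Note that the rates are cumulative: an interface counted as high accuracy is also counted as successful, which is why the abstract's 70% and 26% figures for heteromeric interfaces overlap rather than sum.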
