De novo design of protein structure and function with RFdiffusion

Joseph L. Watson (University of Washington), David Juergens (University of Washington), Nathaniel R. Bennett (University of Washington), Brian L. Trippe (University of Washington), Jason Yim (University of Washington), Helen E. Eisenach (University of Washington), Woody Ahern (University of Washington), Andrew J. Borst (University of Washington), Robert J. Ragotte (University of Washington), Lukas F. Milles (University of Washington), Basile I. M. Wicky (University of Washington), Nikita Hanikel (University of Washington), Samuel J. Pellock (University of Washington), Alexis Courbet (University of Washington), William Sheffler (University of Washington), Jue Wang (University of Washington), Preetham Venkatesh (University of Washington), Isaac Sappington (University of Washington), Susana Vázquez Torres (University of Washington), Anna Lauko (University of Washington), Valentin De Bortoli (École Normale Supérieure - PSL), Émile Mathieu (University of Cambridge), Sergey Ovchinnikov (Harvard University), Regina Barzilay (Massachusetts Institute of Technology), Tommi Jaakkola (Massachusetts Institute of Technology), Frank DiMaio (University of Washington), Minkyung Baek (Seoul National University), David Baker (Howard Hughes Medical Institute)
Nature
July 11, 2023
Cited by 1,877
Open Access

Abstract

There has been considerable recent progress in designing new proteins using deep-learning methods 1–9. Despite this progress, a general deep-learning framework for protein design that enables solution of a wide range of design challenges, including de novo binder design and design of higher-order symmetric architectures, has yet to be described. Diffusion models 10,11 have had considerable success in image and language generative modelling but limited success when applied to protein modelling, probably due to the complexity of protein backbone geometry and sequence–structure relationships. Here we show that by fine-tuning the RoseTTAFold structure prediction network on protein structure denoising tasks, we obtain a generative model of protein backbones that achieves outstanding performance on unconditional and topology-constrained protein monomer design, protein binder design, symmetric oligomer design, enzyme active site scaffolding and symmetric motif scaffolding for therapeutic and metal-binding protein design. We demonstrate the power and generality of the method, called RoseTTAFold diffusion (RFdiffusion), by experimentally characterizing the structures and functions of hundreds of designed symmetric assemblies, metal-binding proteins and protein binders. The accuracy of RFdiffusion is confirmed by the cryogenic electron microscopy structure of a designed binder in complex with influenza haemagglutinin that is nearly identical to the design model. In a manner analogous to networks that produce images from user-specified inputs, RFdiffusion enables the design of diverse functional proteins from simple molecular specifications.

