Illuminating protein space with a programmable generative model

John Ingraham (Intarcia Therapeutics (United States)), Max Baranov (Intarcia Therapeutics (United States)), Zak Costello (Intarcia Therapeutics (United States)), Vincent Frappier (Intarcia Therapeutics (United States)), Ahmed Ismail (Intarcia Therapeutics (United States)), Shan Tie (Intarcia Therapeutics (United States)), Wujie Wang (Intarcia Therapeutics (United States)), Vincent Xue (Intarcia Therapeutics (United States)), Fritz Obermeyer (Intarcia Therapeutics (United States)), Andrew L. Beam (Intarcia Therapeutics (United States)), Gevorg Grigoryan (Intarcia Therapeutics (United States))
bioRxiv (Cold Spring Harbor Laboratory)
December 2, 2022
Cited by 87
Open Access

Abstract

Three billion years of evolution have produced a tremendous diversity of protein molecules, and yet the full potential of this molecular class is likely far greater. Accessing this potential has been challenging for computation and experiments because the space of possible protein molecules is much larger than the space of those likely to host function. Here we introduce Chroma, a generative model for proteins and protein complexes that can directly sample novel protein structures and sequences and that can be conditioned to steer the generative process towards desired properties and functions. To enable this, we introduce a diffusion process that respects the conformational statistics of polymer ensembles, an efficient neural architecture for molecular systems based on random graph neural networks that enables long-range reasoning with sub-quadratic scaling, equivariant layers for efficiently synthesizing 3D structures of proteins from predicted inter-residue geometries, and a general low-temperature sampling algorithm for diffusion models. We suggest that Chroma can effectively realize protein design as Bayesian inference under external constraints, which can involve symmetries, substructure, shape, semantics, and even natural language prompts. With this unified approach, we hope to accelerate the prospect of programming protein matter for human health, materials science, and synthetic biology.

