Protein sequence design by conformational landscape optimization

Christoffer Norn (University of Washington), Basile I. M. Wicky (University of Washington), David Juergens (University of Washington), Sirui Liu (Harvard University), David E. Kim (University of Washington), Doug Tischer (University of Washington), Brian Koepnick (University of Washington), Ivan Anishchenko (University of Washington), Foldit Players (Howard Hughes Medical Institute), David Baker (Howard Hughes Medical Institute), Sergey Ovchinnikov (Harvard University), Alan Coral, Alex J. Bubar, Alexander Boykov, Alexander Uriel Valle Pérez, Alison MacMillan, Allen Lubow, Andrea Mussini, Andrew Cai, Andrew John Ardill, Aniruddha Seal, Artak Kalantarian, Barbara Failer, Belinda Lackersteen, Benjamin Chagot, Beverly R. Haight, Bora Tastan, Boris Uitham, Brandon G. Roy, Breno Renan de Melo Cruz, Brian Echols, Brian Edward Lorenz, Bruce G. Blair, Bruno Kestemont, Charles Eastlake, Callen Joseph Bragdon, Carl Vardeman, Carlo Salerno, Casey Comisky, Catherine Louise Hayman, Catherine R. Landers, Cathy Zimov, Charles D. Coleman, Charles Robert Painter, Christopher Ince, Conor Lynagh, Dmitrii Malaniia, Douglas Craig Wheeler, Douglas Robertson, Vera Simon, Emanuele Chisari, E. Kai, Farah Rezae, Ferenc Lengyel, Flavian Tabotta, Franco Padelletti, Frisno Boström, G. Gross, George Victor McIlvaine, Gil Beecher, Gregory Hansen, Guido de Jong, Harald Feldmann, Jami Lynne Borman, Jamie Quinn, Jane Norrgard, Jason Truong, Jasper A. Diderich, Jeffrey M. Canfield, Jeffrey Photakis, Jesse Slone, Joanna Madzio, Joanne Mitchell, John Charles Stomieroski, John H. Mitch, Johnathan Robert Altenbeck, Jonas Schinkler, Jonathan Barak Weinberg, Joshua David Burbach, João C. Sequeira, Juan F. Bada Juarez, Jón Pétur Gunnarsson, Kathleen Diane Harper, Keehyoung Joo, Keith Clayton, Kenneth E. DeFord, Kevin F. Scully, Kevin M. Gildea, Kirk J. Abbey, K. L. Kohli, Kyle Stenner, Kálmán Takács, LaVerne Poussaint, Larry C. Manalo, Larry C. Withers, Lilium Carlson, Linda Wei, Luke Ryan Fisher, L. A. Carpenter, Ma Ji-hwan, Manuel Ricci, Marcus Belcastro, Marek Leniec, M. Hohmann, Mark Thompson, Matthew A. Thayer, Matthias Gaebel, Michael D. Cassidy, Michael Fagiola, Michael R. Lewis, Michael Pfützenreuter, Michael Simon, Moamen M. Elmassry, Noah Benevides, Norah Kathleen Kerr, Nupur Verma, Oak Shannon, Owen Yin, Pascal Wolfteich, Paul Gummersall, Paweł Tłuścik, Peter Gajar, Peter John Triggiani, Rajarshi Guha, Renton Braden Mathew Innes, Ricky Buchanan, Robert Gamble, Robert Leduc, Robert Spearing, Rodrigo Luccas Corrêa dos Santos Gomes, Roger D. Estep, Ryan DeWitt, Ryan M. Moore, Scott Shnider, Scott J. Zaccanelli, Sergey Kuznetsov, Sergio Burillo‐Sanz, S. Mooney, Sidoruk Vasiliy, Slava Butkovich, Spencer Bruce Hudson, Spencer Len Pote, Stephen Phillip Denne, Steven A. Schwegmann, Sumanth Ratna, Susan C. Kleinfelter, Thomas Bausewein, Thomas J. George, Tobias Scherf de Almeida, Ulas Yeginer, Walter Barmettler, Warwick Pulley, William Scott Wright, Willyanto, Wyatt Lansford, Xavier Hochart, Yoan Anthony Skander Gaiji, Yuriy Lagodich, Vivier Christian
Proceedings of the National Academy of Sciences
March 12, 2021
Cited by 151, Open Access

Abstract

The protein design problem is to identify an amino acid sequence that folds to a desired structure. Given Anfinsen's thermodynamic hypothesis of folding, this can be recast as finding an amino acid sequence for which the desired structure is the lowest-energy state. Because this calculation involves not only all possible amino acid sequences but also all possible structures, most current approaches focus instead on the more tractable problem of finding the lowest-energy amino acid sequence for the desired structure. They often check in a second step, by protein structure prediction, that the desired structure is indeed the lowest-energy conformation for the designed sequence, and typically discard a large fraction of designed sequences for which this is not the case. Here, we show that by backpropagating gradients through the transform-restrained Rosetta (trRosetta) structure prediction network from the desired structure to the input amino acid sequence, we can directly optimize over all possible amino acid sequences and all possible structures in a single calculation. We find that trRosetta calculations, which consider the full conformational landscape, can be more effective than Rosetta single-point energy estimations in predicting folding and stability of de novo designed proteins. We compare sequence design by conformational landscape optimization with the standard energy-based sequence design methodology in Rosetta and show that the former can result in energy landscapes with fewer alternative energy minima. We show further that more funneled energy landscapes can be designed by combining the strengths of the two approaches: the low-resolution trRosetta model serves to disfavor alternative states, and the high-resolution Rosetta model serves to create a deep energy minimum at the design target structure.
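The contrast the abstract draws between single-point energy design and landscape optimization can be illustrated with a toy model. The sketch below is not trRosetta or Rosetta. It assumes a small, explicitly enumerated set of conformations with made-up per-position energy tables (`W`, `N_STATES`, `TARGET`, and all function names are hypothetical), and it uses a hand-derived gradient in place of backpropagation through a network. Single-point design picks the sequence with the lowest energy in the target state alone. Landscape design runs gradient descent on continuous sequence logits to raise the Boltzmann probability of the target state relative to all alternatives.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]


def energy(seq_probs, W_k):
    """Toy energy of one conformation: a sum of per-position letter terms."""
    return sum(p * w for p_i, w_i in zip(seq_probs, W_k) for p, w in zip(p_i, w_i))


def state_probs(seq_probs, W):
    """Boltzmann distribution over all enumerated conformations (kT = 1)."""
    return softmax([-energy(seq_probs, W_k) for W_k in W])


def one_hot(seq, n_letters):
    """Encode a discrete sequence as per-position probability vectors."""
    return [[1.0 if a == s else 0.0 for a in range(n_letters)] for s in seq]


def design_single_point(W, target):
    """Single-point design: lowest-energy sequence for the target state only."""
    return [min(range(len(row)), key=row.__getitem__) for row in W[target]]


def design_landscape(W, target, steps=2000, lr=0.1):
    """Gradient descent on sequence logits, minimizing -log P(target | sequence)."""
    n_pos, n_let = len(W[0]), len(W[0][0])
    logits = [[0.0] * n_let for _ in range(n_pos)]
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        P = state_probs(probs, W)
        for i in range(n_pos):
            # d(loss)/d(probs[i][a]) = W[target][i][a] - sum_k P[k] * W[k][i][a]
            g = [W[target][i][a] - sum(P[k] * W[k][i][a] for k in range(len(W)))
                 for a in range(n_let)]
            mean_g = sum(p * x for p, x in zip(probs[i], g))
            for a in range(n_let):
                # Chain rule through the softmax over letters at position i.
                logits[i][a] -= lr * probs[i][a] * (g[a] - mean_g)
    return [softmax(row) for row in logits]


# Hypothetical toy landscape: random energy tables, state 0 is the design target.
rng = random.Random(0)
N_STATES, N_POS, N_LETTERS, TARGET = 5, 6, 4, 0
W = [[[rng.random() for _ in range(N_LETTERS)] for _ in range(N_POS)]
     for _ in range(N_STATES)]

uniform = [[1.0 / N_LETTERS] * N_LETTERS for _ in range(N_POS)]
soft = design_landscape(W, TARGET)
p_start = state_probs(uniform, W)[TARGET]  # before optimization
p_end = state_probs(soft, W)[TARGET]       # after optimization

seq_single = design_single_point(W, TARGET)
seq_landscape = [max(range(N_LETTERS), key=row.__getitem__) for row in soft]
for name, seq in [("single-point", seq_single), ("landscape", seq_landscape)]:
    hot = one_hot(seq, N_LETTERS)
    print(name, seq, "E_target=%.3f" % energy(hot, W[TARGET]),
          "P_target=%.3f" % state_probs(hot, W)[TARGET])
```

In this toy setting the single-point sequence always has the lowest target-state energy by construction, while the landscape objective also penalizes sequences that stabilize competing states. Which discrete sequence ends up with the higher target probability depends on the energy tables, so the printed comparison is illustrative only.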

