Integrating gene annotation with orthology inference at scale

Bogdan Kirilenko (Goethe University Frankfurt), Chetan Munegowda (Goethe University Frankfurt), Ekaterina Osipova (Goethe University Frankfurt), David Jebb (Max Planck Institute for the Physics of Complex Systems), Virag Sharma (Max Planck Institute for the Physics of Complex Systems), Moritz Blumer (Max Planck Institute for the Physics of Complex Systems), Ariadna E. Morales (Goethe University Frankfurt), Alexis-Walid Ahmed (Goethe University Frankfurt), Dimitrios‐Georgios Kontopoulos (Goethe University Frankfurt), Leon Hilgers (Goethe University Frankfurt), Kerstin Lindblad‐Toh (Broad Institute), Elinor K. Karlsson (Broad Institute), Michael Hiller (Goethe University Frankfurt), Gregory Andrews, Joel Armstrong, Matteo Bianchi, Bruce W. Birren, Kevin R. Bredemeyer, Ana M. Breit, Matthew J. Christmas, Hiram Clawson, Joana Damas, Federica Di Palma, Mark Diekhans, Michael X. Dong, Eduardo Eizirik, Kaili Fan, Cornelia Fanter, Nicole M. Foley, Karin Forsberg‐Nilsson, Carlos J. Garcia, John Gatesy, Steven Gazal, Diane P. Genereux, Linda Goodman, Jenna Grimshaw, Michaela K. Halsey, Andrew J. Harris, Glenn Hickey, Michael Hiller (Goethe University Frankfurt), Allyson G. Hindle, Robert Hubley, Graham M. Hughes, Jeremy Johnson, David Juan (Max Planck Institute for the Physics of Complex Systems), Irene M. Kaplow, Elinor K. Karlsson (Broad Institute), Kathleen C. Keough, Bogdan Kirilenko (Goethe University Frankfurt), Klaus‐Peter Koepfli, Jennifer M. Korstian, Amanda Kowalczyk, Sergey V. Kozyrev, Alyssa J. Lawler, Colleen Lawless, Thomas Lehmann, Danielle L. Levesque, Harris A. Lewin, Xue Li (Broad Institute), Abigail Lind (Broad Institute), Kerstin Lindblad‐Toh (Broad Institute), Ava Mackay-Smith, Voichita D. Marinescu, Tomàs Marquès‐Bonet, Victor C. Mason, Jennifer R. S. Meadows, Wynn K. Meyer, Jill E. Moore, Lucas R. Moreira, Diana D. Moreno-Santillán, Kathleen M. Morrill, Gerard Muntané, William J. Murphy, Arcadi Navarro, Martin Nweeia, Sylvia Ortmann, Austin Osmanski, Benedict Paten, Nicole S. Paulat, Andreas R. Pfenning, BaDoi N. Phan, Katherine S. Pollard, Henry Pratt, David A. Ray (Max Planck Institute for the Physics of Complex Systems), Steven K. Reilly, Jeb Rosen, Irina Ruf, Louise Ryan, Oliver A. Ryder, Pardis C. Sabeti, Daniel E. Schäffer, Aitor Serres, Beth Shapiro, Arian F. A. Smit, Mark S. Springer, Chaitanya Srinivasan, Cynthia Steiner, Jessica M. Storer, Kevin A. Sullivan, Patrick F. Sullivan, Elisabeth Sundström, Megan A. Supple, Ross Swofford, Joy-El Talbot, Emma C. Teeling, Jason Turner-Maier, Alejandro Valenzuela, Franziska Wagner, Ola Wallerman, Chao Wang, Juehan Wang, Zhiping Weng, Aryn P. Wilder, Morgan Wirthlin, James R. Xue, Xiaomeng Zhang
Science
April 27, 2023
Open Access

Abstract

Annotating coding genes and inferring orthologs are two classical challenges in genomics and evolutionary biology that have traditionally been approached separately, limiting scalability. We present TOGA (Tool to infer Orthologs from Genome Alignments), a method that integrates structural gene annotation and orthology inference. TOGA implements a different paradigm to infer orthologous loci, improves ortholog detection and annotation of conserved genes compared with state-of-the-art methods, and handles even highly fragmented assemblies. TOGA scales to hundreds of genomes, which we demonstrate by applying it to 488 placental mammal and 501 bird assemblies, creating the largest comparative gene resources so far. Additionally, TOGA detects gene losses, enables selection screens, and automatically provides a superior measure of mammalian genome quality. TOGA is a powerful and scalable method to annotate and compare genes in the genomic era.

