V

Vamsi K. Kodali

National Institutes of Health

ORCID: 0000-0001-8941-9761

Publishes on Endoplasmic Reticulum Stress and Disease, Genomics and Phylogenetic Studies, RNA and protein synthesis mechanisms. 30 papers and 9.9k citations.

30Publications
9.9kTotal Citations

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation
Nuala A. O’Leary, Matt W. Wright, J. Rodney Brister et al.|Nucleic Acids Research|2015
Cited by 7kOpen Access

The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55,000 organisms (>4800 viruses, >40,000 prokaryotes and >10,000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.

Improved reference genome of Aedes aegypti informs arbovirus vector control
Cited by 679Open Access

Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science. We anchored physical and cytogenetic maps, doubled the number of known chemosensory ionotropic receptors that guide mosquitoes to human hosts and egg-laying sites, provided further insight into the size and composition of the sex-determining M locus, and revealed copy-number variation among glutathione S-transferase genes that are important for insecticide resistance. Using high-resolution quantitative trait locus and population genomic analyses, we mapped new candidates for dengue vector competence and insecticide resistance. AaegL5 will catalyse new biological insights and intervention strategies to fight this deadly disease vector.

A joint NCBI and EMBL-EBI transcript set for clinical genomics and research
Cited by 582Open Access

Abstract Comprehensive genome annotation is essential to understand the impact of clinically relevant variants. However, the absence of a standard for clinical reporting and browser display complicates the process of consistent interpretation and reporting. To address these challenges, Ensembl/GENCODE 1 and RefSeq 2 launched a joint initiative, the Matched Annotation from NCBI and EMBL-EBI (MANE) collaboration, to converge on human gene and transcript annotation and to jointly define a high-value set of transcripts and corresponding proteins. Here, we describe the MANE transcript sets for use as universal standards for variant reporting and browser display. The MANE Select set identifies a representative transcript for each human protein-coding gene, whereas the MANE Plus Clinical set provides additional transcripts at loci where the Select transcripts alone are not sufficient to report all currently known clinical variants. Each MANE transcript represents an exact match between the exonic sequences of an Ensembl/GENCODE transcript and its counterpart in RefSeq such that the identifiers can be used synonymously. We have now released MANE Select transcripts for 97% of human protein-coding genes, including all American College of Medical Genetics and Genomics Secondary Findings list v3.0 (ref. 3 ) genes. MANE transcripts are accessible from major genome browsers and key resources. Widespread adoption of these transcript sets will increase the consistency of reporting, facilitate the exchange of data regardless of the annotation source and help to streamline clinical interpretation.

NCBI RefSeq: reference sequence standards through 25 years of curation and annotation
Tamara Goldfarb, Vamsi K. Kodali, Shashikant Pujar et al.|Nucleic Acids Research|2024
Cited by 252Open Access

Reference sequences and annotations serve as the foundation for many lines of research today, from organism and sequence identification to providing a core description of the genes, transcripts and proteins found in an organism's genome. Interpretation of data including transcriptomics, proteomics, sequence variation and comparative analyses based on reference gene annotations informs our understanding of gene function and possible disease mechanisms, leading to new biomedical discoveries. The Reference Sequence (RefSeq) resource created at the National Center for Biotechnology Information (NCBI) leverages both automatic processes and expert curation to create a robust set of reference sequences of genomic, transcript and protein data spanning the tree of life. RefSeq continues to refine its annotation and quality control processes and utilize better quality genomes resulting from advances in sequencing technologies as well as RNA-Seq data to produce high-quality annotated genomes, ortholog predictions across more organisms and other products that are easily accessible through multiple NCBI resources. This report summarizes the current status of the eukaryotic, prokaryotic and viral RefSeq resources, with a focus on eukaryotic annotation, the increase in taxonomic representation and the effect it will have on comparative genomics. The RefSeq resource is publicly accessible at https://www.ncbi.nlm.nih.gov/refseq.

Population genomics of the critically endangered kākāpō
Cited by 238Open Access

The kākāpō is a flightless parrot endemic to New Zealand. Once common in the archipelago, only 201 individuals remain today, most of them descending from an isolated island population. We report the first genome-wide analyses of the species, including a high-quality genome assembly for kākāpō, one of the first chromosome-level reference genomes sequenced by the Vertebrate Genomes Project (VGP). We also sequenced and analyzed 35 modern genomes from the sole surviving island population and 14 genomes from the extinct mainland population. While theory suggests that such a small population is likely to have accumulated deleterious mutations through genetic drift, our analyses on the impact of the long-term small population size in kākāpō indicate that present-day island kākāpō have a reduced number of harmful mutations compared to mainland individuals. We hypothesize that this reduced mutational load is due to the island population having been subjected to a combination of genetic drift and purging of deleterious mutations, through increased inbreeding and purifying selection, since its isolation from the mainland ∼10,000 years ago. Our results provide evidence that small populations can survive even when isolated for hundreds of generations. This work provides key insights into kākāpō breeding and recovery and more generally into the application of genetic tools in conservation efforts for endangered species.