Divyanshi Srivastava

University of Lucknow

ORCID: 0000-0002-6580-6166

Publishes on genomics and chromatin dynamics, RNA and protein synthesis mechanisms, and genomics and phylogenetic studies. 19 papers and 445 citations.


Top publications by citations

Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation
Johannes Linder, Divyanshi Srivastava, Han Yuan et al.|Nature Genetics|2025
Cited by 158|Open Access

Sequence-based machine-learning models trained on genomics data improve genetic variant interpretation by providing functional predictions describing their impact on the cis-regulatory code. However, current tools do not predict RNA-seq expression profiles because of modeling challenges. Here, we introduce Borzoi, a model that learns to predict cell-type-specific and tissue-specific RNA-seq coverage from DNA sequence. Using statistics derived from Borzoi's predicted coverage, we isolate and accurately score DNA variant effects across multiple layers of regulation, including transcription, splicing and polyadenylation. Evaluated on quantitative trait loci, Borzoi is competitive with and often outperforms state-of-the-art models trained on individual regulatory functions. By applying attribution methods to the derived statistics, we extract cis-regulatory motifs driving RNA expression and post-transcriptional regulation in normal tissues. The wide availability of RNA-seq data across species, conditions and assays profiling specific aspects of regulation emphasizes the potential of this approach to decipher the mapping from DNA sequence to regulatory function.

Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation
Johannes Linder, Divyanshi Srivastava, Han Yuan et al.|bioRxiv (Cold Spring Harbor Laboratory)|2023
Cited by 64|Open Access

Sequence-based machine learning models trained on genome-scale biochemical assays improve our ability to interpret genetic variants by providing functional predictions describing their impact on the cis-regulatory code. Here, we introduce a new model, Borzoi, which learns to predict cell- and tissue-specific RNA-seq coverage from DNA sequence. Using statistics derived from Borzoi’s predicted coverage, we isolate and accurately score variant effects across multiple layers of regulation, including transcription, splicing, and polyadenylation. Evaluated on QTLs, Borzoi is competitive with, and often outperforms, state-of-the-art models trained on individual regulatory functions. By applying attribution methods to the derived statistics, we extract cis-regulatory patterns driving RNA expression and post-transcriptional regulation in normal tissues. The wide availability of RNA-seq data across species, conditions, and assays profiling specific aspects of regulation emphasizes the potential of this approach to decipher the mapping from DNA sequence to regulatory function.

Differential abilities to engage inaccessible chromatin diversify vertebrate HOX binding patterns
Cited by 54|Open Access

While Hox genes encode conserved transcription factors (TFs), they are further divided into anterior, central, and posterior groups based on their DNA-binding domain similarity. The posterior Hox group expanded in the deuterostome clade and patterns caudal and distal structures. We aim to address how similar HOX TFs diverge to induce different positional identities. We studied HOX TF DNA-binding and regulatory activity in an in vitro motor neuron differentiation system that recapitulates embryonic development. We find diversity in the genomic binding profiles of different HOX TFs, even among the posterior group paralogs that share similar DNA-binding domains. These differences in genomic binding are explained by differing abilities to bind to previously inaccessible sites. For example, the posterior group HOXC9 has a greater ability to bind occluded sites than the posterior HOXC10, producing different binding patterns and driving differential gene expression programs. From these results, we propose that the differential abilities of posterior HOX TFs to bind to previously inaccessible chromatin drive patterning diversification.

Domain-adaptive neural networks improve cross-species prediction of transcription factor binding
Cited by 43|Open Access

The intrinsic DNA sequence preferences and cell type-specific cooperative partners of transcription factors (TFs) are typically highly conserved. Hence, despite the rapid evolutionary turnover of individual TF binding sites, predictive sequence models of cell type-specific genomic occupancy of a TF in one species should generalize to closely matched cell types in a related species. To assess the viability of cross-species TF binding prediction, we train neural networks to discriminate ChIP-seq peak locations from genomic background and evaluate their performance within and across species. Cross-species predictive performance is consistently worse than within-species performance, which we show is caused in part by species-specific repeats. To account for this domain shift, we use an augmented network architecture to automatically discourage learning of training species-specific sequence features. This domain adaptation approach corrects for prediction errors on species-specific repeats and improves overall cross-species model performance. Our results show that cross-species TF binding prediction is feasible when models account for domain shifts driven by species-specific repeats.