Vipin Kumar

JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding profiles

Ieva Rauluševičiūtė, Rafael Riudavets Puig, Romain Blanc‐Mathieu et al.|Nucleic Acids Research|2023

Cited by 1k|Open Access

JASPAR (https://jaspar.elixir.no/) is a widely-used open-access database presenting manually curated high-quality and non-redundant DNA-binding profiles for transcription factors (TFs) across taxa. In this 10th release and 20th-anniversary update, the CORE collection has expanded with 329 new profiles. We updated three existing profiles and provided orthogonal support for 72 profiles from the previous release's UNVALIDATED collection. Altogether, the JASPAR 2024 update provides a 20% increase in CORE profiles from the previous release. A trimming algorithm enhanced profiles by removing low information content flanking base pairs, which were likely uninformative (within the capacity of the PFM models) for TFBS predictions and modelling TF-DNA interactions. This release includes enhanced metadata, featuring a refined classification for plant TFs' structural DNA-binding domains. The new JASPAR collections prompt updates to the genomic tracks of predicted TF binding sites (TFBSs) in 8 organisms, with human and mouse tracks available as native tracks in the UCSC Genome browser. All data are available through the JASPAR web interface and programmatically through its API and the updated Bioconductor and pyJASPAR packages. Finally, a new TFBS extraction tool enables users to retrieve predicted JASPAR TFBSs intersecting their genomic regions of interest.
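The flank trimming mentioned in this abstract can be illustrated with a minimal sketch. It is not the JASPAR implementation: the 0.5-bit threshold, the uniform background, and the function names `column_ic` and `trim_flanks` are assumptions made for illustration. Each position frequency matrix (PFM) column is scored by its information content, and low-scoring columns are dropped from both ends only.

```python
import math


def column_ic(column):
    """Information content (bits) of one PFM column, assuming a uniform background."""
    total = sum(column)
    ic = 2.0  # log2(4): maximum for a four-letter alphabet
    for count in column:
        if count > 0:
            p = count / total
            ic += p * math.log2(p)
    return ic


def trim_flanks(pfm, min_ic=0.5):
    """Drop leading and trailing columns whose information content is below min_ic.

    pfm is a list of [A, C, G, T] count columns. The threshold is a
    hypothetical value, not the one used by JASPAR.
    """
    keep = [i for i, col in enumerate(pfm) if column_ic(col) >= min_ic]
    if not keep:
        return []
    # Internal low-information columns between the first and last kept
    # positions are retained; only the flanks are removed.
    return pfm[keep[0]:keep[-1] + 1]


# Made-up counts: informative core flanked by near-uniform columns.
example_pfm = [
    [25, 25, 25, 25],  # uninformative flank
    [100, 0, 0, 0],    # fully conserved A
    [25, 25, 25, 25],  # internal gap position, kept
    [0, 0, 100, 0],    # fully conserved G
    [30, 20, 25, 25],  # near-uniform flank
]
trimmed = trim_flanks(example_pfm)
```

Trimming only from the ends keeps spacer positions inside a motif, which matters for profiles with two conserved half-sites.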

Slide-tags enables single-nucleus barcoding for multimodal spatial genomics

Andrew J. C. Russell, Jackson A. Weir, Naeem Nadaf et al.|Nature|2023

Cited by 193|Open Access

[...] However, missing from these measurements is the ability to routinely and easily spatially localize these profiled cells. We developed a strategy, Slide-tags, in which single nuclei within an intact tissue section are tagged with spatial barcode oligonucleotides derived from DNA-barcoded beads with known positions. These tagged nuclei can then be used as an input into a wide variety of single-nucleus profiling assays. Application of Slide-tags to the mouse hippocampus positioned nuclei at less than 10 μm spatial resolution and delivered whole-transcriptome data that are indistinguishable in quality from ordinary single-nucleus RNA-sequencing data. To demonstrate that Slide-tags can be applied to a wide variety of human tissues, we performed the assay on brain, tonsil and melanoma. We revealed cell-type-specific spatially varying gene expression across cortical layers and spatially contextualized receptor-ligand interactions driving B cell maturation in lymphoid tissue. A major benefit of Slide-tags is that it is easily adaptable to almost any single-cell measurement technology. As a proof of principle, we performed multiomic measurements of open chromatin, RNA and T cell receptor (TCR) sequences in the same cells from metastatic melanoma, identifying transcription factor motifs driving cancer cell state transitions in spatially distinct microenvironments. Slide-tags offers a universal platform for importing the compendium of established single-cell measurements into the spatial genomics repertoire.

Sub-nucleosomal Genome Structure Reveals Distinct Nucleosome Folding Motifs

M. Ohno, Tadashi Ando, David G. Priest et al.|Cell|2019

Cited by 157|Open Access

An Integrated Quantitative Proteomics Workflow for Cancer Biomarker Discovery and Validation in Plasma

Vipin Kumar, Sandipan Ray, Saicharan Ghantasala et al.|Frontiers in Oncology|2020

Cited by 45|Open Access

Blood plasma is one of the most widely used samples for cancer biomarker discovery research as well as clinical investigations for diagnostic and therapeutic purposes. However, the plasma proteome is extremely complex due to its wide dynamic range of protein concentrations and the presence of high-abundance proteins. Here we have described an optimized, integrated quantitative proteomics pipeline combining the label-free and multiplexed-labeling-based (iTRAQ and TMT) plasma proteome profiling methods for biomarker discovery, followed by the targeted approaches for validation of the identified potential marker proteins. In this workflow, the targeted quantitation of proteins is carried out by multiple-reaction monitoring (MRM) and parallel-reaction monitoring (PRM) mass spectrometry. Thus, our approach enables both unbiased screenings of biomarkers and their subsequent selective validation in human plasma. The overall procedure takes only ~2 days to complete, including the time for data acquisition (excluding database searching). This protocol is quick, flexible, and eliminates the need for a separate immunoassay-based validation workflow in blood cancer biomarker investigations. We anticipate that this plasma proteomics workflow will help to accelerate the cancer biomarker discovery program and provide a valuable resource to the cancer research community.

JASPAR 2026: expansion of transcription factor binding profiles and integration of deep learning models

Damla Ovek, Ieva Rauluševičiūtė, Dina Ruud Aronsen et al.|Nucleic Acids Research|2025

Cited by 37|Open Access

JASPAR (https://jaspar.elixir.no/) is an open-access database that has provided high-quality, manually curated, and non-redundant DNA binding profiles for transcription factors (TFs) as position frequency matrices (PFMs) for over 20 years. We expanded the CORE (306 new profiles, 12% increase) and UNVALIDATED (433, 60% increase) collections with new PFMs and updated 13 existing profiles. We updated the TF binding site predictions and genome tracks for eight species. TF binding profile clusters and familial TF binding sites were updated accordingly. We integrate the inMOTIFin software to easily simulate regulatory sequences using JASPAR PFMs. To enrich TFs' annotations, we provide scientific literature-based human TF target information. Notably, this release features a deep learning (DL) collection, providing a paradigm shift in modeling and characterizing TF-DNA interactions with 1259 BPNet models trained on Homo sapiens ENCODE chromatin immunoprecipitation followed by sequencing (ChIP-seq) datasets from 240 TFs and interpreted to reveal predictive motif patterns for the models. The motifs associated with the same TF were clustered to provide a summary of the binding properties, resulting in 240 primary and 113 alternative motif patterns in the DL collection. The JASPAR 2026 collections lay a foundation for future endeavors in genomic research, serving the scientific community in uncovering the mechanisms of gene regulation.
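As a rough illustration of how a PFM can be used to predict binding sites, the sketch below converts counts to log-odds scores and slides the resulting matrix along a sequence. It is a generic textbook approach, not the pipeline used for the JASPAR genome tracks: the pseudocount, the uniform background, the toy counts, and the names `pfm_to_pwm` and `scan` are all assumptions.

```python
import math

BASES = "ACGT"


def pfm_to_pwm(pfm, pseudocount=1.0, background=0.25):
    """Convert [A, C, G, T] count columns to log2-odds scores.

    The pseudocount and the uniform background are illustrative choices.
    """
    pwm = []
    for column in pfm:
        total = sum(column) + 4 * pseudocount
        pwm.append([
            math.log2(((count + pseudocount) / total) / background)
            for count in column
        ])
    return pwm


def scan(sequence, pwm):
    """Score every window of the forward strand; returns (start, score) pairs.

    Only the characters A, C, G and T are supported in this sketch.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


# Hypothetical counts favouring the motif "ACG".
toy_pfm = [[8, 0, 1, 1], [0, 9, 0, 1], [1, 0, 9, 0]]
hits = scan("TTACGTT", pfm_to_pwm(toy_pfm))
best_start, best_score = max(hits, key=lambda hit: hit[1])
```

A practical scan would also score the reverse complement and apply a score or p-value cutoff before reporting a site.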
