Marina Sirota

University of California, San Francisco

ORCID: 0000-0002-7246-6083

Publishes on Reproductive System and Pregnancy, Systemic Lupus Erythematosus Research, Bioinformatics and Genomic Networks. 362 papers and 16.3k citations.

Top publications by citations

Identifying a High Fraction of the Human Genome to be under Selective Constraint Using GERP++
Eugene Davydov, David L. Goode, Marina Sirota et al.|PLoS Computational Biology|2010
Cited by 1.9k|Open Access

Computational efforts to identify functional elements within genomes leverage comparative sequence information by looking for regions that exhibit evidence of selective constraint. One way of detecting constrained elements is to follow a bottom-up approach by computing constraint scores for individual positions of a multiple alignment and then defining constrained elements as segments of contiguous, high-scoring nucleotide positions. Here we present GERP++, a new tool that uses maximum likelihood evolutionary rate estimation for position-specific scoring and, in contrast to previous bottom-up methods, a novel dynamic programming approach to subsequently define constrained elements. GERP++ evaluates a richer set of candidate element breakpoints and ranks them based on statistical significance, eliminating the need for biased heuristic extension techniques. Using GERP++ we identify over 1.3 million constrained elements spanning over 7% of the human genome. We predict a higher fraction than earlier estimates largely due to the annotation of longer constrained elements, which improves the one-to-one correspondence between predicted elements and known functional sequences. GERP++ is an efficient and effective tool to provide both nucleotide- and element-level constraint scores within deep multiple sequence alignments.

Ten Years of Pathway Analysis: Current Approaches and Outstanding Challenges
Purvesh Khatri, Marina Sirota, Atul J. Butte|PLoS Computational Biology|2012
Cited by 1.6k|Open Access

Pathway analysis has become the first choice for gaining insight into the underlying biology of differentially expressed genes and proteins, as it reduces complexity and has increased explanatory power. We discuss the evolution of knowledge base-driven pathway analysis over its first decade, distinctly divided into three generations. We also discuss the limitations that are specific to each generation, and how they are addressed by successive generations of methods. We identify a number of annotation challenges that must be addressed to enable development of the next generation of pathway analysis methods. Furthermore, we identify a number of methodological challenges that the next generation of methods must tackle to take advantage of the technological advances in genomics and proteomics in order to improve specificity, sensitivity, and relevance of pathway analysis.

Systematic pan-cancer analysis of tumour purity
Dvir Aran, Marina Sirota, Atul J. Butte|Nature Communications|2015
Cited by 1.3k|Open Access

The tumour microenvironment comprises the non-cancerous cells present in and around a tumour, mainly immune cells but also fibroblasts and the cells that make up supporting blood vessels. These non-cancerous components of the tumour may play an important role in cancer biology. They also have a strong influence on the genomic analysis of tumour samples, and may alter the biological interpretation of results. Here we present a systematic analysis using different measurement modalities of tumour purity in >10,000 samples across 21 cancer types from the Cancer Genome Atlas. Patients are stratified according to clinical features in an attempt to detect clinical differences driven by purity levels. We demonstrate the confounding effect of tumour purity on correlating and clustering tumours with transcriptomics data. Finally, using a differential expression method that accounts for tumour purity, we find an immunotherapy gene signature in several cancer types that is not detected by traditional differential expression analyses.

Discovery and Preclinical Validation of Drug Indications Using Compendia of Public Gene Expression Data
Marina Sirota, Joel T. Dudley, Jeewon Kim et al.|Science Translational Medicine|2011
Cited by 832

The application of established drug compounds to new therapeutic indications, known as drug repositioning, offers several advantages over traditional drug development, including reduced development costs and shorter paths to approval. Recent approaches to drug repositioning use high-throughput experimental methods to assess a compound's potential therapeutic qualities. Here, we present a systematic computational approach to predict novel therapeutic indications on the basis of comprehensive testing of molecular signatures in drug-disease pairs. We integrated gene expression measurements from 100 diseases and gene expression measurements on 164 drug compounds, yielding predicted therapeutic potentials for these drugs. We recovered many known drug and disease relationships using computationally derived therapeutic potentials and also predicted many new indications for these 164 drugs. We experimentally validated a prediction for the antiulcer drug cimetidine as a candidate therapeutic in the treatment of lung adenocarcinoma, and demonstrated its efficacy both in vitro and in vivo using mouse xenograft models. This computational method provides a systematic approach for repositioning established drugs to treat a wide range of human diseases.

Comprehensive analysis of normal adjacent to tumor transcriptomes
Dvir Aran, Roman Camarda, Justin I. Odegaard et al.|Nature Communications|2017
Cited by 644|Open Access

Histologically normal tissue adjacent to the tumor (NAT) is commonly used as a control in cancer studies. However, little is known about the transcriptomic profile of NAT, how it is influenced by the tumor, and how the profile compares with non-tumor-bearing tissues. Here, we integrate data from the Genotype-Tissue Expression project and The Cancer Genome Atlas to comprehensively analyze the transcriptomes of healthy, NAT, and tumor tissues in 6506 samples across eight tissues and corresponding tumor types. Our analysis shows that NAT presents a unique intermediate state between healthy and tumor. Differential gene expression and protein-protein interaction analyses reveal altered pathways shared among NATs across tissue types. We characterize a set of 18 genes that are specifically activated in NATs. By applying pathway and tissue composition analyses, we suggest a pan-cancer mechanism in which pro-inflammatory signals from the tumor stimulate an inflammatory response in the adjacent endothelium.