G

Greg Finak

Harvard University

ORCID: 0000-0003-4341-9090

Publishes on Helicobacter pylori-related gastroenterology studies, Eosinophilic Esophagitis, Single-cell and spatial transcriptomics. 212 papers and 10.4k citations.

212Publications
10.4kTotal Citations

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data
Greg Finak, Andrew McDavid, Masanao Yajima et al.|Genome biology|2015
Cited by 3.5kOpen Access

Single-cell transcriptomics reveals gene expression heterogeneity but suffers from stochastic dropout and characteristic bimodal expression distributions in which expression is either strongly non-zero or non-detectable. We propose a two-part, generalized linear model for such bimodal data that parameterizes both of these features. We argue that the cellular detection rate, the fraction of genes expressed in a cell, should be adjusted for as a source of nuisance variation. Our model provides gene set enrichment analysis tailored to single-cell data. It provides insights into how networks of co-expressed genes evolve across an experimental treatment. MAST is available at https://github.com/RGLab/MAST .

Critical assessment of automated flow cytometry data analysis techniques
Nima Aghaeepour, Greg Finak, Holger H. Hoos et al.|Nature Methods|2013
Cited by 618Open Access

In this analysis, the authors directly compared the performance of flow cytometry data processing algorithms to manual gating approaches. The results offer information of practical utility about the performance of the algorithms as applied to different data sets and challenges. Traditional methods for flow cytometry (FCM) data processing rely on subjective manual gating. Recently, several groups have developed computational methods for identifying cell populations in multidimensional FCM data. The Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) challenges were established to compare the performance of these methods on two tasks: (i) mammalian cell population identification, to determine whether automated algorithms can reproduce expert manual gating and (ii) sample classification, to determine whether analysis pipelines can identify characteristics that correlate with external variables (such as clinical outcome). This analysis presents the results of the first FlowCAP challenges. Several methods performed well as compared to manual gating or external variables using statistical performance measures, which suggests that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.

Data exploration, quality control and testing in single-cell qPCR-based gene expression experiments
Andrew McDavid, Greg Finak, Pratip K. Chattopadyay et al.|Bioinformatics|2012
Cited by 442Open Access

MOTIVATION: Cell populations are never truly homogeneous; individual cells exist in biochemical states that define functional differences between them. New technology based on microfluidic arrays combined with multiplexed quantitative polymerase chain reactions now enables high-throughput single-cell gene expression measurement, allowing assessment of cellular heterogeneity. However, few analytic tools have been developed specifically for the statistical and analytical challenges of single-cell quantitative polymerase chain reactions data. RESULTS: We present a statistical framework for the exploration, quality control and analysis of single-cell gene expression data from microfluidic arrays. We assess accuracy and within-sample heterogeneity of single-cell expression and develop quality control criteria to filter unreliable cell measurements. We propose a statistical model accounting for the fact that genes at the single-cell level can be on (and a continuous expression measure is recorded) or dichotomously off (and the recorded expression is zero). Based on this model, we derive a combined likelihood ratio test for differential expression that incorporates both the discrete and continuous components. Using an experiment that examines treatment-specific changes in expression, we show that this combined test is more powerful than either the continuous or dichotomous component in isolation, or a t-test on the zero-inflated data. Although developed for measurements from a specific platform (Fluidigm), these tools are generalizable to other multi-parametric measures over large numbers of events. AVAILABILITY: All results presented here were obtained using the SingleCellAssay R package available on GitHub (http://github.com/RGLab/SingleCellAssay).