Learning representations for image-based profiling of perturbations

Nikita Moshkov (HUN-REN Szegedi Biológiai Kutatóközpont), Michael Bornholdt (Broad Institute), Santiago Benoit (Carnegie Mellon University), Matthew Smith (Harvard College Observatory), Claire McQuin (Broad Institute), Allen Goodman, Rebecca A. Senft (Broad Institute), Yu Han (Broad Institute), Mehrtash Babadi (Broad Institute), Péter Horváth (HUN-REN Szegedi Biológiai Kutatóközpont), Beth A. Cimini (Broad Institute), Anne E. Carpenter (Broad Institute), Shantanu Singh (Broad Institute), Juan C. Caicedo (Broad Institute)
bioRxiv (Cold Spring Harbor Laboratory)
August 15, 2022
Cited by 45, Open Access

Abstract

Measuring the phenotypic effect of treatments on cells through imaging assays is an efficient and powerful way of studying cell biology, and it requires computational methods for transforming images into quantitative data that highlight phenotypic outcomes. Here, we present an optimized strategy for learning representations of treatment effects from high-throughput imaging data, which follows a causal framework for interpreting results and guiding performance improvements. We use weakly supervised learning (WSL) to model associations between images and treatments, and show that it encodes both confounding factors and phenotypic features in the learned representation. To facilitate their separation, and following insights from our causal analysis, we constructed a large training dataset with Cell Painting images from five different studies to maximize experimental diversity. Training a WSL model with this dataset improves downstream performance and produces a reusable convolutional network for image-based profiling, which we call Cell Painting CNN-1. We conducted a comprehensive evaluation of our strategy on three publicly available Cell Painting datasets and found that representations obtained with the Cell Painting CNN-1 improve performance in downstream biological matching by up to 30% relative to classical features, while also being more computationally efficient.
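
To make the downstream "biological matching" step concrete, the following is a minimal sketch of how learned single-cell features might be turned into treatment-level profiles and compared. It is not the authors' pipeline: mean aggregation, cosine similarity, and the function names aggregate_profiles, cosine_similarity_matrix, and nearest_match are all assumptions introduced here for illustration, and the abstract does not specify any of them.

```python
import numpy as np


def aggregate_profiles(features, treatment_ids):
    """Average single-cell feature vectors into one profile per treatment.

    Mean aggregation is an assumption; other summary statistics are possible.
    """
    labels = np.unique(treatment_ids)
    profiles = np.stack(
        [features[treatment_ids == t].mean(axis=0) for t in labels]
    )
    return labels, profiles


def cosine_similarity_matrix(profiles):
    """Pairwise cosine similarity between treatment profiles."""
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = profiles / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def nearest_match(labels, profiles):
    """Map each treatment to its most similar other treatment."""
    sim = cosine_similarity_matrix(profiles)
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    best = sim.argmax(axis=1)
    return {labels[i].item(): labels[j].item() for i, j in enumerate(best)}
```

In this sketch, features would be an array of shape (cells, dimensions) produced by a feature extractor, and treatment_ids an array of the same length naming the treatment applied to each cell. Treatments whose profiles point in similar directions are reported as matches, which is one simple way to ask whether a representation groups biologically related perturbations.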

