Learning representations for image-based profiling of perturbations

Nikita Moshkov (HUN-REN Szegedi Biológiai Kutatóközpont), Michael Bornholdt (Broad Institute), Santiago Benoit (Broad Institute), Matthew Smith (Broad Institute), Claire McQuin (Broad Institute), Allen Goodman (Broad Institute), Rebecca A. Senft (Broad Institute), Yu Han (Broad Institute), Mehrtash Babadi (Broad Institute), Péter Horváth (HUN-REN Szegedi Biológiai Kutatóközpont), Beth A. Cimini (Broad Institute), Anne E. Carpenter (Broad Institute), Shantanu Singh (Broad Institute), Juan Carlos Caicedo (Broad Institute)
Nature Communications
February 21, 2024
Cited by 99
Open Access

Abstract

Measuring the phenotypic effect of treatments on cells through imaging assays is an efficient and powerful way of studying cell biology, and requires computational methods for transforming images into quantitative data. Here, we present an improved strategy for learning representations of treatment effects from high-throughput imaging, following a causal interpretation. We use weakly supervised learning for modeling associations between images and treatments, and show that it encodes both confounding factors and phenotypic features in the learned representation. To facilitate their separation, we constructed a large training dataset with images from five different studies to maximize experimental diversity, following insights from our causal analysis. Training a model with this dataset successfully improves downstream performance, and produces a reusable convolutional network for image-based profiling, which we call Cell Painting CNN. We evaluated our strategy on three publicly available Cell Painting datasets, and observed that the Cell Painting CNN improves performance in downstream analysis by up to 30% with respect to classical features, while also being more computationally efficient.
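
The weakly supervised idea described in the abstract can be illustrated with a toy sketch: a classifier is trained to predict which treatment a sample received, and its hidden layer is then reused as the representation. This is not the Cell Painting CNN. As assumptions for illustration only, synthetic feature vectors stand in for cell images, a one-hidden-layer network stands in for the convolutional network, and treatment-level profiles are formed by averaging embeddings. All function names and parameters below are hypothetical.

```python
import numpy as np


def train_weakly_supervised(x, treatment_ids, n_hidden=8, n_steps=300, lr=0.1, seed=0):
    """Train a one-hidden-layer softmax classifier to predict treatment labels.

    The treatment label is the only supervision (no phenotype annotations).
    Returns the two weight matrices and the per-step cross-entropy loss.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = x.shape
    n_treatments = int(treatment_ids.max()) + 1
    w1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
    w2 = rng.normal(scale=0.1, size=(n_hidden, n_treatments))
    onehot = np.eye(n_treatments)[treatment_ids]
    losses = []
    for _ in range(n_steps):
        # Forward pass: hidden representation, then softmax over treatments.
        h = np.tanh(x @ w1)
        logits = h @ w2
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs = probs / probs.sum(axis=1, keepdims=True)
        true_class_probs = probs[np.arange(n_samples), treatment_ids]
        losses.append(float(-np.mean(np.log(true_class_probs + 1e-12))))
        # Backward pass: gradients of the mean cross-entropy loss.
        grad_logits = (probs - onehot) / n_samples
        grad_w2 = h.T @ grad_logits
        grad_h = grad_logits @ w2.T
        grad_w1 = x.T @ (grad_h * (1.0 - h ** 2))
        # Plain full-batch gradient descent step.
        w1 -= lr * grad_w1
        w2 -= lr * grad_w2
    return w1, w2, losses


def embed(x, w1):
    """Return the hidden-layer activations, used here as the learned representation."""
    return np.tanh(x @ w1)


def treatment_profiles(embeddings, treatment_ids):
    """Average the embeddings of all samples that share a treatment (an assumed aggregation)."""
    n_treatments = int(treatment_ids.max()) + 1
    return np.stack(
        [embeddings[treatment_ids == t].mean(axis=0) for t in range(n_treatments)]
    )


# Synthetic stand-in data: 3 treatments, 50 samples each, 5 features per sample.
rng = np.random.default_rng(42)
treatment_ids = np.repeat(np.arange(3), 50)
centers = rng.normal(size=(3, 5)) * 2.0
x = centers[treatment_ids] + rng.normal(scale=0.5, size=(150, 5))

w1, w2, losses = train_weakly_supervised(x, treatment_ids)
profiles = treatment_profiles(embed(x, w1), treatment_ids)
```

In this sketch, `profiles` holds one vector per treatment. Because the classifier sees only treatment labels, anything that correlates with treatment in the training data, including batch or plate effects, can end up in the hidden layer, which is the confounding concern the abstract raises.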

