Predicting cancer outcomes from histology and genomics using convolutional networks
Pooya Mobadersany, Safoora Yousefi, Mohamed Amgad et al.|Proceedings of the National Academy of Sciences|2018
Cancer histology reflects underlying molecular processes and disease progression and contains rich phenotypic information that is predictive of patient outcomes. In this study, we show a computational approach for learning patient outcomes from digital pathology images using deep learning to combine the power of adaptive machine learning algorithms with traditional survival models. We illustrate how these survival convolutional neural networks (SCNNs) can integrate information from both histology images and genomic biomarkers into a single unified framework to predict time-to-event outcomes and show prediction accuracy that surpasses the current clinical paradigm for predicting the overall survival of patients diagnosed with glioma. We use statistical sampling techniques to address challenges in learning survival from histology images, including tumor heterogeneity and the need for large training cohorts. We also provide insights into the prediction mechanisms of SCNNs, using heat map visualization to show that SCNNs recognize important structures, like microvascular proliferation, that are related to prognosis and that are used by pathologists in grading. These results highlight the emerging role of deep learning in precision medicine and suggest an expanding utility for computational analysis of histology in the future practice of pathology.
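The abstract above does not name the "traditional survival model" that is paired with the network. As a minimal sketch, assuming it is a Cox proportional hazards model, a network that outputs one risk score per patient could be trained by minimizing the negative Cox partial log-likelihood. The function name and the toy inputs are hypothetical and are not taken from the paper.

```python
import math

def neg_cox_partial_log_likelihood(risks, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risks  : predicted log-risk score per patient (e.g. a network output)
    times  : observed follow-up time per patient
    events : 1 if the event was observed, 0 if the patient was censored
    Tied event times are handled with the Breslow approximation.
    """
    total = 0.0
    n_events = 0
    for r_i, t_i, e_i in zip(risks, times, events):
        if not e_i:
            continue  # censored patients only appear inside risk sets
        # Risk set: every patient still under observation at time t_i.
        at_risk = [r_j for r_j, t_j in zip(risks, times) if t_j >= t_i]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(at_risk)
        log_sum = m + math.log(sum(math.exp(r - m) for r in at_risk))
        total += r_i - log_sum
        n_events += 1
    return -total / max(n_events, 1)

# Toy cohort (hypothetical): three patients, all events observed.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
well_ordered = neg_cox_partial_log_likelihood([2.0, 1.0, 0.0], times, events)
badly_ordered = neg_cox_partial_log_likelihood([0.0, 1.0, 2.0], times, events)
```

In this toy cohort the loss is lower when patients who die earlier receive higher risk scores (`well_ordered`) than when the ranking is reversed (`badly_ordered`), which is the behavior a survival loss of this kind is meant to reward.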
NuCLS: A scalable crowdsourcing approach and dataset for nucleus classification and segmentation in breast cancer
BACKGROUND: Deep learning enables accurate high-resolution mapping of cells and tissue structures that can serve as the foundation of interpretable machine-learning models for computational pathology. However, generating adequate labels for these structures is a critical barrier, given the time and effort required from pathologists. RESULTS: This article describes a novel collaborative framework for engaging crowds of medical students and pathologists to produce quality labels for cell nuclei. We used this approach to produce the NuCLS dataset, containing >220,000 annotations of cell nuclei in breast cancers. This builds on prior work labeling tissue regions to produce an integrated tissue region- and cell-level annotation dataset for training that is the largest such resource for multi-scale analysis of breast cancer histology. This article presents data and analysis results for single and multi-rater annotations from both non-experts and pathologists. We present a novel workflow that uses algorithmic suggestions to collect accurate segmentation data without the need for laborious manual tracing of nuclei. Our results indicate that even noisy algorithmic suggestions do not adversely affect pathologist accuracy and can help non-experts improve annotation quality. We also present a new approach for inferring truth from multiple raters and show that non-experts can produce accurate annotations for visually distinctive classes. CONCLUSIONS: This study is the most extensive systematic exploration of the large-scale use of wisdom-of-the-crowd approaches to generate data for computational pathology applications.
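The abstract above mentions inferring truth from multiple raters but does not describe the method. For orientation only, the sketch below shows plain majority voting, a common baseline for this kind of problem. It is a stand-in and not the approach introduced in the article. The function name, the `min_agreement` threshold, and the example labels are all hypothetical.

```python
from collections import Counter

def majority_vote(labels_per_nucleus, min_agreement=0.5):
    """Baseline consensus label per nucleus from several raters.

    labels_per_nucleus : dict mapping a nucleus id to the list of class
                         labels assigned by individual raters
    min_agreement      : fraction of raters that must be exceeded for a
                         label to be accepted; otherwise the nucleus is
                         left unresolved (None)
    """
    consensus = {}
    for nucleus_id, labels in labels_per_nucleus.items():
        if not labels:
            consensus[nucleus_id] = None
            continue
        label, count = Counter(labels).most_common(1)[0]
        agreed = count / len(labels) > min_agreement
        consensus[nucleus_id] = label if agreed else None
    return consensus

# Hypothetical annotations from three raters and from two raters.
votes = {
    "n1": ["tumor", "tumor", "lymphocyte"],
    "n2": ["stroma", "tumor"],
}
result = majority_vote(votes)
```

Here `n1` resolves to `"tumor"` because two of three raters agree, while `n2` stays unresolved because the raters are split evenly. Majority voting treats every rater as equally reliable, which is the main limitation that more elaborate truth-inference methods try to address.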
Predicting cancer outcomes from histology and genomics using convolutional networks
Pooya Mobadersany, Safoora Yousefi, Mohamed Amgad et al.|bioRxiv (Cold Spring Harbor Laboratory)|2017
Cancer histology reflects underlying molecular processes and disease progression, and contains rich phenotypic information that is predictive of patient outcomes. In this study, we demonstrate a computational approach for learning patient outcomes from digital pathology images using deep learning to combine the power of adaptive machine learning algorithms with traditional survival models. We illustrate how this approach can integrate information from both histology images and genomic biomarkers to predict time-to-event patient outcomes, and demonstrate performance surpassing the current clinical paradigm for predicting the survival of patients diagnosed with glioma. We also provide techniques to visualize the tissue patterns learned by these deep learning survival models, and establish a framework for addressing intratumoral heterogeneity and training data deficits.
GestAltNet: aggregation and attention to improve deep learning of gestational age from placental whole-slide images

Interactive Classification of Whole-Slide Imaging Data for Cancer Researchers
Whole-slide histology images contain information that is valuable for clinical and basic science investigations of cancer but extracting quantitative measurements from these images is challenging for researchers who are not image analysis specialists. In this article, we describe HistomicsML2, a software tool for learn-by-example training of machine learning classifiers for histologic patterns in whole-slide images. This tool improves training efficiency and classifier performance by guiding users to the most informative training examples for labeling and can be used to develop classifiers for prospective application or as a rapid annotation tool that is adaptable to different cancer types. HistomicsML2 runs as a containerized server application that provides web-based user interfaces for classifier training, validation, exporting inference results, and collaborative review, and that can be deployed on GPU servers or cloud platforms. We demonstrate the utility of this tool by using it to classify tumor-infiltrating lymphocytes in breast carcinoma and cutaneous melanoma. SIGNIFICANCE: An interactive machine learning tool for analyzing digital pathology images enables cancer researchers to apply this tool to measure histologic patterns for clinical and basic science studies.
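The abstract above says the tool guides users to the most informative training examples but does not say how those examples are chosen. One common way to do this is uncertainty sampling, sketched below under that assumption: the samples whose predicted probability is closest to 0.5 are shown to the user first. The function name and the example probabilities are hypothetical, and this is not presented as the HistomicsML2 implementation.

```python
def most_uncertain(probabilities, k=3):
    """Indices of the k samples a binary classifier is least sure about.

    probabilities : predicted positive-class probability per unlabeled sample
    k             : number of samples to propose for labeling
    Uncertainty is measured as distance from 0.5 (smaller = less certain).
    """
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]

# Hypothetical classifier outputs for five unlabeled image regions.
probs = [0.95, 0.52, 0.10, 0.45, 0.70]
to_label = most_uncertain(probs, k=2)
```

With these toy values the two regions proposed for labeling are those at indices 1 and 3, whose probabilities (0.52 and 0.45) lie nearest the decision boundary. After the user labels them, the classifier would be retrained and the ranking recomputed.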