Kevin Boehm

Memorial Sloan Kettering Cancer Center

ORCID: 0000-0003-2426-5436

Publishes on Radiomics and Machine Learning in Medical Imaging, AI in cancer detection, and Lung Cancer Diagnosis and Treatment. 37 papers and 2k citations.

Top publications by citations

Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer
R. Vanguri, Jia Luo, Andrew Aukerman et al. | Nature Cancer | 2022
Cited by 355 | Open Access

Immunotherapy is used to treat almost all patients with advanced non-small cell lung cancer (NSCLC); however, identifying robust predictive biomarkers remains challenging. Here we show the predictive capacity of integrating medical imaging, histopathologic and genomic features to predict immunotherapy response using a cohort of 247 patients with advanced NSCLC with multimodal baseline data obtained during diagnostic clinical workup, including computed tomography scan images, digitized programmed death ligand-1 immunohistochemistry slides and known outcomes to immunotherapy. Using domain expert annotations, we developed a computational workflow to extract patient-level features and used a machine-learning approach to integrate multimodal features into a risk prediction model. Our multimodal model (area under the curve (AUC) = 0.80, 95% confidence interval (CI) 0.74-0.86) outperformed unimodal measures, including tumor mutational burden (AUC = 0.61, 95% CI 0.52-0.70) and programmed death ligand-1 immunohistochemistry score (AUC = 0.73, 95% CI 0.65-0.81). Our study therefore provides a quantitative rationale for using multimodal features to improve prediction of immunotherapy response in patients with NSCLC using expert-guided machine learning.

Multimodal data integration using machine learning improves risk stratification of high-grade serous ovarian cancer
Kevin Boehm, Emily A. Aherne, Lora H. Ellenson et al. | Nature Cancer | 2022
Cited by 335 | Open Access

Patients with high-grade serous ovarian cancer suffer poor prognosis and variable response to treatment. Known prognostic factors for this disease include homologous recombination deficiency status, age, pathological stage and residual disease status after debulking surgery. Recent work has highlighted important prognostic information captured in computed tomography and histopathological specimens, which can be exploited through machine learning. However, little is known about the capacity of combining features from these disparate sources to improve prediction of treatment response. Here, we assembled a multimodal dataset of 444 patients with primarily late-stage high-grade serous ovarian cancer and discovered quantitative features, such as tumor nuclear size on staining with hematoxylin and eosin and omental texture on contrast-enhanced computed tomography, associated with prognosis. We found that these features contributed complementary prognostic information relative to one another and clinicogenomic features. By fusing histopathological, radiologic and clinicogenomic machine-learning models, we demonstrate a promising path toward improved risk stratification of patients with cancer through multimodal data integration.

Ovarian cancer mutational processes drive site-specific immune evasion
Cited by 297 | Open Access

High-grade serous ovarian cancer (HGSOC) is an archetypal cancer of genomic instability patterned by distinct mutational processes, tumour heterogeneity and intraperitoneal spread. Immunotherapies have had limited efficacy in HGSOC, highlighting an unmet need to assess how mutational processes and the anatomical sites of tumour foci determine the immunological states of the tumour microenvironment. Here we carried out an integrative analysis of whole-genome sequencing, single-cell RNA sequencing, digital histopathology and multiplexed immunofluorescence of 160 tumour sites from 42 treatment-naive patients with HGSOC. Homologous recombination-deficient HRD-Dup (BRCA1 mutant-like) and HRD-Del (BRCA2 mutant-like) tumours harboured inflammatory signalling and ongoing immunoediting, reflected in loss of HLA diversity and tumour infiltration with highly differentiated dysfunctional CD8+ T cells. By contrast, foldback-inversion-bearing tumours exhibited elevated immunosuppressive TGFβ signalling and immune exclusion, with predominantly naive/stem-like and memory T cells. Phenotypic state associations were specific to anatomical sites, highlighting compositional, topological and functional differences between adnexal tumours and distal peritoneal foci. Our findings implicate anatomical sites and mutational processes as determinants of evolutionary phenotypic divergence and immune resistance mechanisms in HGSOC. Our study provides a multi-omic cellular phenotype data substrate from which to develop and interpret future personalized immunotherapeutic approaches and early detection research.

Automated real-world data integration improves cancer outcome prediction
Cited by 144 | Open Access

The digitization of health records and growing availability of tumour DNA sequencing provide an opportunity to study the determinants of cancer outcomes with unprecedented richness. Patient data are often stored in unstructured text and siloed datasets. Here we combine natural language processing annotations with structured medication, patient-reported demographic, tumour registry and tumour genomic data from 24,950 patients at Memorial Sloan Kettering Cancer Center to generate a clinicogenomic, harmonized oncologic real-world dataset (MSK-CHORD). MSK-CHORD includes data for non-small-cell lung (n = 7,809), breast (n = 5,368), colorectal (n = 5,543), prostate (n = 3,211) and pancreatic (n = 3,109) cancers and enables discovery of clinicogenomic relationships not apparent in smaller datasets. Leveraging MSK-CHORD to train machine learning models to predict overall survival, we find that models including features derived from natural language processing, such as sites of disease, outperform those based on genomic data or stage alone as tested by cross-validation and an external, multi-institution dataset. By annotating 705,241 radiology reports, MSK-CHORD also uncovers predictors of metastasis to specific organ sites, including a relationship between SETD2 mutation and lower metastatic potential in immunotherapy-treated lung adenocarcinoma corroborated in independent datasets. We demonstrate the feasibility of automated annotation from unstructured notes and its utility in predicting patient outcomes. The resulting data are provided as a public resource for real-world oncologic research. A study generates a clinicogenomics dataset resource, MSK-CHORD, that combines natural language processing-derived clinical annotations with patient medical data from various sources to improve models of cancer outcome.