
Annelaura Bach Nielsen

University of Copenhagen

ORCID: 0009-0005-2855-208X

Publishes on Endometrial and Cervical Cancer Treatments, Advanced Proteomics Techniques and Applications, Cancer Genomics and Diagnostics. 39 papers and 1k citations.



Top publications by citations

Dynamic and explainable machine learning prediction of mortality in patients in the intensive care unit: a retrospective study of high-frequency data in electronic patient records
Cited by 327|Open Access

BACKGROUND: Many mortality prediction models have been developed for patients in intensive care units (ICUs); most are based on data available at ICU admission. We investigated whether machine learning methods using analyses of time-series data improved mortality prognostication for patients in the ICU by providing real-time predictions of 90-day mortality. In addition, we examined to what extent such a dynamic model could be made interpretable by quantifying and visualising the features that drive the predictions at different timepoints.
METHODS: Based on the Simplified Acute Physiology Score (SAPS) III variables, we trained a machine learning model on longitudinal data from patients admitted to four ICUs in the Capital Region, Denmark, between 2011 and 2016. We included all patients older than 16 years of age, with an ICU stay lasting more than 1 h, and who had a Danish civil registration number to enable 90-day follow-up. We leveraged static data and physiological time-series data from electronic health records and the Danish National Patient Registry. A recurrent neural network was trained with a temporal resolution of 1 h. The model was internally validated using the holdout method with 20% of the training dataset and externally validated using previously unseen data from a fifth hospital in Denmark. Its performance was assessed with the Matthews correlation coefficient (MCC) and area under the receiver operating characteristic curve (AUROC) as metrics, using bootstrapping with 1000 samples with replacement to construct 95% CIs. A Shapley additive explanations algorithm was applied to the prediction model to obtain explanations of the features that drive patient-specific predictions, and the contributions of each of the 44 features in the model were analysed and compared with the variables in the original SAPS III model.
FINDINGS: From a dataset containing 15 615 ICU admissions of 12 616 patients, we included 14 190 admissions of 11 492 patients in our analysis. Overall, 90-day mortality was 33·1% (3802 patients). The deep learning model showed a predictive performance on the holdout testing dataset that improved over the timecourse of an ICU stay: MCC 0·29 (95% CI 0·25-0·33) and AUROC 0·73 (0·71-0·74) at admission, 0·43 (0·40-0·47) and 0·82 (0·80-0·84) after 24 h, 0·50 (0·46-0·53) and 0·85 (0·84-0·87) after 72 h, and 0·57 (0·54-0·60) and 0·88 (0·87-0·89) at the time of discharge. The model exhibited good calibration properties. These results were validated in an external validation cohort of 5827 patients with 6748 admissions: MCC 0·29 (95% CI 0·27-0·32) and AUROC 0·75 (0·73-0·76) at admission, 0·41 (0·39-0·44) and 0·80 (0·79-0·81) after 24 h, 0·46 (0·43-0·48) and 0·82 (0·81-0·83) after 72 h, and 0·47 (0·44-0·49) and 0·83 (0·82-0·84) at the time of discharge.
INTERPRETATION: The prediction of 90-day mortality improved with 1-h sampling intervals during the ICU stay. The dynamic risk prediction can also be explained for an individual patient, visualising the features contributing to the prediction at any point in time. This explanation allows the clinician to determine whether there are elements in the current patient state and care that are potentially actionable, thus making the model suitable for further validation as a clinical tool.
FUNDING: Novo Nordisk Foundation and the Innovation Fund Denmark.
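
The abstract above reports MCC with 95% CIs built from 1000 bootstrap samples drawn with replacement. As a minimal sketch of that idea (not the authors' code), the Python below computes MCC for binary labels and a percentile bootstrap interval. The function names, the percentile method, and the convention of returning 0 when the MCC denominator is zero are assumptions made for illustration.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when a row or column of the
    # confusion matrix is empty and the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def bootstrap_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for MCC (resampling with replacement)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(mcc([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lo_idx = int(alpha / 2 * n_boot)
    hi_idx = n_boot - 1 - lo_idx  # mirror the lower cut-off
    return stats[lo_idx], stats[hi_idx]
```

Calling `bootstrap_ci(labels, predictions)` on thresholded model outputs returns a `(lower, upper)` pair; a fixed seed keeps the interval reproducible between runs.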

A knowledge graph to interpret clinical proteomics data
Alberto Santos, Ana R. Colaço, Annelaura Bach Nielsen et al.|Nature Biotechnology|2022
Cited by 286|Open Access

Implementing precision medicine hinges on the integration of omics data, such as proteomics, into the clinical decision-making process, but the quantity and diversity of biomedical data, and the spread of clinically relevant knowledge across multiple biomedical databases and publications, pose a challenge to data integration. Here we present the Clinical Knowledge Graph (CKG), an open-source platform currently comprising close to 20 million nodes and 220 million relationships that represent relevant experimental data, public databases and literature. The graph structure provides a flexible data model that is easily extendable to new nodes and relationships as new databases become available. The CKG incorporates statistical and machine learning algorithms that accelerate the analysis and interpretation of typical proteomics workflows. Using a set of proof-of-concept biomarker studies, we show how the CKG might augment and enrich proteomics data and help inform clinical decision-making.
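
To illustrate why a graph structure is easy to extend with new nodes and relationships, here is a minimal in-memory sketch in Python. It is not the CKG schema or API: the class `TinyGraph`, its method names, and any labels or relationship types passed to it are hypothetical.

```python
from collections import defaultdict


class TinyGraph:
    """Hypothetical labelled graph: typed nodes joined by typed relationships."""

    def __init__(self):
        self.nodes = {}                 # node_id -> (label, properties)
        self.edges = defaultdict(list)  # source_id -> [(rel_type, target_id)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = (label, props)

    def add_relationship(self, source, rel_type, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges[source].append((rel_type, target))

    def neighbors(self, source, rel_type=None):
        """Targets reachable from source, optionally filtered by relationship type."""
        return [t for r, t in self.edges.get(source, []) if rel_type in (None, r)]
```

Adding a new data source in this model means adding nodes with a new label and relationships with a new type; existing nodes and queries are left unchanged.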

Survival prediction in intensive-care units based on aggregation of long-term disease history and acute physiology: a retrospective study of the Danish National Patient Registry and electronic patient records
Cited by 114|Open Access

BACKGROUND: Intensive-care units (ICUs) treat the most critically ill patients, which is complicated by the heterogeneity of the diseases that they encounter. Severity scores based mainly on acute physiology measures collected at ICU admission are used to predict mortality, but are non-specific, and predictions for individual patients can be inaccurate. We investigated whether inclusion of long-term disease history before ICU admission improves mortality predictions.
METHODS: Registry data for long-term disease histories for more than 230 000 Danish ICU patients were used in a neural network to develop an ICU mortality prediction model. Long-term disease histories and acute physiology measures were aggregated to predict mortality risk for patients for whom both registry and ICU electronic patient record data were available. We compared mortality predictions with admission scores on the Simplified Acute Physiology Score (SAPS) II, the Acute Physiologic Assessment and Chronic Health Evaluation (APACHE) II, and the best available multimorbidity score, the Multimorbidity Index. An external validation set from an additional hospital was acquired after model construction to confirm the validity of our model. During initial model development data were split into a training set (85%) and an independent test set (15%), and a five-fold cross-validation was done during training to avoid overfitting. Neural networks were trained for datasets with disease history of 1 month, 3 months, 6 months, 1 year, 2·5 years, 5 years, 7·5 years, 10 years, and 23 years before ICU admission.
FINDINGS: Mortality predictions with a model based solely on disease history outperformed the Multimorbidity Index (Matthews correlation coefficient 0·265 vs 0·065), and performed similarly to SAPS II and APACHE II (Matthews correlation coefficient with disease history, age, and sex 0·326 vs 0·347 and 0·300 for SAPS II and APACHE II, respectively). Diagnoses up to 10 years before ICU admission affected current mortality prediction. Aggregation of previous disease history and acute physiology measures in a neural network yielded the most precise predictions of in-hospital mortality (Matthews correlation coefficient 0·391 for in-hospital mortality compared with 0·347 with SAPS II and 0·300 with APACHE II). These results for the aggregated model were validated in an external independent dataset of 1528 patients (Matthews correlation coefficient for prediction of in-hospital mortality 0·341).
INTERPRETATION: Longitudinal disease-spectrum-wide data available before ICU admission are useful for mortality prediction. Disease history can be used to differentiate mortality risk between patients with similar vital signs with more precision than SAPS II and APACHE II scores. Machine learning models can be deconvoluted to generate novel understandings of how ICU patient features from long-term and short-term events interact with each other. Explainable machine learning models are key in clinical settings, and our results emphasise how to progress towards the transformation of advanced models into actionable, transparent, and trustworthy clinical tools.
FUNDING: Novo Nordisk Foundation and Innovation Fund Denmark.
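
The abstract above describes an 85%/15% train/test split with five-fold cross-validation during training. The Python below is a rough sketch of how such index sets could be built; the function name, the shuffling, and the round-robin fold assignment are assumptions, not details taken from the study.

```python
import random


def split_and_folds(n_samples, test_fraction=0.15, n_folds=5, seed=0):
    """Return (train, test, cv): cv is a list of (fit, validation) index lists."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)

    # Hold out the independent test set first; it is never used in the folds.
    n_test = round(n_samples * test_fraction)
    test, train = indices[:n_test], indices[n_test:]

    # Deal the training indices into n_folds groups, round-robin.
    folds = [train[i::n_folds] for i in range(n_folds)]
    cv = []
    for k in range(n_folds):
        validation = folds[k]
        fit = [i for j, fold in enumerate(folds) if j != k for i in fold]
        cv.append((fit, validation))
    return train, test, cv
```

Each training index appears in exactly one validation fold, so every training sample is used once for validation and in the remaining folds for fitting.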

Diagnosis trajectories of prior multi-morbidity predict sepsis mortality
Cited by 90|Open Access

Sepsis affects millions of people every year, many of whom will die. In contrast to current survival prediction models for sepsis patients, which are primarily based on within-admission clinical measurements (e.g. vital parameters and blood values), we aim to use the full disease history to predict sepsis mortality. We benefit from data in electronic medical records covering all hospital encounters in Denmark from 1996 to 2014. This data set included 6.6 million patients, of whom almost 120,000 were diagnosed with the ICD-10 code A41 'Other sepsis'. Interestingly, patients following recurrent trajectories of time-ordered co-morbidities had significantly increased sepsis mortality compared to those who did not follow a trajectory. We identified trajectories which significantly altered sepsis mortality, and found three major starting points in a combined temporal sepsis network: alcohol abuse, diabetes and cardiovascular diagnoses. Many cancers also increased sepsis mortality. Using the trajectory-based stratification model, we explain contradictory reports in relation to diabetes that have recently appeared in the literature. Finally, we compared the predictive power of 18.5 years of disease history with scoring based on within-admission clinical measurements, emphasizing the value of long-term data in novel patient scores that combine the two types of data.
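
One simple way to read "following a trajectory of time-ordered co-morbidities" is that the trajectory's diagnosis codes appear in the patient's history in the same chronological order. The Python sketch below implements only that reading; the matching rule, the function name, and the example codes other than A41 are assumptions and are not taken from the study.

```python
def follows_trajectory(history, trajectory):
    """Check whether the trajectory's codes occur in the history in time order.

    history: list of (iso_date, diagnosis_code) tuples, in any order.
    trajectory: list of diagnosis codes in the required order.
    """
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    ordered_codes = [code for _, code in sorted(history)]
    remaining = iter(ordered_codes)
    # Each step must be found after the position where the previous step matched.
    return all(step in remaining for step in trajectory)
```

For example, a history with E11 recorded in 2005, I50 in 2010 and A41 in 2012 follows the illustrative trajectory `["E11", "I50", "A41"]` but not `["I50", "E11"]`.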

Clinical Knowledge Graph Integrates Proteomics Data into Clinical Decision-Making
Alberto Santos, Ana R. Colaço, Annelaura Bach Nielsen et al.|bioRxiv (Cold Spring Harbor Laboratory)|2020
Cited by 61|Open Access

The promise of precision medicine is to deliver personalized treatment based on the unique physiology of each patient. This concept was fueled by the genomic revolution, but it is now evident that integrating other types of omics data, like proteomics, into the clinical decision-making process will be essential to accomplish precision medicine goals. However, the quantity and diversity of biomedical data, and the spread of clinically relevant knowledge across myriad biomedical databases and publications, make this exceptionally difficult. To address this, we developed the Clinical Knowledge Graph (CKG), an open-source platform currently comprising more than 16 million nodes and 220 million relationships to represent relevant experimental data, public databases and the literature. The CKG also incorporates the latest statistical and machine learning algorithms, drastically accelerating analysis and interpretation of typical proteomics workflows. We use several biomarker studies to illustrate how the CKG may support, enrich and accelerate clinical decision-making.