I

Iain Marshall

King's College London

ORCID: 0000-0003-2594-2654

Publishes on Health, Environment, Cognitive Aging, Mental Health Research Topics, Topic Modeling. 439 papers and 17.1k citations.

439Publications
17.1kTotal Citations

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

Toward systematic review automation: a practical guide to using machine learning tools in research synthesis
Iain Marshall, Byron Wallace|Systematic Reviews|2019
Cited by 635Open Access

Technologies and methods to speed up the production of systematic reviews by reducing the manual labour involved have recently emerged. Automation has been proposed or used to expedite most steps of the systematic review process, including search, screening, and data extraction. However, how these technologies work in practice and when (and when not) to use them is often not clear to practitioners. In this practical guide, we provide an overview of current machine learning methods that have been proposed to expedite evidence synthesis. We also offer guidance on which of these are ready for use, their strengths and weaknesses, and how a systematic review team might go about using them in practice.

Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide
Iain Marshall, Anna H Noel-Storr, Joël Kuiper et al.|Research Synthesis Methods|2018
Cited by 446Open Access

Machine learning (ML) algorithms have proven highly accurate for identifying Randomized Controlled Trials (RCTs) but are not used much in practice, in part because the best way to make use of the technology in a typical workflow is unclear. In this work, we evaluate ML models for RCT classification (support vector machines, convolutional neural networks, and ensemble approaches). We trained and optimized support vector machine and convolutional neural network models on the titles and abstracts of the Cochrane Crowd RCT set. We evaluated the models on an external dataset (Clinical Hedges), allowing direct comparison with traditional database search filters. We estimated area under receiver operating characteristics (AUROC) using the Clinical Hedges dataset. We demonstrate that ML approaches better discriminate between RCTs and non-RCTs than widely used traditional database search filters at all sensitivity levels; our best-performing model also achieved the best results to date for ML in this task (AUROC 0.987, 95% CI, 0.984-0.989). We provide practical guidance on the role of ML in (1) systematic reviews (high-sensitivity strategies) and (2) rapid reviews and clinical question answering (high-precision strategies) together with recommended probability cutoffs for each use case. Finally, we provide open-source software to enable these approaches to be used in practice.