
Mark S. Gilthorpe

University of London

ORCID: 0000-0001-8783-7695

Publishes on Birth, Development, and Health; Advanced Causal Inference Techniques; and Health, Medicine and Society. 293 papers and 9.5k citations.



Top publications by citations

Robust causal inference using directed acyclic graphs: the R package ‘dagitty’
Johannes Textor, Benito van der Zander, Mark S. Gilthorpe et al. | International Journal of Epidemiology | 2016
Cited by 2.3k | Open Access

Directed acyclic graphs (DAGs), which offer systematic representations of causal relationships, have become an established framework for the analysis of causal inference in epidemiology, often being used to determine covariate adjustment sets for minimizing confounding bias. DAGitty is a popular web application for drawing and analysing DAGs. Here we introduce the R package 'dagitty', which provides access to all of the capabilities of the DAGitty web application within the R platform for statistical computing, and also offers several new functions. We describe how the R package 'dagitty' can be used to: evaluate whether a DAG is consistent with the dataset it is intended to represent; enumerate 'statistically equivalent' but causally different DAGs; and identify exposure-outcome adjustment sets that are valid for causally different but statistically equivalent DAGs. This functionality enables epidemiologists to detect causal misspecifications in DAGs and make robust inferences that remain valid for a range of different DAGs. The R package 'dagitty' is available through the Comprehensive R Archive Network (CRAN) at [https://cran.r-project.org/web/packages/dagitty/]. The source code is available on GitHub at [https://github.com/jtextor/dagitty]. The web application 'DAGitty' is free software, licensed under the GNU General Public License (GPL) version 2 and is available at [http://dagitty.net/].

Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: review and recommendations
Peter W. G. Tennant, Eleanor J. Murray, Kellyn F Arnold et al. | International Journal of Epidemiology | 2020
Cited by 975 | Open Access

BACKGROUND: Directed acyclic graphs (DAGs) are an increasingly popular approach for identifying confounding variables that require conditioning when estimating causal effects. This review examined the use of DAGs in applied health research to inform recommendations for improving their transparency and utility in future research. METHODS: Original health research articles published during 1999-2017 mentioning 'directed acyclic graphs' (or similar) or citing DAGitty were identified from Scopus, Web of Science, Medline and Embase. Data were extracted on the reporting of: estimands, DAGs and adjustment sets, alongside the characteristics of each article's largest DAG. RESULTS: A total of 234 articles were identified that reported using DAGs. A fifth (n = 48, 21%) reported their target estimand(s) and half (n = 115, 48%) reported the adjustment set(s) implied by their DAG(s). Two-thirds of the articles (n = 144, 62%) made at least one DAG available. DAGs varied in size but averaged 12 nodes [interquartile range (IQR): 9-16, range: 3-28] and 29 arcs (IQR: 19-42, range: 3-99). The median saturation (i.e. percentage of total possible arcs) was 46% (IQR: 31-67, range: 12-100). 37% (n = 53) of the DAGs included unobserved variables, 17% (n = 25) included 'super-nodes' (i.e. nodes containing more than one variable) and 34% (n = 49) were visually arranged so that the constituent arcs flowed in the same direction (e.g. top-to-bottom). CONCLUSION: There is substantial variation in the use and reporting of DAGs in applied health research. Although this partly reflects their flexibility, it also highlights some potential areas for improvement. This review hence offers several recommendations to improve the reporting and use of DAGs in future research.
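The saturation measure mentioned in the results above can be illustrated with a short sketch. It assumes that "total possible arcs" means the maximum number of arcs a DAG on n nodes can hold, n(n-1)/2; the function names are hypothetical and not taken from the review. Plugging in the reported average size (12 nodes, 29 arcs) gives roughly 44%, which need not match the reported median of 46%, since that median is taken over per-DAG values.

```python
def max_arcs(n_nodes: int) -> int:
    """Maximum number of arcs in a DAG with n_nodes nodes: n(n-1)/2 (assumed denominator)."""
    if n_nodes < 0:
        raise ValueError("n_nodes must be non-negative")
    return n_nodes * (n_nodes - 1) // 2


def saturation(n_nodes: int, n_arcs: int) -> float:
    """Percentage of the possible arcs that are actually drawn."""
    possible = max_arcs(n_nodes)
    if possible == 0:
        raise ValueError("saturation is undefined for fewer than two nodes")
    if not 0 <= n_arcs <= possible:
        raise ValueError("n_arcs must lie between 0 and the maximum for a DAG")
    return 100.0 * n_arcs / possible


if __name__ == "__main__":
    # Average-sized DAG from the abstract: 12 nodes and 29 arcs.
    print(f"{saturation(12, 29):.1f}% of {max_arcs(12)} possible arcs")
```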

Simpson's Paradox, Lord's Paradox, and Suppression Effects are the same phenomenon – the reversal paradox
Yu‐Kang Tu, David Gunnell, Mark S. Gilthorpe | Emerging Themes in Epidemiology | 2008
Cited by 250 | Open Access

This article discusses three statistical paradoxes that pervade epidemiological research: Simpson's paradox, Lord's paradox, and suppression. These paradoxes have important implications for the interpretation of evidence from observational studies. This article uses hypothetical scenarios to illustrate how the three paradoxes are different manifestations of one phenomenon, the reversal paradox, depending on whether the outcome and explanatory variables are categorical, continuous or a combination of both; the issues and remedies for any one of them are therefore similar for all three. Although the three statistical paradoxes occur in different types of variables, they share the same characteristic: the association between two variables can be reversed, diminished, or enhanced when another variable is statistically controlled for. Understanding the concepts and theory behind these paradoxes provides insights into some controversial or contradictory research findings. These paradoxes show that prior knowledge and underlying causal theory play an important role in the statistical modelling of epidemiological data, where incorrect use of statistical models might produce consistent, replicable, yet erroneous results.
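A minimal sketch of the categorical case (Simpson's paradox) may help make the reversal concrete. The counts below are hypothetical and chosen only for illustration; they do not come from the article. Within each severity stratum the treated group recovers more often, yet the pooled comparison points the other way because treatment is concentrated among severe cases.

```python
# Hypothetical (recovered, total) counts per severity stratum and treatment arm.
counts = {
    "mild": {"treated": (18, 20), "untreated": (64, 80)},
    "severe": {"treated": (24, 80), "untreated": (4, 20)},
}


def rate(recovered: int, total: int) -> float:
    """Recovery proportion for one group."""
    return recovered / total


def stratum_differences(data: dict) -> dict:
    """Treated-minus-untreated recovery rate within each stratum."""
    return {
        stratum: rate(*arms["treated"]) - rate(*arms["untreated"])
        for stratum, arms in data.items()
    }


def pooled_difference(data: dict) -> float:
    """Treated-minus-untreated recovery rate after ignoring the stratum."""
    pooled = {}
    for arm in ("treated", "untreated"):
        recovered = sum(arms[arm][0] for arms in data.values())
        total = sum(arms[arm][1] for arms in data.values())
        pooled[arm] = rate(recovered, total)
    return pooled["treated"] - pooled["untreated"]


if __name__ == "__main__":
    print("within strata:", stratum_differences(counts))
    print("pooled:", pooled_difference(counts))
```

Which of the two answers is appropriate depends on the causal role of the stratifying variable, which the numbers alone cannot settle.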

Time to reality check the promises of machine learning-powered precision medicine
Jack Wilkinson, Kellyn F Arnold, Eleanor J. Murray et al. | The Lancet Digital Health | 2020
Cited by 238 | Open Access

Machine learning methods, combined with large electronic health databases, could enable a personalised approach to medicine through improved diagnosis and prediction of individual responses to therapies. If successful, this strategy would represent a revolution in clinical research and practice. However, although the vision of individually tailored medicine is alluring, there is a need to distinguish genuine potential from hype. We argue that the goal of personalised medical care faces serious challenges, many of which cannot be addressed through algorithmic complexity, and call for collaboration between traditional methodologists and experts in medical machine learning to avoid extensive research waste.

Revisiting the relation between change and initial value: a review and evaluation
Yu‐Kang Tu, Mark S. Gilthorpe | Statistics in Medicine | 2006
Cited by 217

The relation between initial disease status and subsequent change following treatment has attracted great interest in clinical research. However, statisticians have repeatedly warned against correlating/regressing change with baseline due to two methodological concerns known as mathematical coupling and regression to the mean. Oldham's method and Blomqvist's formula are the two most often adopted methods to rectify these problems. The aims of this article are to review briefly the proposed solutions in the statistical and psychological literature, and to clarify the popular misconception that Blomqvist's formula is superior to Oldham's method. We argue that this misconception is due to a failure to recognize that the heterogeneity of individual responses to treatment is a source of regression to the mean in the analysis of the relation between change and initial value. Furthermore, we demonstrate how each method actually answers different research questions, and how confusion arises when this is not always understood.
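The coupling problem described above can be sketched with simulated data. The simulation assumes a stable true value per person, independent measurement error of equal variance at both visits, and no real change; all names and parameters are hypothetical. Oldham's method is taken here in its commonly described form, correlating change with the average of the two measurements rather than with baseline. The sketch does not cover Blomqvist's formula or heterogeneous treatment responses.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5000, seed=0):
    """Baseline and follow-up measurements with no true change (assumed setup)."""
    rng = random.Random(seed)
    baseline, followup = [], []
    for _ in range(n):
        true_value = rng.gauss(0.0, 1.0)
        # Same true value at both visits; only the measurement error differs.
        baseline.append(true_value + rng.gauss(0.0, 1.0))
        followup.append(true_value + rng.gauss(0.0, 1.0))
    return baseline, followup


def change_vs_baseline(baseline, followup):
    """Naive analysis: correlate change with the baseline value."""
    change = [after - before for before, after in zip(baseline, followup)]
    return pearson(change, baseline)


def oldham(baseline, followup):
    """Oldham-style analysis: correlate change with the mean of both visits."""
    change = [after - before for before, after in zip(baseline, followup)]
    average = [(before + after) / 2 for before, after in zip(baseline, followup)]
    return pearson(change, average)


if __name__ == "__main__":
    before, after = simulate()
    print("change vs baseline:", round(change_vs_baseline(before, after), 3))
    print("change vs average: ", round(oldham(before, after), 3))
```

Under these assumptions the naive correlation is clearly negative even though nothing changed, while the correlation with the average stays near zero.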