Publishes on Statistical Methods and Bayesian Inference, Air Quality and Health Impacts, Statistical Methods and Inference. 574 papers and 75.9k citations.
This paper proposes an extension of generalized linear models to the analysis of longitudinal data. We introduce a class of estimating equations that give consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence. The estimating equations are derived without specifying the joint distribution of a subject's observations yet they reduce to the score equations for multivariate Gaussian outcomes. Asymptotic theory is presented for the general class of estimators. Specific cases in which we assume independence, m-dependence and exchangeable correlation structures from each subject are discussed. Efficiency of the proposed estimators in two simple situations is considered. The approach is closely related to quasi-likelihood.
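For reference, the estimating equations described in this abstract are commonly written as below. This is a sketch in the notation most textbooks use, not necessarily the paper's own: $K$ subjects, $y_i$ and $\mu_i(\beta)$ the outcome and mean vectors for subject $i$, $A_i$ the diagonal matrix of variance functions, $R(\alpha)$ a working correlation matrix, and $\phi$ a scale parameter (some presentations, including parametrizations where the variance is divided by $\phi$, place it differently).

```latex
% Generalized estimating equations for the regression parameters \beta
\sum_{i=1}^{K} D_i^{\top} V_i^{-1} \bigl\{ y_i - \mu_i(\beta) \bigr\} = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta^{\top}},
\qquad
V_i = \phi \, A_i^{1/2} R(\alpha) A_i^{1/2}.

% Robust ("sandwich") variance, consistent even if R(\alpha) is misspecified
\widehat{\operatorname{cov}}(\hat\beta)
  = \Bigl( \sum_i D_i^{\top} V_i^{-1} D_i \Bigr)^{-1}
    \Bigl( \sum_i D_i^{\top} V_i^{-1} (y_i - \hat\mu_i)(y_i - \hat\mu_i)^{\top} V_i^{-1} D_i \Bigr)
    \Bigl( \sum_i D_i^{\top} V_i^{-1} D_i \Bigr)^{-1}.
```

Setting $R(\alpha) = I$ gives the independence case mentioned in the abstract; a constant off-diagonal $\alpha$ gives the exchangeable case.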
Longitudinal data sets comprise repeated observations of an outcome and a set of covariates for each of many subjects. One objective of statistical analysis is to describe the marginal expectation of the outcome variable as a function of the covariates while accounting for the correlation among the repeated observations for a given subject. This paper proposes a unifying approach to such analysis for a variety of discrete and continuous outcomes. A class of generalized estimating equations (GEEs) for the regression parameters is proposed. The equations are extensions of those used in quasi-likelihood (Wedderburn, 1974, Biometrika 61, 439-447) methods. The GEEs have solutions that are consistent and asymptotically Gaussian even when the time dependence is misspecified, as we often expect. A consistent variance estimate is presented. We illustrate the use of the GEE approach with longitudinal data from a study of the effect of mothers' stress on children's morbidity.
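The GEE idea can be made concrete with a small NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: a binary outcome with a logit link, an exchangeable working correlation, the scale parameter fixed at 1, and a simplified moment estimator for the correlation (no degrees-of-freedom correction). The function names `fit_gee_logit_exchangeable` and `_cluster_terms` are hypothetical.

```python
import numpy as np


def _cluster_terms(beta, alpha, y, X, groups):
    """Yield (D_i' V_i^{-1} D_i, D_i' V_i^{-1} (y_i - mu_i)) for each subject."""
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    var = mu * (1.0 - mu)                      # Bernoulli variance function
    for idx in groups:
        n = len(idx)
        D = var[idx, None] * X[idx]            # d mu / d beta for the logit link
        R = (1.0 - alpha) * np.eye(n) + alpha * np.ones((n, n))  # exchangeable
        s = np.sqrt(var[idx])
        V = s[:, None] * R * s[None, :]        # working covariance A^1/2 R A^1/2
        Vinv_D = np.linalg.solve(V, D)
        yield D.T @ Vinv_D, Vinv_D.T @ (y[idx] - mu[idx])


def fit_gee_logit_exchangeable(y, X, ids, n_iter=25, tol=1e-8):
    """Return (beta, alpha, robust_cov) for a logit GEE (illustrative sketch)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    ids = np.asarray(ids)
    groups = [np.flatnonzero(ids == g) for g in np.unique(ids)]
    n_max = max(len(g) for g in groups)
    p = X.shape[1]
    beta, alpha = np.zeros(p), 0.0

    for _ in range(n_iter):
        # Moment estimate of alpha from within-subject products of Pearson residuals.
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        resid = (y - mu) / np.sqrt(mu * (1.0 - mu))
        num = sum((resid[g].sum() ** 2 - (resid[g] ** 2).sum()) / 2.0 for g in groups)
        den = sum(len(g) * (len(g) - 1) / 2.0 for g in groups)
        if den > 0:
            # Clip so the exchangeable matrix stays positive definite.
            alpha = float(np.clip(num / den, -1.0 / (n_max - 1) + 1e-3, 1.0 - 1e-3))

        # Fisher-scoring step on the estimating equations.
        H, U = np.zeros((p, p)), np.zeros(p)
        for h_i, u_i in _cluster_terms(beta, alpha, y, X, groups):
            H += h_i
            U += u_i
        step = np.linalg.solve(H, U)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break

    # Sandwich variance: bread^{-1} * meat * bread^{-1}.
    bread, meat = np.zeros((p, p)), np.zeros((p, p))
    for h_i, u_i in _cluster_terms(beta, alpha, y, X, groups):
        bread += h_i
        meat += np.outer(u_i, u_i)
    bread_inv = np.linalg.inv(bread)
    return beta, alpha, bread_inv @ meat @ bread_inv
```

When every subject contributes a single observation, the working correlation drops out and the equations reduce to the ordinary logistic-regression score equations, which is a convenient sanity check. For real analyses an established implementation is preferable to this sketch.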
This article discusses extensions of generalized linear models for the analysis of longitudinal data. Two approaches are considered: subject-specific (SS) models in which heterogeneity in regression parameters is explicitly modelled; and population-averaged (PA) models in which the aggregate response for the population is the focus. We use a generalized estimating equation approach to fit both classes of models for discrete and continuous outcomes. When the subject-specific parameters are assumed to follow a Gaussian distribution, simple relationships between the PA and SS parameters are available. The methods are illustrated with an analysis of data on mother's smoking and children's respiratory disease.
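One widely quoted instance of such a PA/SS relationship is for a logistic model with a Gaussian random intercept of standard deviation σ: the population-averaged coefficients are approximately the subject-specific ones multiplied by 1/sqrt(1 + c²σ²), with c = 16√3/(15π). The sketch below checks that approximation against numerical integration. It assumes a random-intercept-only model; the function names are hypothetical.

```python
import numpy as np

# Constant from the standard logistic-to-probit approximation.
C = 16.0 * np.sqrt(3.0) / (15.0 * np.pi)


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def attenuation_factor(sigma):
    """Approximate ratio beta_PA / beta_SS for a Gaussian random intercept."""
    return 1.0 / np.sqrt(1.0 + C**2 * sigma**2)


def marginal_prob(eta, sigma, n_nodes=40):
    """E[expit(eta + b)] with b ~ N(0, sigma^2), via Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma * nodes           # change of variables for N(0, sigma^2)
    return float(np.sum(weights * expit(eta + b)) / np.sqrt(np.pi))


# Compare the population-averaged probability with the attenuated approximation.
eta_ss, sigma = 1.0, 1.5
exact = marginal_prob(eta_ss, sigma)
approx = float(expit(eta_ss * attenuation_factor(sigma)))
print(f"exact marginal: {exact:.4f}  attenuated approximation: {approx:.4f}")
```

The larger the between-subject variance, the more the PA coefficient shrinks toward zero relative to the SS coefficient, which is why the two kinds of model answer different questions even when both are correctly specified.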
The first edition of Analysis of Longitudinal Data has become a classic. Describing the statistical models and methods for the analysis of longitudinal data, it covers both the underlying statistical theory of each method and its application to a range of examples from the agricultural and biomedical sciences. The main topics discussed are design issues, exploratory methods of analysis, linear models for continuous data, generalized linear models for discrete data, and models and methods for handling data and missing values. Under each heading, worked examples are presented in parallel with the methodological development, and sufficient detail is given to enable the reader to reproduce the authors' results using the data sets given in an appendix. This new edition of Analysis of Longitudinal Data provides a thorough and expanded revision of this important text. It includes two new chapters: the first discusses fully parametric models for discrete repeated measures data, and the second explores statistical models for time-dependent predictors.
CONTEXT: Evidence on the health risks associated with short-term exposure to fine particles (particulate matter ≤2.5 μm in aerodynamic diameter [PM2.5]) is limited. Results from the new national monitoring network for PM2.5 make possible systematic research on health risks at national and regional scales. OBJECTIVES: To estimate risks of cardiovascular and respiratory hospital admissions associated with short-term exposure to PM2.5 for Medicare enrollees and to explore heterogeneity of the variation of risks across regions. DESIGN, SETTING, AND PARTICIPANTS: A national database comprising daily time-series data for 1999 through 2002 on hospital admission rates (constructed from the Medicare National Claims History Files) for cardiovascular and respiratory outcomes and injuries, ambient PM2.5 levels, and temperature and dew-point temperature for 204 US urban counties (population >200,000) with 11.5 million Medicare enrollees (aged >65 years) living an average of 5.9 miles from a PM2.5 monitor. MAIN OUTCOME MEASURES: Daily counts of county-wide hospital admissions for primary diagnosis of cerebrovascular, peripheral, and ischemic heart diseases, heart rhythm, heart failure, chronic obstructive pulmonary disease, and respiratory infection, and injuries as a control outcome. RESULTS: There was a short-term increase in hospital admission rates associated with PM2.5 for all of the health outcomes except injuries. The largest association was for heart failure, which had a 1.28% (95% confidence interval, 0.78%-1.78%) increase in risk per 10-μg/m³ increase in same-day PM2.5. Cardiovascular risks tended to be higher in counties located in the Eastern region of the United States, which included the Northeast, the Southeast, the Midwest, and the South. CONCLUSION: Short-term exposure to PM2.5 increases the risk for hospital admission for cardiovascular and respiratory diseases.
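A percent increase per 10 μg/m³ can be rescaled to other exposure increments if one assumes the underlying model is log-linear in PM2.5 (for example a Poisson regression of daily counts). The abstract does not state the model form, so that assumption is made here purely for illustration; the helper names are hypothetical.

```python
import math


def pct_to_log_rr(pct, delta=10.0):
    """Log relative rate per 1 unit of exposure, from a % increase per `delta` units."""
    return math.log1p(pct / 100.0) / delta


def log_rr_to_pct(beta, delta=10.0):
    """Percent increase in rate for a `delta`-unit increase, given log relative rate `beta`."""
    return 100.0 * math.expm1(beta * delta)


# Heart failure estimate and confidence limits reported per 10 ug/m3.
reported = {"estimate": 1.28, "lower": 0.78, "upper": 1.78}

# Rescale each value to a 5 ug/m3 increment under the log-linear assumption.
per_5 = {k: log_rr_to_pct(pct_to_log_rr(v), delta=5.0) for k, v in reported.items()}
for name, value in per_5.items():
    print(f"{name}: {value:.3f}% per 5 ug/m3")
```

Because the effects are small, the rescaled values are close to simple proportional scaling, but the log-scale conversion is the one consistent with a log-linear model.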