J. S. Marron

University of North Carolina at Chapel Hill

ORCID: 0000-0003-2000-1476

Publishes on Statistical Methods and Inference, Advanced Statistical Methods and Models, Bayesian Methods and Mixture Models. 260 papers and 18.9k citations.

Top publications by citations

Functional Data Analysis
J. S. Marron, J. O. Ramsay, Bernard W. Silverman|Journal of the American Statistical Association|1998
Cited by 1.4k

Scientists today collect samples of curves and other functional observations. This monograph presents many ideas and techniques for such data. Included are expressions in the functional domain of such classics as linear regression, principal components analysis, linear modelling, and canonical correlation analysis, as well as specifically functional techniques such as curve registration and principal differential analysis. Data arising in real applications are used throughout for both motivation and illustration, showing how functional approaches allow us to see new things, especially by exploiting the smoothness of the processes generating the data. The data sets exemplify the wide scope of functional data analysis; they are drawn from growth analysis, meteorology, biomechanics, equine science, economics, and medicine. The book presents novel statistical technology while keeping the mathematical level widely accessible. It is designed to appeal to students, to applied data analysts, and to experienced researchers; it will have value both within statistics and across a broad spectrum of other fields. Much of the material is based on the authors' own work, some of which appears here for the first time. Jim Ramsay is Professor of Psychology at McGill University and is an international authority on many aspects of multivariate analysis. He draws on his collaboration with researchers in speech articulation, motor control, meteorology, psychology, and human physiology to illustrate his technical contributions to functional data analysis in a wide range of statistical and application journals. Bernard Silverman, author of the highly regarded Density Estimation for Statistics and Data Analysis, and coauthor of Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach, is Professor of Statistics at Bristol University. His published work on smoothing methods and other aspects of applied, computational, and theoretical statistics has been recognized by the Presidents' Award of the Committee of Presidents of Statistical Societies, and the award of two Guy Medals by the Royal Statistical Society.

A Brief Survey of Bandwidth Selection for Density Estimation
M. C. Jones, J. S. Marron, Simon J. Sheather|Journal of the American Statistical Association|1996
Cited by 1.2k

There has been major progress in recent years in data-based bandwidth selection for kernel density estimation. Some “second generation” methods, including plug-in and smoothed bootstrap techniques, have been developed that are far superior to well-known “first generation” methods, such as rules of thumb, least squares cross-validation, and biased cross-validation. We recommend a “solve-the-equation” plug-in bandwidth selector as being most reliable in terms of overall performance. This article is intended to provide easy accessibility to the main ideas for nonexperts.
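
For readers unfamiliar with what a bandwidth selector does, here is a minimal sketch of one of the "first generation" rules of thumb mentioned above, paired with a plain Gaussian kernel density estimate. It is not the "solve-the-equation" plug-in selector the abstract recommends. The function names and the constants 0.9 and 1.34 (the commonly quoted normal-reference rule) are assumptions for illustration, not taken from the article.

```python
import numpy as np

def rule_of_thumb_bandwidth(x):
    """Normal-reference rule of thumb: 0.9 * min(sd, IQR/1.34) * n^(-1/5).

    Illustrative only; assumes the sample has a non-zero spread.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    sd = x.std(ddof=1)                      # sample standard deviation
    q75, q25 = np.percentile(x, [75, 25])   # interquartile range endpoints
    scale = min(sd, (q75 - q25) / 1.34)     # robust scale estimate
    return 0.9 * scale * n ** (-0.2)

def gaussian_kde(x, grid, h):
    """Kernel density estimate on `grid` with a Gaussian kernel and bandwidth h."""
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - x[None, :]) / h    # standardized distances, shape (grid, n)
    return np.exp(-0.5 * z**2).sum(axis=1) / (x.size * h * np.sqrt(2 * np.pi))

# Usage: estimate a density from simulated data.
sample = np.random.default_rng(0).normal(size=200)
h_rot = rule_of_thumb_bandwidth(sample)
grid_points = np.linspace(-10.0, 10.0, 2001)
density = gaussian_kde(sample, grid_points, h_rot)
```

The bandwidth `h_rot` controls how much each observation is smeared out; the selectors surveyed in the article differ in how they choose this single number from the data.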

Exact Mean Integrated Squared Error
J. S. Marron, M. P. Wand|The Annals of Statistics|1992
Cited by 763|Open Access

An exact and easily computable expression for the mean integrated squared error (MISE) of the kernel estimator of a general normal mixture density is given for Gaussian kernels of arbitrary order. This provides a powerful new way of understanding density estimation which complements the usual tools of simulation and asymptotic analysis. The family of normal mixture densities is very flexible, and the formulae derived allow simple exact analysis for a wide variety of density shapes. A number of applications of this method giving important new insights into kernel density estimation are presented. Among these are the discovery that the usual asymptotic approximations to the MISE can be quite inaccurate, especially when the underlying density contains substantial fine structure, and strong evidence that the practical importance of higher order kernels is surprisingly small for moderate sample sizes.
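
To make "exact and easily computable" concrete, the sketch below evaluates the MISE for the simplest case only: the ordinary second-order Gaussian kernel and a target density that is a mixture of normals with weights w, means mu, and standard deviations sigma. It assumes the commonly quoted closed form MISE(h) = 1/(2·sqrt(pi)·n·h) + w'[(1 - 1/n)·Omega_2 - 2·Omega_1 + Omega_0]w, where Omega_a has entries given by a normal density with variance a·h² + sigma_i² + sigma_j² evaluated at mu_i - mu_j. The function names are hypothetical, and the formula should be checked against the paper before being relied on.

```python
import numpy as np

def normal_pdf(x, sd):
    """Density of N(0, sd^2) evaluated at x (arrays broadcast elementwise)."""
    return np.exp(-0.5 * (x / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def exact_mise(h, n, w, mu, sigma):
    """Exact MISE of a Gaussian-kernel estimate of a normal mixture density.

    Assumes the second-order Gaussian kernel; higher-order kernels are not covered.
    """
    w, mu, sigma = (np.asarray(a, dtype=float) for a in (w, mu, sigma))
    diff = mu[:, None] - mu[None, :]                   # pairwise mean differences
    var = sigma[:, None] ** 2 + sigma[None, :] ** 2    # pairwise summed variances

    def omega(a):
        # Matrix with entries N(mu_i - mu_j; 0, a*h^2 + sigma_i^2 + sigma_j^2)
        return normal_pdf(diff, np.sqrt(a * h**2 + var))

    quad = (1 - 1 / n) * omega(2) - 2 * omega(1) + omega(0)
    return 1 / (2 * np.sqrt(np.pi) * n * h) + w @ quad @ w

# Usage: find the MISE-minimizing bandwidth on a grid for a bimodal mixture.
bandwidths = np.linspace(0.05, 1.5, 300)
mise_values = np.array(
    [exact_mise(h, 100, [0.5, 0.5], [-1.5, 1.5], [0.5, 0.5]) for h in bandwidths]
)
h_best = bandwidths[np.argmin(mise_values)]
```

Because the expression involves no simulation or asymptotic expansion, the curve `mise_values` can be compared directly with an asymptotic approximation to see where the two disagree.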

Predicting fault incidence using software change history
Todd Graves, Alan F. Karr, J. S. Marron et al.|IEEE Transactions on Software Engineering|2000
Cited by 751

This paper is an attempt to understand the processes by which software ages. We define code to be aged or decayed if its structure makes it unnecessarily difficult to understand or change, and we measure the extent of decay by counting the number of faults in code in a period of time. Using change management data from a very large, long-lived software system, we explore the extent to which measurements from the change history are successful in predicting the distribution over modules of these incidences of faults. In general, process measures based on the change history are more useful in predicting fault rates than product metrics of the code: for instance, the number of times code has been changed is a better indication of how many faults it will contain than is its length. We also compare the fault rates of code of various ages, finding that if a module is, on average, a year older than an otherwise similar module, the older module will have roughly a third fewer faults. Our most successful model measures the fault potential of a module as the sum of contributions from all of the times the module has been changed, with large, recent changes receiving the most weight.
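
The final sentence of the abstract describes a score that sums one contribution per past change, weighted toward large and recent changes. A minimal sketch of that idea follows. The abstract does not give the functional form, so the weighting used here (contribution proportional to lines changed, shrinking geometrically with age), the `Change` record, and the `decay_per_year` parameter are all assumptions for illustration rather than the paper's model.

```python
from dataclasses import dataclass

@dataclass
class Change:
    age_years: float     # time elapsed since the change was made (hypothetical field)
    lines_changed: int   # size of the change (hypothetical field)

def fault_potential(changes, decay_per_year=0.5):
    """Sum of per-change contributions; larger and more recent changes count more.

    The size-times-geometric-decay weighting is an illustrative assumption.
    """
    return sum(c.lines_changed * decay_per_year ** c.age_years for c in changes)

# Usage: a module with one large recent change versus one with many old small changes.
recently_reworked = [Change(age_years=0.0, lines_changed=400)]
long_stable = [Change(age_years=5.0, lines_changed=50) for _ in range(10)]
score_recent = fault_potential(recently_reworked)
score_stable = fault_potential(long_stable)
```

Under this weighting the recently reworked module scores higher than the long-stable one, even though the stable module has more changes on record.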

Does code decay? Assessing the evidence from change management data
S.G. Eick, Todd Graves, Alan F. Karr et al.|IEEE Transactions on Software Engineering|2001
Cited by 641

A central feature of the evolution of large software systems is that change, which is necessary to add new functionality, accommodate new hardware, and repair faults, becomes increasingly difficult over time. We approach this phenomenon, which we term code decay, scientifically and statistically. We define code decay and propose a number of measurements (code decay indices) on software and on the organizations that produce it, that serve as symptoms, risk factors, and predictors of decay. Using an unusually rich data set (the fifteen-plus year change history of the millions of lines of software for a telephone switching system), we find mixed, but on the whole persuasive, statistical evidence of code decay, which is corroborated by developers of the code. Suggestive indications that perfective maintenance can retard code decay are also discussed.