A Kth Nearest Neighbour Clustering Procedure

M. Anthony Wong (Massachusetts Institute of Technology), Tom Lane (Massachusetts Institute of Technology)
Journal of the Royal Statistical Society Series B (Statistical Methodology)
July 1, 1983
Cited by 138

Abstract

Due to the lack of development in the probabilistic and statistical aspects of clustering research, clustering procedures are often regarded as heuristics generating artificial clusters from a given set of sample data. In this paper, a clustering procedure that is useful for drawing statistical inference about the underlying population from a random sample is developed. It is based on the uniformly consistent kth nearest neighbour density estimate, and is applicable to both case-by-variable data matrices and case-by-case dissimilarity matrices. The proposed clustering procedure is shown to be asymptotically consistent for high-density clusters in several dimensions, and its small-sample behaviour is illustrated by an empirical example.
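
For orientation, the kth nearest neighbour density estimate mentioned above is commonly written as f(x) = k / (n V_k(x)), where V_k(x) is the volume of the smallest ball centred at x that reaches its kth nearest sample point. The sketch below evaluates that standard form at each sample point from a matrix of pairwise distances. It is only an illustration, not the paper's clustering procedure: the function name `knn_density`, the `dim` argument, and the use of a Euclidean ball volume are assumptions made here.

```python
import math


def knn_density(dist, k, dim):
    """Standard kth nearest neighbour density estimate at each sample point.

    dist: symmetric n x n matrix of pairwise distances (list of lists).
    k:    which neighbour defines the ball radius (1 <= k <= n - 1).
    dim:  dimension used for the ball volume (an assumption of this sketch;
          a pure dissimilarity matrix does not carry a dimension).
    """
    n = len(dist)
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    # Volume of the unit ball in `dim` dimensions.
    unit_ball = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)
    densities = []
    for i in range(n):
        # Distances from point i to all other points, in ascending order.
        others = sorted(dist[i][j] for j in range(n) if j != i)
        r_k = others[k - 1]  # distance to the kth nearest neighbour
        volume = unit_ball * r_k ** dim
        # Coincident points give a zero-volume ball; report infinite density.
        densities.append(k / (n * volume) if volume > 0 else float("inf"))
    return densities


# Four points on a line: three close together and one far away.
points = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(a - b) for b in points] for a in points]
print(knn_density(dist, k=1, dim=1))
```

In this toy input the isolated point at 10 receives a much lower estimate than the three clustered points, which is the kind of contrast a density-based clustering procedure relies on.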

