A probabilistic model for co-occurrence analysis in bibliometrics
Abstract
The co-occurrence analysis of Medical Subject Headings (MeSH) terms extracted from the PubMed database is widely used in bibliometrics. In practice, to make the results interpretable, the co-occurrence matrix must be filtered to remove low-frequency items, which are poorly representative. Unfortunately, little research has addressed how to determine a critical threshold for removing noise from the co-occurrence matrix. Here, we propose a probabilistic model for co-occurrence analysis that provides statistical inference about whether paired items co-occur randomly. With the help of this model, the dimensionality of the co-occurrence matrix can be reduced according to the selected threshold. The conceptual model framework, simulations, and practical applications are illustrated in the manuscript. Further details (including all reproducible code) can be downloaded from the project website: https://github.com/xizhou/co-occurrence-analysis.git.
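To make the idea of testing whether two terms co-occur randomly more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypergeometric null model, in which the articles indexed with one term are treated as a random draw from the corpus; the abstract does not state which distribution the paper uses. The function names, the input dictionaries, and the default threshold of 0.05 are hypothetical and chosen only for illustration.

```python
from math import comb


def cooccurrence_p_value(n_docs, n_a, n_b, n_ab):
    """Upper-tail probability P(X >= n_ab) under an ASSUMED hypergeometric null.

    n_docs: total number of articles in the corpus
    n_a, n_b: number of articles indexed with term A and with term B
    n_ab: observed number of articles indexed with both terms
    """
    upper = min(n_a, n_b)
    total = comb(n_docs, n_b)
    # math.comb returns 0 when k > n, so impossible overlaps contribute nothing.
    tail = sum(
        comb(n_a, k) * comb(n_docs - n_a, n_b - k)
        for k in range(n_ab, upper + 1)
    )
    return tail / total


def filter_pairs(pair_counts, term_counts, n_docs, alpha=0.05):
    """Keep only the pairs whose co-occurrence is unlikely under the null.

    pair_counts maps (term_a, term_b) to the observed co-occurrence count;
    term_counts maps each term to its article count. alpha is a hypothetical
    threshold, not a value taken from the paper.
    """
    kept = {}
    for (a, b), n_ab in pair_counts.items():
        p = cooccurrence_p_value(n_docs, term_counts[a], term_counts[b], n_ab)
        if p < alpha:
            kept[(a, b)] = p
    return kept


if __name__ == "__main__":
    # Toy counts, invented purely for illustration.
    term_counts = {"A": 3, "B": 3, "C": 5}
    pair_counts = {("A", "B"): 3, ("A", "C"): 1}
    print(filter_pairs(pair_counts, term_counts, n_docs=10))
```

In this sketch, pairs that fail the test would be dropped from the co-occurrence matrix, which is one way the selected threshold could reduce its dimensionality. When many pairs are tested at once, a multiple-testing correction would normally be applied before fixing the threshold.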