Unsupervised approaches for automatic keyword extraction using meeting transcripts

Feifan Liu (The University of Texas at Dallas), Deana L. Pennell (The University of Texas at Dallas), Fei Liu (The University of Texas at Dallas), Yang Liu (The University of Texas at Dallas)
January 1, 2009
Cited by 198 · Open Access

Abstract

This paper explores several unsupervised approaches to automatic keyword extraction from meeting transcripts. Within the TFIDF (term frequency, inverse document frequency) weighting framework, we incorporate part-of-speech (POS) information, word clustering, and sentence salience scores. We also evaluate a graph-based approach that measures the importance of a word based on its connections with other sentences or words. System performance is evaluated in several ways: comparison to human-annotated keywords using F-measure, a weighted score relative to the oracle system performance, and a novel alternative human evaluation. Our results show that the simple unsupervised TFIDF approach performs reasonably well, and that the additional information from POS and sentence scores helps keyword extraction. The graph-based method, however, is less effective for this domain. Experiments were also performed on speech recognition output, where we observed degraded performance and different patterns compared to human transcripts.
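To make the TFIDF baseline and the F-measure comparison concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not give the exact weighting formula, so the sketch assumes the textbook form (relative term frequency times log inverse document frequency) and leaves out the POS, word clustering, and sentence salience extensions. The function names `tfidf_keywords` and `f_measure` and the parameter `top_k` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=5):
    """Rank the words of docs[doc_index] by TF-IDF and return the top_k.

    docs is a list of tokenized documents (lists of lowercase words).
    Assumes the textbook weighting tf * log(N / df), which may differ
    from the variant used in the paper.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each word appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    # Relative term frequency within the target document.
    tf = Counter(docs[doc_index])
    total = len(docs[doc_index])

    scores = {
        word: (count / total) * math.log(n_docs / df[word])
        for word, count in tf.items()
    }

    # Highest score first; ties are broken alphabetically so the
    # ranking is deterministic.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def f_measure(predicted, reference):
    """Balanced F-measure between predicted and reference keyword sets."""
    pred, ref = set(predicted), set(reference)
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

In this sketch a word that occurs in every document gets an inverse document frequency of log(1) = 0, so function words such as "the" drop to the bottom of the ranking without an explicit stopword list. A real system for meeting transcripts would still need tokenization, and probably stopword and disfluency filtering, before this step.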

