The SuperSID project: exploiting high-level information for high-accuracy speaker recognition

D.A. Reynolds (Moscow Institute of Thermal Technology), W.D. Andrews (United States Department of Defense), Jessica K. Campbell (Moscow Institute of Thermal Technology), Jiří Navrátil (IBM (United States)), Barbara Peskin, André Adami, Qin Jin (California Miramar University), D. Klusacek, Josh Abramson (York University), Roxana Mihăescu (Princeton University), J. Godfrey (United States Department of Defense), Douglas A. Jones (Moscow Institute of Thermal Technology), Bing Xiang (Cornell University)
2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
January 23, 2004
Cited by 221

Abstract

The area of automatic speaker recognition has been dominated by systems using only short-term, low-level acoustic information, such as cepstral features. While these systems have indeed produced very low error rates, they ignore other levels of information beyond low-level acoustics that convey speaker information. Recently published work has shown that such high-level information can be used successfully in automatic speaker recognition systems and has the potential to improve accuracy and add robustness. For the 2002 JHU CLSP summer workshop, the SuperSID project (http://www.clsp.jhu.edu/ws2002/groups/supersid/) was undertaken to exploit these high-level information sources and dramatically increase speaker recognition accuracy on a defined NIST evaluation corpus and task. The paper provides an overview of the structure, data, task, tools, and accomplishments of this project. Wide-ranging approaches using pronunciation models, prosodic dynamics, pitch and duration features, phone streams, and conversational interactions were explored and developed. We show how these novel features and classifiers indeed provide complementary information and can be fused together to drive down the equal error rate on the 2001 NIST extended data task to 0.2%, a 71% relative reduction in error over the previous state of the art.
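
The headline numbers can be made concrete with a small sketch. The abstract does not say how the systems were fused or what their scores looked like, so everything below is an assumption for illustration only: the function names are hypothetical, the toy scores are invented, and equal-weight averaging stands in for whatever fusion the project actually used. The sketch shows how an equal error rate (EER) is read off a set of trial scores, how two systems that fail on different trials can fuse to a lower EER, and what prior EER a 71% relative reduction to 0.2% implies.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a true-speaker trial scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: an impostor trial scored at or above the threshold.
        p_fa = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            # Report the midpoint of the two rates where they are closest.
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer


def fuse_mean(scores_a, scores_b):
    """Equal-weight score-level fusion (an assumed stand-in, not the paper's method)."""
    return [(a + b) / 2 for a, b in zip(scores_a, scores_b)]


def implied_baseline(improved, relative_reduction):
    """Error rate before a given relative reduction was applied."""
    return improved / (1 - relative_reduction)


# Invented toy scores: system A and system B each misrank a different trial.
a_target, a_impostor = [2.0, 1.5, -0.5, 1.0], [-1.0, 0.5, -2.0, -1.5]
b_target, b_impostor = [-0.5, 1.0, 2.0, 1.5], [0.5, -1.0, -1.5, -2.0]

eer_a = equal_error_rate(a_target, a_impostor)
eer_b = equal_error_rate(b_target, b_impostor)
eer_fused = equal_error_rate(fuse_mean(a_target, b_target),
                             fuse_mean(a_impostor, b_impostor))

# Prior EER (in percent) implied by the abstract's 0.2% and 71% figures.
previous_eer_percent = implied_baseline(0.2, 0.71)
```

On these toy scores each system alone has an EER of 0.25, while the fused scores separate targets from impostors perfectly, which is the sense in which complementary systems help. The arithmetic in the last line puts the previous state of the art at roughly 0.7% EER, subject to rounding in the reported figures.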

