Speaker Recognition.
Date of Submission
December 2005
Date of Award
Winter 12-12-2006
Institute Name (Publisher)
Indian Statistical Institute
Document Type
Master's Dissertation
Degree Name
Master of Technology
Subject Name
Computer Science
Department
Computer Vision and Pattern Recognition Unit (CVPR-Kolkata)
Supervisor
Mitra, Mandar (CVPR-Kolkata; ISI)
Abstract (Summary of the Work)
We have concentrated on the speaker identification part of the speaker recognition problem. We study the classification and identification of speakers using Gaussian mixture models (GMM) and mel-frequency cepstral coefficients (MFCC). Owing to its reported superior performance, especially under adverse conditions, MFCC is becoming an increasingly popular choice of feature extraction front end for spoken language systems. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modelling speaker identity. A complete experimental evaluation is conducted on two data sets, one of 7 speakers and one of 21 speakers. Using clean speech utterances, the GMM attains 100% accuracy on the 7-speaker data and 97.3% on the 21-speaker data.
Control Number
ISI-DISS-2005-150
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
DOI
http://dspace.isical.ac.in:8080/jspui/handle/10263/6319
Recommended Citation
Nangalia, Sulabh, "Speaker Recognition." (2006). Master’s Dissertations. 249.
https://digitalcommons.isical.ac.in/masters-dissertations/249
Comments
ProQuest Collection ID: http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:28843273