Robust speaker identification using fusion of features and classifiers
International Journal of Machine Learning and Computing
Speaker identification using Gaussian Mixture Models (GMMs) based on Mel Frequency Cepstral Coefficients (MFCCs) as features, proposed by Reynolds (1995), is one of the most effective approaches available in the literature. The use of GMMs for modeling speaker identity is motivated by two considerations: the interpretation that the Gaussian components represent some general speaker-dependent spectral shapes, and the capability of mixtures to model arbitrary densities. In this work, we have established empirically that combining two different well-known sets of features (MFCCs and Perceptual Linear Predictive Coefficients) and using ensemble classifiers, in conjunction with principal component transformation and some robust estimation procedures, can significantly enhance the performance of MFCC-GMM speaker recognition systems on the benchmark speech corpus NTIMIT.
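As a rough illustration of the general pipeline described above, the following minimal sketch fits one GMM per speaker on fused features and identifies a test utterance by maximum average log-likelihood. It is not the authors' implementation. Several things are assumptions made purely for illustration: the features are synthetic stand-ins rather than real MFCC or PLP coefficients, fusion is done by simple frame-wise concatenation, the GMMs use diagonal covariances fitted by plain EM, and the helper names (`fit_diag_gmm`, `avg_log_likelihood`, `make_speaker`) are hypothetical. The ensemble classifiers and robust estimation procedures mentioned in the abstract are not reproduced here.

```python
import numpy as np

def log_sum_exp(a):
    # Numerically stable log(sum(exp(a))) along axis 1.
    m = a.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True)))[:, 0]

def component_log_probs(X, weights, means, variances):
    # Log of (weight * diagonal Gaussian density) per frame and component.
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    return np.log(weights) + log_gauss

def fit_diag_gmm(X, n_components=2, n_iter=30, seed=0, floor=1e-3):
    # Plain EM for a diagonal-covariance GMM (an assumed, simplified model).
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    means = X[rng.choice(n, n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + floor, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each frame.
        log_p = component_log_probs(X, weights, means, variances)
        resp = np.exp(log_p - log_sum_exp(log_p)[:, None])
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, floor)
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    # Average per-frame log-likelihood of an utterance under one speaker model.
    return log_sum_exp(component_log_probs(X, *gmm)).mean()

def make_speaker(speaker, n_frames, rng):
    # Synthetic stand-ins for two feature sets (NOT real MFCC/PLP features).
    mfcc_like = rng.normal(4.0 * speaker, 1.0, size=(n_frames, 4))
    plp_like = rng.normal(4.0 * speaker, 1.0, size=(n_frames, 3))
    # Assumed fusion rule: concatenate the two feature vectors per frame.
    return np.hstack([mfcc_like, plp_like])

rng = np.random.default_rng(42)
speakers = [0, 1, 2]
train = {s: make_speaker(s, 200, rng) for s in speakers}
test = {s: make_speaker(s, 100, rng) for s in speakers}

# Principal component transformation estimated on the pooled training frames.
pooled = np.vstack([train[s] for s in speakers])
center = pooled.mean(axis=0)
_, _, vt = np.linalg.svd(pooled - center, full_matrices=False)
projection = vt[:3].T  # keep the top 3 principal directions

def transform(X):
    return (X - center) @ projection

# One GMM per speaker, trained on the transformed fused features.
models = {s: fit_diag_gmm(transform(train[s])) for s in speakers}

# Identify each test utterance as the speaker whose model scores it highest.
predictions = []
for s in speakers:
    scores = [avg_log_likelihood(transform(test[s]), models[m]) for m in speakers]
    predictions.append(speakers[int(np.argmax(scores))])
print(predictions)
```

In this toy setting the three synthetic speakers are well separated, so each test utterance should be assigned to its own model. Real speech features overlap far more, which is the situation the fusion and ensemble strategies in the paper are meant to address.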
Bose, Smarajit; Pal, Amita; Mukherjee, Anish; and Das, Debasmita, "Robust speaker identification using fusion of features and classifiers" (2017). Journal Articles. 2381.