International Science Index


10010982

Speaker Recognition Using LIRA Neural Networks

Abstract:

This article reports our investigation of speaker recognition from voice. We created a voice database containing different phrases in two languages, English and Spanish, spoken by men and women. As the classifier we selected the LIRA (Limited Receptive Area) grayscale neural classifier, which was developed for image recognition tasks and has demonstrated good results there. We therefore built a recognition system that applies this classifier to voice: it takes spectrograms of the voice signals as input, extracts their characteristics, and identifies the speaker from a specific set of speakers. The results are described and analyzed in this article. The classifier can be used for speaker identification in security systems or in smart buildings with different types of intelligent devices.
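
To make the input stage concrete, the following is a minimal sketch of how a voice signal might be turned into a grayscale spectrogram image of the kind an image classifier could consume. It is an illustration under stated assumptions, not the authors' implementation: the function name `spectrogram`, the frame length, the hop size, the Hann window, and the log scaling are all hypothetical choices that the abstract does not specify, and the LIRA classifier itself is not shown.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a grayscale spectrogram as rows of 0-255 ints (time x frequency).

    frame_len and hop are illustrative values, not taken from the article.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT, kept for clarity; a real system would use an FFT.
        mags = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            # Log compression (an assumption) keeps weak components visible.
            mags.append(math.log1p(abs(acc)))
        rows.append(mags)
    # Scale log-magnitudes to the 0..255 grayscale range.
    peak = max((m for row in rows for m in row), default=0.0) or 1.0
    return [[round(255 * m / peak) for m in row] for row in rows]


# Synthetic stand-in for a voice recording: a 1 kHz tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(512)]
image = spectrogram(tone)
```

Each row of `image` is one time frame and each column one frequency bin, so the result can be treated as a grayscale picture. With the values assumed here, each bin spans 125 Hz, so the 1 kHz test tone shows up as a bright column at bin 8.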
