Text independent emotion recognition for Telugu speech by using prosodic features

  • Authors

    • Kasiprasad Mannepalli
    • Suman Maloji
    • Panyam Narahari Sastry
    • Swetha Danthala
    • Durgaprasad Mannepalli
    2018-03-18
    https://doi.org/10.14419/ijet.v7i2.7.10887
  • Keywords: Emotion Recognition, Telugu Speech Emotion, Prosodic Features.
  • Abstract

    Human speech delivers different types of information about both the speaker and the speech itself. From the speech production side, the speech signal carries linguistic information, such as the message and the language, and also conveys the speaker's emotional, geographical, and physiological characteristics. This paper focuses on automatically identifying the emotion of a speaker from a sample of speech. The speech signals considered in this work are collected from Telugu speakers. The features considered include pitch, pitch-related prosody, energy, and formants. The overall recognition accuracy obtained in this work is 72%.

     

  • How to Cite

    Mannepalli, K., Maloji, S., Narahari Sastry, P., Danthala, S., & Mannepalli, D. (2018). Text independent emotion recognition for Telugu speech by using prosodic features. International Journal of Engineering & Technology, 7(2.7), 594-596. https://doi.org/10.14419/ijet.v7i2.7.10887

    Received date: 2018-04-01

    Accepted date: 2018-04-01

    Published date: 2018-03-18