Emolah: A Malay Language Spontaneous Speech Emotion Recognition on iOS Platform

  • Authors

    • Izzad Ramli
    • Nursuriati Jamil
    • Norizah Ardi
    • Raseeda Hamzah

  • Published: 2018-08-13

  • DOI: https://doi.org/10.14419/ijet.v7i3.15.17520
  • Keywords: Speech emotion recognition, spontaneous speech, Internet of Things, iOS platform, Malay language.
  • Abstract

    This paper presents the implementation of spontaneous speech emotion recognition (SER) on a smartphone running the iOS platform. The novelty of this work is that, at the time of writing, no similar work had been done on Malay language spontaneous speech. Developing SER on a mobile device is important for ease of use anytime and anywhere, and the main factor to be considered is the computational complexity of classifying emotions in real time. We therefore introduce EmoLah, a Malay language spontaneous SER system that is able to recognize emotions on the go with a satisfactory accuracy rate. Pitch and energy prosody features are used to represent the emotions in the spontaneous speech, and a Naïve Bayes learning model is selected as the classifier. EmoLah is trained and tested on Malay language spontaneous speech acquired from television talk shows, live interviews from news broadcasts, and mini-parliament sessions conducted by children. Four types of speech emotion are collected: happy, sad, angry, and neutral, with a total duration of four hours. The emotion model is trained using MATLAB scripts, and the trained weights are implemented in Xcode for iOS application development. EmoLah's accuracy is evaluated using a cross-validation test; the results show that it can discriminate angry, sad, and happy speech, although most emotions are misclassified as neutral.
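
    The abstract describes a two-stage pipeline: frame-level prosody features (pitch and energy) extracted from the speech signal, then a Naïve Bayes classifier trained offline in MATLAB whose weights are deployed in the iOS app. The Swift sketch below illustrates that kind of pipeline on the deployment side; the frame handling, autocorrelation pitch search, voicing threshold, and Gaussian parameter layout are all illustrative assumptions, not the authors' implementation.

```swift
import Foundation

// Short-time log energy of one analysis frame (a minimal sketch).
func logEnergy(_ frame: [Float]) -> Float {
    let power = frame.reduce(0) { $0 + $1 * $1 } / Float(frame.count)
    return 10 * log10(max(power, 1e-10))   // floor avoids log(0)
}

// Crude pitch (F0) estimate via autocorrelation over a typical speech
// range of 60-400 Hz; returns 0 for frames judged unvoiced.
func pitchHz(_ frame: [Float], sampleRate: Float = 16_000) -> Float {
    let minLag = Int(sampleRate / 400)     // highest F0 considered
    let maxLag = Int(sampleRate / 60)      // lowest F0 considered
    guard frame.count > maxLag else { return 0 }
    var bestLag = 0
    var bestCorr: Float = 0
    for lag in minLag...maxLag {
        var corr: Float = 0
        for i in 0..<(frame.count - lag) { corr += frame[i] * frame[i + lag] }
        if corr > bestCorr { bestCorr = corr; bestLag = lag }
    }
    let energy = frame.reduce(0) { $0 + $1 * $1 }
    // The 0.3 voicing threshold is an assumption for illustration.
    return (energy > 0 && bestCorr / energy > 0.3) ? sampleRate / Float(bestLag) : 0
}

// Per-class Gaussian parameters for each feature dimension, e.g. as they
// might be exported from an offline MATLAB training step.
struct EmotionModel {
    let label: String          // "happy", "sad", "angry", "neutral"
    let prior: Double          // class prior probability
    let means: [Double]        // per-feature means
    let variances: [Double]    // per-feature variances
}

// Gaussian Naive Bayes: choose the class with the highest log posterior.
// Assumes `models` is non-empty and dimensions match `features`.
func classify(features: [Double], models: [EmotionModel]) -> String {
    func logPosterior(_ m: EmotionModel) -> Double {
        var lp = log(m.prior)
        for (i, x) in features.enumerated() {
            let v = m.variances[i]
            lp += -0.5 * (log(2 * Double.pi * v) + (x - m.means[i]) * (x - m.means[i]) / v)
        }
        return lp
    }
    return models.max { logPosterior($0) < logPosterior($1) }!.label
}
```

    In a real app the frames would come from microphone buffers (e.g. via AVAudioEngine), per-frame pitch and energy values would typically be pooled into utterance-level statistics before classification, and the EmotionModel parameters would be the weights exported from the MATLAB training step; here everything is a placeholder.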

  • How to Cite

    Ramli, I., Jamil, N., Ardi, N., & Hamzah, R. (2018). Emolah: A Malay Language Spontaneous Speech Emotion Recognition on iOS Platform. International Journal of Engineering & Technology, 7(3.15), 151-156. https://doi.org/10.14419/ijet.v7i3.15.17520