An epitomization of stress recognition from speech signal

  • Authors

    • Veena Narayanan, Amrita Vishwa Vidyapeetham
    • S Lalitha, Amrita Vishwa Vidyapeetham
    • Deepa Gupta, Amrita Vishwa Vidyapeetham
    2018-08-02
    https://doi.org/10.14419/ijet.v7i2.27.10123
  • Keywords: Stress Recognition, Feature Extraction, Statistical Measures, MFCC, LPCC.
  • Abstract

    The detection of stress from speech signals has recently gained considerable attention. New methods for feature extraction and classification have led to different solutions for detecting stress conditions from human speech and to an increase in the accuracy of stress recognition. A large number of parameters have been proposed for characterizing stress in speech. Similarly, numerous classifiers and machine learning algorithms have been investigated for stress classification and regression. This article reviews the commonly used databases, stress conditions, feature extraction methods, and classifiers, along with some of the statistical measures and compensation techniques used for stress detection. After a thorough illustration of the existing methodology for the task, future prospects for the work are elaborated.

  • How to Cite

    Narayanan, V., Lalitha, S., & Gupta, D. (2018). An epitomization of stress recognition from speech signal. International Journal of Engineering & Technology, 7(2.27), 61-68. https://doi.org/10.14419/ijet.v7i2.27.10123

    Received date: 2018-03-14

    Accepted date: 2018-05-02

    Published date: 2018-08-02