An approach to spectral analysis of psychologically influenced speech

  • Authors

    • Bhagyalaxmi Jena
    • Sudhansu Sekhar Singh
    2017-12-28
    https://doi.org/10.14419/ijet.v7i1.2.8993
  • Keywords

    Speech Signal, Stress, Fast Fourier Transform, Spectrogram, Power Spectral Density
  • Abstract

    A speech signal carries both information content and emotional content, such as stress or fatigue, at a particular period of time. This paper defines the various types of stress and their effects. To analyze how stressed speech differs from normal speech, a database was created that captures stress among students during examinations at our college. The spectral analysis of speech emphasizes three parameters: the Fast Fourier Transform (FFT), the spectrogram, and the Power Spectral Density (PSD). These parameters were simulated using MATLAB code and are compared between normal speech and psychologically stressed speech.
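    The three spectral parameters named in the abstract can be illustrated with a minimal sketch. The paper reports MATLAB simulations; the version below swaps in Python with NumPy and is not the authors' code. The two test signals are synthetic tones at arbitrary frequencies, used only as hypothetical stand-ins for a "normal" and a "stressed" recording, and the function names, frame length, and hop size are all assumptions made for illustration. The PSD is estimated here with a Welch-style average of windowed frames, which may differ from the estimator used in the paper.

```python
import numpy as np


def magnitude_spectrum(x, fs):
    """One-sided FFT magnitude spectrum of a real signal."""
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    mag = np.abs(np.fft.rfft(x)) / n
    return freqs, mag


def frame_signal(x, frame_len, hop):
    """Split x into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def spectrogram(x, fs, frame_len=256, hop=128):
    """Power spectrogram from Hann-windowed frames, shape (n_frames, n_bins)."""
    window = np.hanning(frame_len)
    frames = frame_signal(x, frame_len, hop) * window
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    times = (np.arange(frames.shape[0]) * hop + frame_len / 2) / fs
    return freqs, times, power


def welch_psd(x, fs, frame_len=256, hop=128):
    """One-sided PSD estimate: average of windowed periodograms."""
    window = np.hanning(frame_len)
    frames = frame_signal(x, frame_len, hop) * window
    scale = fs * np.sum(window ** 2)
    psd = np.mean(np.abs(np.fft.rfft(frames, axis=1)) ** 2, axis=0) / scale
    # Fold negative frequencies in; DC and Nyquist are not doubled
    # (assumes frame_len is even).
    psd[1:-1] *= 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    return freqs, psd


# Hypothetical stand-in signals: pure tones, NOT real speech data.
fs = 8000                               # assumed sampling rate in Hz
t = np.arange(fs) / fs                  # one second of samples
normal = np.sin(2 * np.pi * 200 * t)    # arbitrary "normal" tone
stressed = np.sin(2 * np.pi * 250 * t)  # arbitrary "stressed" tone

for name, sig in (("normal", normal), ("stressed", stressed)):
    f_fft, mag = magnitude_spectrum(sig, fs)
    f_spec, t_spec, power = spectrogram(sig, fs)
    f_psd, psd = welch_psd(sig, fs)
    print(name,
          "FFT peak (Hz):", f_fft[np.argmax(mag)],
          "spectrogram shape:", power.shape,
          "PSD peak (Hz):", f_psd[np.argmax(psd)])
```

    With real recordings, the synthetic tones would be replaced by the sampled waveforms, and the two sets of outputs could then be compared side by side in the way the abstract describes.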

  • How to Cite

    Jena, B., & Sekhar Singh, S. (2017). An approach to spectral analysis of psychologically influenced speech. International Journal of Engineering & Technology, 7(1.2), 66-70. https://doi.org/10.14419/ijet.v7i1.2.8993

    Received date: 2017-12-30

    Accepted date: 2017-12-30

    Published date: 2017-12-28