Real-time device control via offline voice recognition

  • Abstract

    The steady growth in the number and ownership of mobile devices introduces a variety of limitations, several of which revolve around interactivity. Heavy dependence on haptic (touch-based) interaction has led to dropped devices, slower time to interaction, health concerns, and limited support for disabled users, among other problems. Innovative techniques are therefore needed to make interaction with these devices easier. To this end, a real-time voice recognition algorithm is formulated that gives mobile device users the freedom to move about and reduces the need to constantly glance at the screen, by allowing them to verbally command their devices to carry out ordinary tasks. A further distinguishing feature is offline access: commands issued by a user are processed and executed locally on the device.
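    The offline behavior described above, where a recognized utterance is matched to a local command table and executed without any network round trip, can be sketched as follows. This is an illustrative sketch only: the dispatcher class, command phrases, and actions are assumptions for demonstration, not the paper's implementation, and the transcript is taken as already produced by an on-device recognizer.

    ```python
    from typing import Callable, Dict, Optional

    class OfflineCommandDispatcher:
        """Maps recognized utterances to device actions, entirely on-device."""

        def __init__(self) -> None:
            self._commands: Dict[str, Callable[[], str]] = {}

        def register(self, phrase: str, action: Callable[[], str]) -> None:
            # Normalize the trigger phrase so matching is case-insensitive.
            self._commands[phrase.strip().lower()] = action

        def dispatch(self, transcript: str) -> Optional[str]:
            # Execute the first registered command contained in the transcript.
            text = transcript.strip().lower()
            for phrase, action in self._commands.items():
                if phrase in text:
                    return action()
            return None  # Unrecognized command; no network fallback is used.

    # Hypothetical commands for illustration.
    dispatcher = OfflineCommandDispatcher()
    dispatcher.register("open camera", lambda: "camera opened")
    dispatcher.register("read messages", lambda: "reading messages aloud")

    result = dispatcher.dispatch("please open camera now")
    ```

    Because both recognition and dispatch run locally, latency stays low and the device remains usable without connectivity, which is the property the abstract emphasizes.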

  • Keywords

    Mobile Devices; Real Time; Voice Recognition; Screen.

  • References

      [1] Davis, K., Biddulph, R., and Balashek, S., “Automatic Recognition of Spoken Digits,” J. Acoust. Soc. Am., vol. 24, Nov 1952, p. 637.

      [2] Hemdal, J.F. and Hughes, G.W., A feature based computer recognition program for the modeling of vowel perception, in Models for the Perception of Speech and Visual Form, Wathen-Dunn, W. Ed. MIT Press, Cambridge, MA.

      [3] De Wachter, M., Matton, M., Demuynck, K., Wambacq, P., Cools, R., “Template-Based Continuous Speech Recognition”, IEEE Transactions on Audio, Speech, and Language Processing, 2007.

      [4] Samoulian, A., “Knowledge Based Approach to Speech Recognition”, 1994.

      [5] Tripathy, H. K., Tripathy, B. K., Das, P. K., “A Knowledge based Approach Using Fuzzy Inference Rules for Vowel Recognition”, Journal of Convergence Information Technology Vol. 3 No 1, March 2008.

      [6] Savage, J., Rivera, C., Aguilar, V., “Isolated word speech recognition using Vector Quantization Techniques and Artificial Neural Networks”, 1991.

      [7] Debyeche, M., Haton, J.P., Houacine, A., “Improved Vector Quantization Technique for Discrete HMM Speech Recognition System”, International Arab Journal of Information Technology, Vol. 4, No. 4, October 2007.

      [8] Hatulan, R. J. F., Chan, A. J. L., Hilario, A. D., Lim, J. K. T., and Sybingco, E., “Speech to text converter for Filipino Language using Hybrid Artificial Neural Network and Hidden Markov Model”, ECE Student Forum December 1, 2007 De La Salle University.

      [9] Sathesh Kumar, K., Shankar, K., Ilayaraja, M., and Rajesh, M., “Sensitive Data Security in Cloud Computing Aid of Different Encryption Techniques”, Journal of Advanced Research in Dynamical and Control Systems, vol. 18, no. 23, 2017.

      [10] Sendra, J. P., Iglesias, D. M., Maria, F. D., “Support Vector Machines For Continuous Speech Recognition”, 14th European Signal Processing Conference 2006, Florence, Italy, Sept 2006.

      [11] Jain, R. and Saxena, S. K., “Advanced Feature Extraction & Its Implementation in Speech Recognition System”, IJSTM, Vol. 2, Issue 3, July 2011.

      [12] Aggarwal, R.K. and Dave, M., “Acoustic Modelling Problem for Automatic Speech Recognition System: Conventional Methods (Part I)”, International Journal of Speech Technology (2011) 14:297–308.

      [13] Aggarwal, R. K. and Dave, M., “Acoustic modelling problem for automatic speech recognition system: advances and refinements (Part II)”, International Journal of Speech Technology (2011) 14:309–320.

      [14] Ostendorf, M., Digalakis, V., & Kimball, O. A. (1996). From HMM's to segment models: a unified view of stochastic modeling for speech recognition. IEEE Transactions on Speech and Audio Processing, 4(5), 360–378.

      [15] Fujii, Y., Yamamoto, K., and Nakagawa, S., “Automatic Speech Recognition Using Hidden Conditional Neural Fields”, ICASSP 2011, pp. 5036–5039.

      [16] Mohamed, A. R., Dahl, G. E., and Hinton, G., “Acoustic Modelling Using Deep Belief Networks”, submitted to IEEE Transactions on Audio, Speech, and Language Processing, 2010.

      [17] Sorensen, J., and Allauzen, C., “Unary data structures for Language Models”, INTERSPEECH 2011.

      [18] Kain, A., Hosom, J. P., Ferguson, S. H., Bush, B., “Creating a speech corpus with semi-spontaneous, parallel conversational and clear speech”, Tech Report: CSLU-11-003, August 2011.

      [19] Hamdani, G. D., Selouani, S. A., Boudraa, M., “Algerian Arabic Speech Database (Algasd): Corpus Design and Automatic Speech Recognition Application”, the Arabian Journal for Science and Engineering, Volume 35, Number 2c, Dec 2010.

Article ID: 10011
DOI: 10.14419/ijet.v7i1.9.10011

Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.