Deformed character recognition using convolutional neural networks

  • Authors

    • N Shobha Rani, Amrita Vishwa Vidyapeetham, Mysuru
    • N Chandan
    • A Sajan Jain
    • H R. Kiran
    2018-07-26
    https://doi.org/10.14419/ijet.v7i3.14053
  • Ancient Documents, Deep Neural Networks, Degraded Character Recognition, Handwritten Text, Kannada Documents, Printed Text, South Indian Script.
  • Achieving high accuracy in South Indian character recognition remains a challenging research problem. This paper focuses on recognition of Kannada, one of the most widely used South Indian scripts. In particular, the experiments target degraded character images extracted from ancient Kannada poetry documents, as well as handwritten character images collected in various unconstrained environments. Character images in the degraded documents are slightly blurred, which gives the characters a broken and messy appearance; this leads to inconsistent behavior of the recognition algorithm, which in turn reduces recognition accuracy. The degraded character image samples are used to train a deep convolutional neural network, AlexNet. Performance is evaluated on handwritten datasets gathered synthetically from users in the age groups 18-21, 22-25 and 26-30, and on printed datasets extracted from ancient document images of Kannada poetry/literature. The datasets comprise around 497 classes, of which 428 cover consonants, vowels, simple compound characters and complex compound characters. In handwritten text, each base character combined with consonant/vowel modifiers whose diacritics overlap or touch is treated as a separate class. Compound characters whose components do not overlap or touch are still considered as individual classes, for which semantic analysis is carried out during the post-processing stage of OCR. AlexNet achieves a classification accuracy of 91.3% on printed character samples and 92% on handwritten text.
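To make the classifier setup in the abstract more concrete, the following is a minimal sketch that traces feature-map sizes through an AlexNet-style layer stack and sizes the output layer for the 497 classes mentioned above. Only the class count comes from the abstract. The 224x224 input size, the layer hyperparameters (those of the commonly used single-stream AlexNet variant), and the helper names `out_size`, `trace` and `last_layer_params` are assumptions made for illustration; the paper's actual configuration may differ.

```python
# Sketch only: layer settings follow the widely used single-stream AlexNet
# variant and a 224x224 RGB input. Both are assumptions, not taken from the paper.

NUM_CLASSES = 497  # class count stated in the abstract

# (name, kernel, stride, padding, out_channels); out_channels=None marks pooling
LAYERS = [
    ("conv1", 11, 4, 2, 64),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 192),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 256),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]


def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace(input_size=224, in_channels=3):
    """Return (name, channels, height, width) after each layer."""
    size, channels = input_size, in_channels
    shapes = []
    for name, kernel, stride, padding, out_channels in LAYERS:
        size = out_size(size, kernel, stride, padding)
        if out_channels is not None:  # pooling keeps the channel count
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes


def last_layer_params(hidden=4096, num_classes=NUM_CLASSES):
    """Weights plus biases of the final fully connected (output) layer."""
    return hidden * num_classes + num_classes


shapes = trace()
_, channels, height, width = shapes[-1]
flat_features = channels * height * width  # input width of the first FC layer

for name, c, h, w in shapes:
    print(f"{name}: {c} x {h} x {w}")
print("flattened features:", flat_features)
print("output-layer parameters for", NUM_CLASSES, "classes:", last_layer_params())
```

Under these assumptions, adapting the network to the Kannada dataset only changes the width of the final layer; the convolutional feature extractor is unaffected by the number of classes.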

  • How to Cite

    Shobha Rani, N., Chandan, N., Sajan Jain, A., & R. Kiran, H. (2018). Deformed character recognition using convolutional neural networks. International Journal of Engineering & Technology, 7(3), 1599-1604. https://doi.org/10.14419/ijet.v7i3.14053