Skew detection based on vertical projection in latin character recognition of text document image

Authors

  • Ronny Susanto
  • Farica P. Putri
  • Y. Widya Wiratama

DOI:

https://doi.org/10.14419/ijet.v7i4.44.26983

Published:

2018-12-01

Keywords:

Optical Character Recognition, Preprocessing, Skew Detection, Projection Profile, Vertical Projection.

Abstract

The accuracy of Optical Character Recognition (OCR) is strongly affected by skew in the input image. Skew detection and correction is an OCR preprocessing step that detects and corrects the skew of a document image. This research measures the effect of the Combined Vertical Projection skew detection method on OCR accuracy. OCR accuracy is measured by Character Error Rate, Word Error Rate, and Word Error Rate (Order Independent). This research also measures the computational time required by Combined Vertical Projection at different iteration values. The experiments use iteration values of 0.5, 1, and 2 with rotation angles from -10 to 10 degrees. The results show that Combined Vertical Projection can lower the Character Error Rate, Word Error Rate, and Word Error Rate (Order Independent) by up to 35.53, 34.51, and 32.74 percent, respectively. A higher iteration value lowers the computational time but also decreases OCR accuracy.
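The trade-off between iteration value and accuracy can be illustrated with a minimal sketch. It assumes that the "iteration" value is the angular step, in degrees, of a search over candidate rotation angles from -10 to 10. It is a generic projection-profile sweep on synthetic vertical strokes, not the Combined Vertical Projection method evaluated in the paper. All function names and the scoring rule (sum of squared column counts) are hypothetical choices made for illustration.

```python
import math


def candidate_angles(lo=-10.0, hi=10.0, step=0.5):
    """Candidate rotation angles in degrees, from lo to hi inclusive."""
    count = int(round((hi - lo) / step)) + 1
    return [lo + i * step for i in range(count)]


def vertical_projection_score(points, angle_deg):
    """Rotate foreground pixels by angle_deg and score the column histogram.

    The score is the sum of squared column counts (an assumed sharpness
    measure): it is largest when vertical strokes fall into few columns.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    columns = {}
    for x, y in points:
        col = round(x * cos_t - y * sin_t)  # x-coordinate after rotation
        columns[col] = columns.get(col, 0) + 1
    return sum(n * n for n in columns.values())


def estimate_correction(points, step):
    """Return the candidate rotation that gives the sharpest vertical projection."""
    return max(candidate_angles(step=step),
               key=lambda a: vertical_projection_score(points, a))


def synthetic_strokes(skew_deg, n_strokes=20, height=200, spacing=15):
    """Foreground pixels of evenly spaced vertical strokes, rotated by skew_deg."""
    theta = math.radians(skew_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    points = []
    for s in range(n_strokes):
        x0 = s * spacing
        for y in range(height):
            points.append((x0 * cos_t - y * sin_t, x0 * sin_t + y * cos_t))
    return points


if __name__ == "__main__":
    page = synthetic_strokes(skew_deg=3.5)
    for step in (0.5, 1, 2):
        n = len(candidate_angles(step=step))
        print(step, n, estimate_correction(page, step))
```

Under these assumptions, a step of 0.5 evaluates 41 candidate angles, a step of 1 evaluates 21, and a step of 2 evaluates 11. The coarser grids do less work but can only return an angle on the grid, so a skew of 3.5 degrees is recovered exactly at step 0.5 and only approximately at the larger steps. The returned value is the correcting rotation, so it has the opposite sign of the simulated skew.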


How to Cite

Susanto, R., Putri, F. P., & Widya Wiratama, Y. (2018). Skew detection based on vertical projection in latin character recognition of text document image. International Journal of Engineering & Technology, 7(4.44), 198–202. https://doi.org/10.14419/ijet.v7i4.44.26983
Received 2019-02-02
Accepted 2019-02-02
Published 2018-12-01