Part of Speech Tagging for Arabic Long Sentence

Ahmed H. Aliwy; Duaa A. Al Raza

doi:10.14419/ijet.v7i3.27.17671


Authors
- Ahmed H. Aliwy
- Duaa A. Al Raza
2018-08-15

Abstract

Part-of-speech (POS) tagging of Arabic words is a difficult and non-trivial task. It has been studied in detail over the last twenty years, and its performance affects many applications and tasks in natural language processing (NLP). Arabic sentences are typically much longer than English sentences, which affects any tagging approach that processes a complete sentence at once, such as a Hidden Markov Model (HMM) tagger. In this paper, a new approach is proposed for using HMM and n-gram taggers to tag Arabic words in long sentences. The proposed approach is simple and easy to implement. It was evaluated on a data set of 1000 documents containing 526,321 manually annotated tokens (including punctuation). The results show that the proposed approach achieves higher accuracy than the HMM and n-gram taggers alone: the F-measures were 0.888, 0.925 and 0.957 for the n-gram tagger, the HMM tagger and the proposed approach, respectively.
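For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of a generic bigram HMM tagger with Viterbi decoding, which scores a whole sentence at once. It is not the authors' proposed approach, which the abstract does not describe in detail. The function names, the add-one smoothing, and the toy English corpus are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

def train_bigram_hmm(tagged_sentences):
    """Count tag-transition and word-emission events (hypothetical helper)."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag] = count
    emit = defaultdict(Counter)   # emit[tag][word] = count
    for sentence in tagged_sentences:
        prev = "<s>"              # sentence-start pseudo-tag
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (an assumed smoothing choice)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, trans, emit):
    """Return the most probable tag sequence for the complete sentence."""
    tags = list(emit)
    n_tags = len(tags)
    # Vocabulary size plus one slot reserved for unseen words.
    n_words = len({w for c in emit.values() for w in c}) + 1
    # best[t] = (log score, tag path) of the best path ending in tag t.
    best = {
        t: (log_prob(trans["<s>"], t, n_tags)
            + log_prob(emit[t], words[0], n_words), [t])
        for t in tags
    }
    for word in words[1:]:
        new_best = {}
        for t in tags:
            e = log_prob(emit[t], word, n_words)
            score, path = max(
                ((best[p][0] + log_prob(trans[p], t, n_tags) + e, best[p][1])
                 for p in tags),
                key=lambda item: item[0],
            )
            new_best[t] = (score, path + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

# Toy corpus (illustrative only, not the paper's data set).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
trans, emit = train_bigram_hmm(corpus)
print(viterbi(["a", "cat", "barks"], trans, emit))  # ['DET', 'NOUN', 'VERB']
```

Because the score of a path is a sum of per-token log probabilities over the entire sentence, decoding cost and the chance of an early error influencing later tags both grow with sentence length, which is the concern the abstract raises for long Arabic sentences.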
How to Cite
Aliwy, A. H., & Al Raza, D. A. (2018). Part of Speech Tagging for Arabic Long Sentence. International Journal of Engineering & Technology, 7(3.27), 125-128. https://doi.org/10.14419/ijet.v7i3.27.17671
Received date: 2018-08-16

Accepted date: 2018-08-16

Published date: 2018-08-15
