A Hybrid Bootstrapping Approach for developing Odiya Named Entity Corpora from Wikipedia
2018-12-03 https://doi.org/10.14419/ijet.v7i4.38.24311
Named Entity Recognition, NER, Wikipedia, Machine Translation, Information Extraction, Information Retrieval.
Named Entity Recognition (NER) is an influential task in natural language processing, relevant to question answering systems, Machine Translation (MT), Information Extraction (IE), Information Retrieval (IR), and similar applications. NER identifies and classifies the named entities present in a given text, such as location names, person names, numbers, organization names, and time expressions. Although considerable progress has been made for several Indian languages, NER remains a major problem for Odiya. Odiya is a resource-constrained language, and a large, accurate corpus for training and testing is still very hard to find. In this paper, we therefore use Wikipedia to develop a large Odiya corpus of annotated named entities that is suitable for further use as a training dataset. On evaluation, the approach achieves a promising F-score of 78.89.
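For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F-score is commonly computed from entity-level counts. The abstract does not state which variant was used, so the balanced F1 (beta = 1) is an assumption here, and the function name `f_score` and the counts in the usage note are hypothetical, not taken from the paper.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score from true positive, false positive,
    and false negative counts. Returns 0.0 when the score is undefined.

    Assumption: counts are per named entity; the paper's exact
    matching criterion is not given in the abstract.
    """
    # Precision: fraction of predicted entities that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of gold entities that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall.
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With illustrative counts of 8 correct entities, 2 spurious, and 2 missed, `f_score(8, 2, 2)` gives precision and recall of 0.8 each and therefore an F1 of 0.8, which would be reported as 80 on the percentage scale used in the abstract.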
Biswas, S., & Dash, S. (2018). A Hybrid Bootstrapping Approach for developing Odiya Named Entity Corpora from Wikipedia. International Journal of Engineering and Technology, 7(4.38), 11-16. https://doi.org/10.14419/ijet.v7i4.38.24311
Accepted date: 2018-12-18
Published date: 2018-12-03