Development of Punjabi-English (PunEng) Parallel Corpus for Machine Translation System

Kamal Deep; Ajit Kumar; Vishal Goyal

doi:10.14419/ijet.v7i2.10762

Authors
- Kamal Deep Punjabi University
- Ajit Kumar Punjabi University
- Vishal Goyal Punjabi University
2018-05-10

Keywords: English, Machine Translation, Parallel Corpus, Punjabi, PunEng Corpus
Abstract

This paper describes the creation process and statistics of the Punjabi-English (PunEng) parallel corpus. A parallel corpus is the main requirement for developing statistical as well as neural machine translation systems. Until now, no PunEng parallel corpus has been available. In this paper, we show the difficulties and the intensive labor involved in developing a parallel corpus. We discuss the methods used for collecting data and the results, and describe the errors encountered during data collection and how to handle them.
How to Cite
Deep, K., Kumar, A., & Goyal, V. (2018). Development of Punjabi-English (PunEng) Parallel Corpus for Machine Translation System. International Journal of Engineering and Technology, 7(2), 690-693. https://doi.org/10.14419/ijet.v7i2.10762
Received date: 2018-03-28

Accepted date: 2018-04-06

Published date: 2018-05-10
