Open Problems in Indonesian Automatic Essay Scoring System
Keywords: Indonesian, natural language processing, automatic essay scoring system, open problems.
This paper presents open problems in Indonesian automatic essay scoring. A previous study compared several similarity metrics for automated essay scoring in Indonesian: Cosine Similarity, Euclidean Distance, and Jaccard. The data used in that research comprise about 2,000 texts, obtained from 50 students who answered 40 questions on politics, sports, lifestyle, and technology. The study also evaluated the effect of stemming on system performance. Across all methods, the difference between using and not using stemming is around 4-9%. The results show that Jaccard is the best metric both with and without stemming, and that Jaccard with stemming has the lowest percentage error of all configurations. The politics category has a higher average similarity score than lifestyle, sports, and technology. With stemming, the percentage error is 52.31% for Jaccard, 59.49% for Cosine Similarity, and 332.90% for Euclidean Distance. Without stemming, the percentage error is 56.05% for Jaccard, 57.99% for Cosine Similarity, and 339.41% for Euclidean Distance. However, these percentage errors are relatively high, more than 50% in every case, which is too high for a functional essay grading system. This paper therefore explores several open problems in this area. The openly available dataset can be used to develop better approaches than the standard similarity metrics. The approaches explored cover feature extraction, similarity metrics, learning algorithms, implementation environment, and performance evaluation.
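For readers unfamiliar with the three metrics compared above, the following is a minimal sketch of how each could be computed between a reference answer and a student answer. It is not the implementation used in the study: the whitespace tokenizer, the raw term-count vectors, the function names, and the two example sentences are all assumptions made for illustration, and no stemming or stopword removal is applied.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a hypothetical stand-in for real
    preprocessing such as stemming or stopword removal)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between the term-count vectors of a and b."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Euclidean distance between the term-count vectors of a and b.
    Unlike the other two metrics, this is a distance: 0 means identical
    counts and the value is unbounded above."""
    ca, cb = Counter(a), Counter(b)
    vocab = ca.keys() | cb.keys()
    return math.sqrt(sum((ca[t] - cb[t]) ** 2 for t in vocab))


def jaccard_similarity(a, b):
    """Size of the shared vocabulary divided by the size of the combined
    vocabulary, ignoring term frequency."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical reference answer and student answer (illustrative only).
reference = tokenize("pemilu adalah pemilihan umum")
answer = tokenize("pemilu adalah proses pemilihan")

print(cosine_similarity(reference, answer))   # 3 shared terms / (2 * 2)
print(euclidean_distance(reference, answer))  # two terms differ by 1 each
print(jaccard_similarity(reference, answer))  # 3 shared / 5 distinct terms
```

Cosine and Jaccard are bounded between 0 and 1 for count vectors, whereas Euclidean distance is unbounded and grows with text length. A scoring system has to map such a distance onto a grade scale before it can be compared with human scores, and the sketch does not attempt that step.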