A study on improving the quality inspection on national information by using Levenshtein distance algorithm

  • Authors

    • Sanggi Lee
    • Inje Kang
    • Eungyeong Kim
    • Kangryul Shon
    • Chulsu Lim
    2018-06-08
    https://doi.org/10.14419/ijet.v7i2.33.13876
  • Background/Objectives: In Korea, considerable effort and budget have been spent on improving national R&D information management. Nevertheless, project summaries of national R&D projects are still not accurate enough to be utilized.

    Methods/Statistical analysis: To examine the accuracy of project summaries, the Levenshtein Distance Algorithm (LDA) was applied. LDA was expected to extract improper project summaries in which parts of sentences are used repeatedly. To evaluate how the algorithm performs on national R&D information in Korea, the project summaries of 53,492 national R&D projects conducted in 2014 were used.

    Findings: Unlike other algorithms, LDA was able to detect project summaries consisting of repeatedly used phrases. In the test with LDA, 3,445 of the 53,492 projects had inaccurate content in their project summaries. In detail, 2,707 projects had an improper research objective, while 712 projects and 26 projects had improper content in the research summary and expected impact, respectively. Although the algorithm made it possible to extract repeatedly used phrases, its running time was a problem; thus, it was only applied offline. In addition, a researcher had to check the output once more to verify the accuracy of the result.

    Improvements/Applications: This paper applied LDA to detect inappropriate project summaries. The result implies that applying LDA can improve the quality of the information and thereby facilitate its utilization.
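
To make the method concrete, the following is a minimal Python sketch of how Levenshtein distance could be used to flag summary fields that largely repeat one another. It is not the authors' implementation: the abstract does not say which texts were compared or what cutoff was used, so the field names (`objective`, `summary`, `expected_impact`), the length-normalized `similarity` score, the helper `flag_repeated_fields`, and the `0.8` threshold are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Length-normalized similarity in [0, 1]; 1.0 means identical.
    The normalization is an assumption, not taken from the paper."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def flag_repeated_fields(
    project: Dict[str, str], threshold: float = 0.8
) -> List[Tuple[str, str]]:
    """Return pairs of field names whose texts are near-duplicates.
    Field names and the threshold are hypothetical."""
    names = sorted(project)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if similarity(project[first], project[second]) >= threshold:
                flagged.append((first, second))
    return flagged


# Hypothetical record: the research summary almost copies the objective.
example = {
    "objective": "Develop a sensor platform.",
    "summary": "Develop a sensor platform",
    "expected_impact": "Improved crop yields in dry regions.",
}
repeated = flag_repeated_fields(example)
```

Each comparison costs time proportional to the product of the two text lengths, which is consistent with the running-time problem noted in the findings when tens of thousands of summaries are checked. Any pair flagged this way would still need the manual confirmation the abstract describes.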

  • How to Cite

    Lee, S., Kang, I., Kim, E., Shon, K., & Lim, C. (2018). A study on improving the quality inspection on national information by using levenshtein distance algorithm. International Journal of Engineering & Technology, 7(2.33), 161-164. https://doi.org/10.14419/ijet.v7i2.33.13876