A new approach to represent textual documents using CVSM


Authors
- Dr. Brahmananda Reddy
- Dr. Y. Sagar
- Dr. P. Subhash
2018-11-14

https://doi.org/10.14419/ijet.v7i4.27430
Keywords: Text Mining, Vector Space Model, Conceptual Vector Space Model, WordNet, NLTK, Clustering.
Abstract

Advances in technology produce vast amounts of data, most of it unstructured. Text mining, the process of extracting high-quality information from text, is valuable for discovering and retrieving useful information from such data. Text mining commonly uses the Vector Space Model (VSM), which transforms unstructured data into structured data using a traditional keyword-based approach. One problem with this approach is that, when a user submits a query, only the documents that match the keywords in the query are retrieved. To overcome this, this paper describes a Conceptual Vector Space Model (CVSM), which helps to categorize documents that have the same content but use different vocabulary. The CVSM is implemented with the help of WordNet and the Natural Language Toolkit (NLTK). Clustering algorithms are then applied to it to form clusters based on concepts.
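The contrast between keyword matching and concept matching can be illustrated with a minimal sketch. It is not the authors' implementation: the `CONCEPT_OF` dictionary is a hypothetical, hand-written stand-in for a WordNet lookup, and the function names, the whitespace tokenizer, and the raw term-frequency weighting are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical stand-in for a WordNet lookup: each word maps to one concept label.
# A real system would derive these labels from WordNet synsets instead.
CONCEPT_OF = {
    "car": "vehicle", "automobile": "vehicle",
    "buy": "purchase", "purchase": "purchase",
}

def tokenize(text):
    # Simplistic tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()

def keyword_vector(text):
    # Traditional VSM: one dimension per surface word, weighted by raw count.
    return Counter(tokenize(text))

def concept_vector(text):
    # Concept-based variant: replace each word by its concept label when known.
    return Counter(CONCEPT_OF.get(tok, tok) for tok in tokenize(text))

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

doc_a = "buy car"
doc_b = "purchase automobile"

keyword_sim = cosine(keyword_vector(doc_a), keyword_vector(doc_b))
concept_sim = cosine(concept_vector(doc_a), concept_vector(doc_b))
print(keyword_sim, concept_sim)
```

In this toy case the two documents share no keywords, so their keyword-based similarity is zero, while their concept vectors coincide. Vectors of this kind could then be passed to any standard clustering algorithm to form concept-based clusters.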
How to Cite
Brahmananda Reddy, Y. Sagar, & P. Subhash. (2018). A new approach to represent textual documents using CVSM. International Journal of Engineering and Technology, 7(4), 4678-4682. https://doi.org/10.14419/ijet.v7i4.27430
Received date: 2019-02-14

Accepted date: 2019-02-14

Published date: 2018-11-14
