An efficient technique for hybrid classification and feature extraction using normalization
2018-08-06 https://doi.org/10.14419/ijet.v7i2.27.14534
Text Mining, Text Classification, Feature Extraction, Feature Selection, Machine Learning
Abstract
Text classification is a technique for assigning a class or label to a document from a set of predefined class labels. Examples of predefined classes are sports, business, technical, education, and science. Classification is a supervised learning technique: the classes are trained with certain features, and a document is then classified based on a similarity measure with the trained document set. Text classification is used in many applications, such as assigning labels to documents, separating spam messages from genuine ones, text filtering, and natural language processing. Feature extraction, feature selection, and classification are the phases of assigning a label to a document. In this paper, principal component analysis (PCA) is used for feature extraction, artificial bee colony (ABC) is used for feature selection, and a support vector machine (SVM) is used for classification. In the proposed approach, PCA is improved by applying normalization using the size of the features, which reduces redundant features to a large extent. Very few research works have implemented PCA, ABC, and SVM together for complete classification. Evaluation parameters such as accuracy, F-measure, and G-mean are calculated to check classifier efficiency. The proposed system is applied to the 20-Newsgroup dataset. Experimental analysis shows that the proposed approach improves accuracy compared to existing approaches.
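The abstract names accuracy, F-measure, and G-mean as evaluation parameters but does not give their formulas. The sketch below assumes the common binary (one-vs-rest) definitions: F-measure as the harmonic mean of precision and recall, and G-mean as the geometric mean of recall and specificity. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, TN, FN, treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive):
    """Compute accuracy, F-measure, and G-mean (assumed standard definitions)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)          # also called sensitivity
    specificity = safe_div(tn, tn + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "f_measure": safe_div(2 * precision * recall, precision + recall),
        "g_mean": math.sqrt(recall * specificity),
    }


# Toy example with two of the classes mentioned in the abstract.
y_true = ["sports"] * 4 + ["business"] * 4
y_pred = ["sports", "sports", "sports", "business",
          "business", "business", "business", "sports"]
scores = evaluate(y_true, y_pred, positive="sports")
print(scores)
```

For a multi-class dataset such as 20-Newsgroup, one option is to compute these scores per class in this one-vs-rest fashion and then average them; whether the paper does so is not stated in the abstract.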
How to Cite
Kaur, B., & Bathla, G. (2018). An efficient technique for hybrid classification and feature extraction using normalization. International Journal of Engineering & Technology, 7(2.27), 156-160. https://doi.org/10.14419/ijet.v7i2.27.14534
Received date: 2018-06-22
Accepted date: 2018-07-29
Published date: 2018-08-06