An Approach To Twitter Sentiment Analysis Over Hadoop

  • Abstract

    Sentiment analysis is the process of identifying people’s attitudes and emotional states from the language they use on social websites and other sources. The main aim is to identify a set of potential features in a review and to extract the opinion expressions about those features by making full use of the associations between them. Posting on Twitter has become routine for people around the world, who share thousands of reactions and opinions on every topic, every second of every day. The result is effectively one large, constantly updated psychological database that can be used to analyze people’s sentiments. Hadoop is one of the best options available for sentiment analysis of Twitter data, and it also handles distributed big data, streaming data, text data, etc. This paper provides an efficient mechanism for performing sentiment analysis (opinion mining) on Twitter data over the Hortonworks Data Platform, which provides Hadoop on Windows, with the assistance of Apache Flume, Apache HDFS and Apache Hive.
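
    The abstract does not say how individual tweets are scored, so the following is only a minimal sketch under stated assumptions: tweets arrive as one JSON object per line (roughly what an ingestion tool such as Flume might land in HDFS), and polarity comes from a simple word lexicon, similar in spirit to what a Hive query joined against a dictionary table could compute. The lexicon contents, the field names `id` and `text`, and both function names are hypothetical and not taken from the paper.

```python
import json

# Hypothetical toy lexicon (assumption); a real pipeline would load a full
# sentiment dictionary, e.g. from a table stored alongside the tweets.
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "hate": -1, "terrible": -1}


def tweet_polarity(text, lexicon=LEXICON):
    """Sum per-word lexicon scores and map the total to a polarity label."""
    # Lowercase each token and strip common punctuation and Twitter markers.
    words = (w.strip(".,!?#@\"'").lower() for w in text.split())
    score = sum(lexicon.get(w, 0) for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_json_lines(lines):
    """Classify tweets given as JSON lines with assumed 'id' and 'text' fields."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        results.append((tweet["id"], tweet_polarity(tweet["text"])))
    return results
```

    Calling `classify_json_lines` on a list of JSON strings returns `(id, label)` pairs. On a cluster, the same per-word lookup would more plausibly be expressed as a distributed query than as a single-process loop.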


  • Keywords

    Apache Flume; Apache Hadoop; Apache Hive; Sentiment Analysis; Twitter


Article ID: 20110
DOI: 10.14419/ijet.v7i4.5.20110

Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.