A novel approach: big data analysis based on multi-view data visualization using clustering similarity measure

  • Authors

    • Srinivasa Rao Madala, Dr. M.G.R Educational Research Institute, University.
    • V. N. Rajavarman, Dr. M.G.R Educational Research Institute, University.
    • T. Venkata Satya Vivek, Dr. M.G.R Educational Research Institute, University.
    2018-11-15
    https://doi.org/10.14419/ijet.v7i4.19458
  • Data Visualization, Parallel Co-Ordinate, Multivariate Attributes, Clustering Methods, Similarity Measure, Multi Viewpoint.
  • In big data, data visualization is a notable concept for representing high-dimensional data so that it can be analyzed competently. Data visualization has three main properties: (i) characterizing data without loss of data patterns; (ii) changing attributes without altering the data patterns; and (iii) visualizing both structured and unstructured data attributes for data examination. Various types of data visualization exist to support data analysis (e.g., topic-based, attribute-based, audio-based, and text-based data visualization on different data sets). Parallel coordinates are a proficient and effective data visualization tool for analyzing and handling multi-attribute, high-dimensional data. The tool is based on 5Ws density visualization of sending and receiving data; it also reads data patterns and attributes while reducing the overlap among data patterns. The parallel measure is a labeling property that characterizes data by the relations among objects in a data set, evaluated over different pairs of attributes. To improve the parallel coordinate tool so that it sustains multi-attribute object relations, we propose and implement a novel method, Similarity Measure Centered with Multi Viewpoint (SMCMV), together with related clustering approaches to represent data. Using multiple viewpoints, we obtain an assessment-based similarity index for data visualization and present a theoretical analysis based on multi-attribute presentation. Our experimental results indicate that the approach gives the best data representation in data visualization, with a capable similarity measure, on real-time document evaluation with several known clustering approaches.
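The abstract does not define the SMCMV measure itself, so the following is only a minimal sketch of the general multi-viewpoint idea, under stated assumptions. It assumes one common formulation, in which the similarity of two objects is the average dot product of their difference vectors as seen from a set of reference points (viewpoints) rather than from the origin alone. The function name `multi_viewpoint_similarity` and the toy vectors are hypothetical and are not taken from the paper.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multi_viewpoint_similarity(
    a: Vector, b: Vector, viewpoints: Sequence[Vector]
) -> float:
    """Average of (a - h) . (b - h) over every viewpoint h.

    Assumption: this is one common multi-viewpoint formulation, not
    necessarily the SMCMV measure described in the abstract.
    """
    if not viewpoints:
        raise ValueError("at least one viewpoint is required")
    total = 0.0
    for h in viewpoints:
        # Look at both objects from the viewpoint h instead of the origin.
        diff_a = [x - y for x, y in zip(a, h)]
        diff_b = [x - y for x, y in zip(b, h)]
        total += dot(diff_a, diff_b)
    return total / len(viewpoints)


# Toy example with two hypothetical viewpoints.
example = multi_viewpoint_similarity(
    [2.0, 0.0], [1.0, 0.0], [[0.0, 0.0], [0.0, 1.0]]
)
print(example)
```

With the origin as the only viewpoint, this reduces to the ordinary dot product, which is why a multi-viewpoint measure can be read as a generalization of a single-viewpoint one.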

     


     
  • How to Cite

    Rao Madala, S., N. Rajavarman, V., & Venkata Satya Vivek, T. (2018). A novel approach: big data analysis based on multi-view data visualization using clustering similarity measure. International Journal of Engineering & Technology, 7(4), 4503-4508. https://doi.org/10.14419/ijet.v7i4.19458