Big Data Analysis of Web Data Extraction

  • Authors

    • Nadia Ibrahim
    • Alaa Hassan
    • Marwah Nihad
  • Keywords: Web data extraction, classification, data mining algorithms, WEKA.
  • Abstract

    This study examines extraction techniques for large data, which include detecting patterns and hidden relationships among numerous factors and retrieving the required information. Rapid analysis of massive data can lead to innovation and to concepts of theoretical value. Compared with mining traditional data sets, mining vast amounts of heterogeneous, interdependent data can expand knowledge and ideas about the target domain. In this research we studied data mining on the Internet. The various networks from which data are extracted at different locations can be complex, so web technology has been used for information extraction and data analysis (Marwah et al., 2016). We extracted information from a large number of web pages, examined the pages of each site using Java code, and stored the extracted information in a dedicated database for the web pages. We used the data network function to evaluate and categorize the pages found, identifying trusted and risky web pages, and exported the data to a CSV file. We then examined and categorized these data using WEKA to obtain accurate results. From the results we conclude that the applied data mining algorithms perform better than other techniques in classifying and extracting data and deliver high performance.
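    The extraction and export steps described above can be illustrated with a minimal sketch. The abstract does not give the authors' code or the attributes they recorded, so everything here is an assumption made for illustration: the class name PageFeatureCsv, the regular expression used to find links, the three features (link count, absolute link count, HTTPS use), and the column names. Only the trusted/risky label and the CSV target come from the abstract.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: turns one web page into one CSV row of simple features.
public class PageFeatureCsv {

    // Captures the href value of anchor tags. A regex is a simplification
    // chosen to keep the sketch dependency-free; it is not a full HTML parser.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']*)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns every href value found in the given HTML, in document order.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Wraps a value in double quotes and doubles any embedded quotes (CSV escaping).
    public static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    // Builds one CSV row: url, link_count, absolute_links, uses_https, label.
    // The label ("trusted" or "risky") is supplied by the caller in this sketch.
    public static String toCsvRow(String url, String html, String label) {
        List<String> links = extractLinks(html);
        long absolute = links.stream()
                .filter(l -> l.startsWith("http://") || l.startsWith("https://"))
                .count();
        boolean usesHttps = url.startsWith("https://");
        return String.join(",",
                quote(url),
                String.valueOf(links.size()),
                String.valueOf(absolute),
                String.valueOf(usesHttps),
                label);
    }

    public static void main(String[] args) {
        // Inline sample page so the example runs without network access.
        String html = "<html><body><a href=\"/about\">About</a> "
                + "<a href='https://example.org/x'>X</a></body></html>";
        System.out.println("url,link_count,absolute_links,uses_https,label");
        System.out.println(toCsvRow("https://example.com/", html, "trusted"));
    }
}
```

    A file built from such rows, with the header line first, is the kind of CSV input that WEKA can load for classification. Which attributes and classifiers the study actually used is not stated in the abstract.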


  • References

      [1] Abdullah, Marwah N., Alaa Hassan, and Nadia Naef, 2016, Knowledge-Based Analysis of Web Data Extraction, Proceedings of the Fifth International Conference on Informatics and Applications, Takamatsu, Japan, ISBN: 978-1-941968-41-3 SDIWC 26.

      [2] Bharati M., 2010, Data Mining Techniques and Applications, Indian Journal of Computer Science and Engineering Vol. 1 No. 4 301-305.

      [3] Bhu L., Arundathi, and Jagadeesh, 2014, Data Mining: A prediction for Student's Performance Using Decision Tree ID3 Method, International Journal of Scientific & Engineering Research, Volume 5, Issue 7, July-2014 1329 ISSN 2229-5518.

      [4] Boyd D., and Crawford K., 2011, Six provocations for big data. In A decade in internet time: Symposium on the dynamics of the internet and society, Vol. 21, Oxford Internet Institute.

      [5] Chandaka B., Mandapati V. and Vedula V., 2018, Efficient Association Rule Mining for Retrieving Frequent Itemsets in Big Data Sets, CJAST.39546, pp. 1-14.

      [6] Chitra and Maheswari, 2017, A Comparative Study of Various Clustering Algorithms in Data Mining, International Journal of Computer Science and Mobile Computing, Vol. 6, Issue 8.

      [7] Galathiya, Ganatra, and Bhensdadia, 2012, Classification with an improved Decision Tree Algorithm, International Journal of Computer Applications (0975 – 8887) Volume 46– No.23.

      [8] Gunasundari and Karthikeyan, 2012, A Study of Content Extraction from Web Pages Based on Links, International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.2, No.3.

      [9] Himani Sharma and Sunil Kumar, 2015, A Survey on Decision Tree Algorithms of Classification in Data Mining, International Journal of Science and Research (IJSR) ISSN (Online): 2319-7064.

      [10] Jharna M. , Sneha N. and Shilpa A., 2017, Analysis of agriculture data using data mining techniques: application of big data, J Big Data DOI 10.1186/s40537-017-0077-4.

      [11] Lourdu C., Jayanthy, and Sakthivel, 2016, Implementation of Different Techniques of Web Data Mining through Cloud Computing Technologies, Volume 6, Issue 6, June 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering.

      [12] Manisha R., Mohod, and Thakare, 2015, Various Data-Mining Techniques for Big Data, International Journal of Computer Applications.

      [13] Neha G. and Saba H., 2011, A Heuristic Approach for Web Content Extraction, International Journal of Computer Applications (0975 – 8887) Volume 15– No.5.

      [14] Nelofar R., 2017, Data Mining Techniques Methods Algorithms and Tools, Vol.6 Issue.7, International Journal of Computer Science and Mobile Computing.

      [15] Pranit B. and Sheetal D., 2018, Web Data Mining Techniques and Implementation for Handling Big Data, IJCSMC, Vol. 4, Issue 4, April 2015, pp. 330-334.

      [16] Rajkumar D. and Usha S., 2016, A Survey on Big Data Mining Platforms, Algorithms and Handling Techniques, International for research in Emerging Science And Technology, Volume 3, Special Issue 1, NCRTCT'16.
  • How to Cite

    Ibrahim, N., Hassan, A., & Nihad, M. (2018). Big Data Analysis of Web Data Extraction. International Journal of Engineering & Technology, 7(4.37), 168-172.

    Received date: 2018-12-16

    Accepted date: 2018-12-16

    Published date: 2018-12-13