Studying Cloud as IaaS for Big Data Analytics : Opportunity, Challenges

  • Authors

    • Amitkumar Manekar
    • Dr Pradeepini Gera
    2018-03-18
    https://doi.org/10.14419/ijet.v7i2.7.11094
  • Keywords: Big Data, Cloud, Big Data Analytics, Internet, IaaS, OLM, RFHC, Data Migration.
  • Abstract

    James Watt's steam engine drove one of the greatest revolutions in human history, in the 18th century. In 1776, the first steam engines were installed and working in commercial enterprises. That revolution made the world smaller for human beings, and today the world is seamlessly connected. "Big Data Analytics" and "Cloud" name a second such revolution, in the 21st century. We are living in an era of information explosion. Both terms are relatively new, and over the last decade they have steered market trends toward a new era of computation. Because both technologies are still in their infancy, many people are confused about their relevance and applicability. Cloud computing is an infrastructure-based solution for managing data and computational frameworks. 2016 was a significant year for the Big Data ecosystem, as a large number of enterprises and organizations were generating data, storing it, and worrying about its future. In 2017, the corporate world took cognizance of its large volumes of structured and unstructured data, which these enterprises and organizations generate continuously. The term big data does not just refer to the massive amounts of data existing today; it also refers to the whole ecosystem of storing or gathering data, the different types of data, and the analysis of that data. In a traditional data ecosystem, everything rests on legacy systems. Migrating these traditional ecosystems to the cloud brings both great challenges and great benefits. Cloud computing is an agile and scalable paradigm for resource access and computation that provides a heterogeneous platform seamlessly over Internet infrastructure, in particular for capturing big data and for its pre- and post-processing.

    The challenge now is to identify the opportunities and difficulties in managing, migrating, and abstracting cloud-based big data using cloud infrastructure for the future ecosystem of big data analysis. This paper focuses on that issue. We reevaluate existing cloud infrastructure, offered as IaaS, for tomorrow's big data analytics.

      

  • How to Cite

    Manekar, A., & Pradeepini Gera, D. (2018). Studying Cloud as IaaS for Big Data Analytics : Opportunity, Challenges. International Journal of Engineering & Technology, 7(2.7), 909-912. https://doi.org/10.14419/ijet.v7i2.7.11094

    Received date: 2018-04-05

    Accepted date: 2018-04-05

    Published date: 2018-03-18