A Study on Big Data Hadoop Map Reduce Job Scheduling

  • Authors

    • N Deshai
    • S Venkataramana
    • I Hemalatha
    • G. P. S. Varma
    2018-08-24
    https://doi.org/10.14419/ijet.v7i3.31.18202
  • Big data, Hadoop, HDFS, Map Reduce, Scheduling
  • Abstract

    Data volumes have moved from the terabyte to the zettabyte scale as data sets are continuously collected from social networks, machine-to-machine devices, search services such as Google and Yahoo, sensors, and other sources; this is known as big data. Data storage capacity, processing power, data availability, and the overall size of the digital world, now measured in zettabytes, keep growing day by day. Apache Hadoop is a widely used platform for handling such data sets: its best-known components, HDFS and MapReduce, provide efficient storage and efficient processing of massive data volumes. Designing an effective scheduling algorithm, and in particular selecting suitable nodes, is a key factor in optimizing performance on big data workloads. This paper surveys Hadoop MapReduce job scheduling algorithms, giving an overview of each along with its advantages and disadvantages.

  • How to Cite

    Deshai, N., Venkataramana, S., Hemalatha, I., & Varma, G. P. S. (2018). A Study on Big Data Hadoop Map Reduce Job Scheduling. International Journal of Engineering & Technology, 7(3.31), 59-65. https://doi.org/10.14419/ijet.v7i3.31.18202

    Received date: 2018-08-25

    Accepted date: 2018-08-25

    Published date: 2018-08-24