Implementation of cost effective hierarchical Hadoop cluster–a case study for education

  • Abstract

    The aim is to equip the younger generation of the province with computing skills and to give them access to a wide variety of modern educational resources, such as multimedia educational content, in government schools and colleges, so that a strong foundation is set at an early stage of their education. Educational media help empower educational institutions by changing the way Information and Communication Technology (ICT) is used. Putting this into practice raises several challenges. It requires a scalable technological architecture and algorithms to form a cluster that shares resources such as CPU and memory effectively among multiple nodes, together with tools to monitor, assess and evaluate data under a hierarchical Hadoop cluster. Periodic reports generated by analyzing text, audio and video information will assist students, teachers and the government. This calls for a software framework that can process big data stored in hierarchical nodes. However, the architecture of Hadoop, an open-source software framework, does not support processing data stored in hierarchical nodes. This case study therefore proposes a hierarchical Hadoop cluster to change the way ICT is used. The proposed work helps in monitoring and reporting the usage of ICT and also acts as a help desk to address the issues of the educational institutions. It establishes a novel communication medium by generating reports based on the analysis of text, audio and video information.



  • Keywords

    Hadoop, cluster, map reduce processing, Ed-Media.

Article ID: 12174
DOI: 10.14419/ijet.v7i2.21.12174
