An Efficient Data Replication Scheme for Hadoop Distributed File System
2018-05-31
Hadoop Data Locality, Data Replication, Data Placement
A distributed file system (DFS) is the storage component of a distributed system (DS). A DS consists of multiple autonomous nodes connected via a communication network to solve large problems and to achieve more computing power. One of the design requirements of any DS is to provide replicas. In this paper, we propose a new replication algorithm that is more reliable than the existing replication algorithm used in DFS. The advantages of our proposed replication algorithm by incrementing nodes sequentially (RAINS) are that it distributes the storage load equally among all the nodes sequentially and that it guarantees a replica copy even when two racks in a DS are down. This feature is not available in the existing DFS. We have compared the existing replication algorithm used by the Hadoop Distributed File System (HDFS) with our proposed RAINS algorithm. The experimental results indicate that our proposed RAINS algorithm performs better when a larger number of racks fail in the DS.
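The abstract does not give the details of RAINS, so the following is only a minimal sketch of the two properties it names, not the authors' algorithm. It assumes equal-sized racks, three replicas per block, and at least as many racks as replicas. The functions `sequential_placement`, `two_rack_placement`, and `blocks_lost` are hypothetical names introduced for illustration. `two_rack_placement` is a simplified stand-in for the default HDFS policy, which puts one replica on one rack and the other two on a second rack.

```python
from collections import Counter
from itertools import combinations


def sequential_placement(block_id, racks, replicas=3):
    """Assumed scheme: walk a rack-interleaved ring of nodes sequentially."""
    # Interleave nodes rack by rack, so ring neighbours sit on different racks.
    # Assumes every rack has the same number of nodes.
    ring = [node for group in zip(*racks.values()) for node in group]
    # Each block continues where the previous block's replicas ended.
    start = (block_id * replicas) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(replicas)]


def two_rack_placement(block_id, racks):
    """Simplified HDFS-style policy: one replica on one rack, two on another."""
    names = list(racks)
    first = names[block_id % len(names)]
    second = names[(block_id + 1) % len(names)]
    return [racks[first][0], racks[second][0], racks[second][1]]


def blocks_lost(place, racks, n_blocks, n_failed=2):
    """Count (failure scenario, block) pairs in which every replica is down."""
    rack_of = {node: r for r, nodes in racks.items() for node in nodes}
    lost = 0
    for failed in combinations(racks, n_failed):
        for block_id in range(n_blocks):
            nodes = place(block_id, racks)
            if all(rack_of[n] in failed for n in nodes):
                lost += 1
    return lost


# Hypothetical cluster: 4 racks with 2 nodes each.
racks = {f"r{r}": [f"r{r}n{n}" for n in range(2)] for r in range(4)}

# Replica count per node after placing 8 blocks sequentially.
load = Counter(n for b in range(8) for n in sequential_placement(b, racks))

print(sorted(load.values()))                         # [3, 3, 3, 3, 3, 3, 3, 3]
print(blocks_lost(sequential_placement, racks, 8))   # 0
print(blocks_lost(two_rack_placement, racks, 8))     # 8
```

In this toy cluster the sequential scheme gives every node the same number of replicas and puts each block on three distinct racks, so no block is lost when any two racks fail. The two-rack policy loses a block whenever the failed pair matches the pair of racks holding that block.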
Lakshmi Siva Rama Krishna, T., Priyanka, J., Nikhil Teja, N., Mahiya Sultana, S., & Jabber, B. (2018). An Efficient Data Replication Scheme for Hadoop Distributed File System. International Journal of Engineering & Technology, 7(2.32), 167-169.
Accepted date: 2018-07-10
Published date: 2018-05-31