Dynamic Selection of Optimal Cloud Service Provider for Big Data Applications

    Big data analytics and cloud computing are two of the most important innovations in the current IT industry. Together, these technologies deliver effective outcomes to a wide range of business organizations. However, big data analytics requires a huge amount of resources for storage and computation. Storage cost rises sharply with the volume of input data, and innovative algorithms are needed to reduce the cost of storing data in specific cloud data centers. In today's IT industry, cloud computing has emerged as a popular paradigm for hosting customer data, enterprise data, and many other distributed applications. Cloud Service Providers (CSPs) store huge amounts of data and numerous distributed applications at different costs. For example, Amazon prices its storage services per TB per month, and each CSP offers different Service Level Agreements (SLAs) with different storage offers. Customers want reliable SLAs, which raises the cost because more replicas are required. CSPs attract users with offers for initial storage and put operations, but get operations from the cloud then become a hurdle and subsequently increase the cost. CSPs provide these services by maintaining multiple datacenters at multiple locations throughout the world. These datacenters have distinct get/put latencies and unit costs for resource reservation and utilization. Choosing among the datacenters of different CSPs therefore becomes tricky for cloud users who run globally distributed applications, e.g., online social networks. The choice poses two main challenges. First, data must be allocated to different datacenters so that the Service Level Objectives (SLOs), including latency, are satisfied. Second, remote resources, e.g., memory, must be reserved at low cost. In this paper, we derive a new model, based on integer programming, that minimizes cost while satisfying the SLOs.
Additionally, we propose an algorithm that stores data in the lowest-cost data center among the available data centers and computes the cost associated with put/get latencies. Our simulation results show that the cost of resource reservation and utilization across different datacenters is minimized.
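
As a rough illustration of the selection problem described above, the following Python sketch picks the cheapest datacenter whose get/put latencies meet a latency SLO. It is a simplified exhaustive search and not the integer programming model or the algorithm of this paper. The `Datacenter` fields, the linear cost formula, the function names, and all numeric values are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Datacenter:
    """Hypothetical datacenter description; all fields are illustrative."""
    name: str
    storage_cost: float    # assumed cost per GB stored
    get_cost: float        # assumed cost per get operation
    put_cost: float        # assumed cost per put operation
    get_latency_ms: float  # assumed observed get latency
    put_latency_ms: float  # assumed observed put latency


def total_cost(dc: Datacenter, size_gb: float, gets: int, puts: int) -> float:
    """Assumed linear cost model: storage plus per-operation charges."""
    return dc.storage_cost * size_gb + dc.get_cost * gets + dc.put_cost * puts


def cheapest_within_slo(
    datacenters: Sequence[Datacenter],
    size_gb: float,
    gets: int,
    puts: int,
    get_slo_ms: float,
    put_slo_ms: float,
) -> Optional[Datacenter]:
    """Return the lowest-cost datacenter that meets both latency SLOs,
    or None when no datacenter is feasible."""
    feasible = [
        dc for dc in datacenters
        if dc.get_latency_ms <= get_slo_ms and dc.put_latency_ms <= put_slo_ms
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda dc: total_cost(dc, size_gb, gets, puts))


# Made-up candidates: "B" is cheapest overall but too slow for a 100 ms SLO.
CANDIDATES = [
    Datacenter("A", storage_cost=2.0, get_cost=0.5, put_cost=0.5,
               get_latency_ms=50, put_latency_ms=60),
    Datacenter("B", storage_cost=1.0, get_cost=0.1, put_cost=0.1,
               get_latency_ms=200, put_latency_ms=220),
    Datacenter("C", storage_cost=3.0, get_cost=0.1, put_cost=0.1,
               get_latency_ms=80, put_latency_ms=90),
]

choice = cheapest_within_slo(CANDIDATES, size_gb=10, gets=100, puts=10,
                             get_slo_ms=100, put_slo_ms=100)
```

With these made-up numbers, the SLO filter excludes the cheapest candidate, so the choice falls to the cheapest feasible one. A full formulation would also have to cover replication across several datacenters and resource reservation, which this sketch leaves out.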



    Storage issues, CSPs, Optimal Selection, Service Level Objectives.

