Designing and analyzing a highly scalable and reliable full-fledged parallel algorithm for computing strongly connected components to analyze social graphs

  • Abstract


    With the recent advent of Big Data, developing efficient distributed algorithms for computing the Strongly Connected Components of a large graph has received increasing interest. Strongly connected components have many applications in social, information and communication networks, such as detecting the communities of people that have formed on those networks, determining which community a person belongs to, and finding cyclic dependencies in the graph.

    Apache Giraph is an open-source implementation of Google’s Pregel. It is an iterative, real-time graph processing engine designed to be scalable, fault tolerant and highly efficient. The framework provides a suitable platform for developing parallel algorithms in a distributed environment, and it adopts a vertex-centric programming model inspired by the Bulk Synchronous Parallel (BSP) model. A strongly connected component is a maximal subgraph in which every vertex is reachable from every other vertex. Maximal means that the subgraph is as large as possible: no vertex elsewhere in the graph could be added to it while keeping all of its vertices mutually reachable. In a directed graph G, a pair of vertices u and v are said to be strongly connected if there is a path in each direction between them. Here, we have implemented a parallel algorithm for computing strongly connected components based on the new paradigm of graph decomposition. The final outcome mainly focuses on reducing the total communication cost.
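    The decomposition paradigm can be illustrated with a minimal sequential sketch (plain Python, not the distributed Giraph implementation the paper describes): pick a pivot vertex, intersect its forward- and backward-reachable sets to obtain one SCC, and recurse on the three remaining partitions. Function names here (`reachable`, `sccs`) are illustrative, not from the paper.

    ```python
    from collections import defaultdict

    def reachable(adj, start):
        """Return the set of vertices reachable from `start` via BFS."""
        seen = {start}
        frontier = [start]
        while frontier:
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if v not in seen:
                        seen.add(v)
                        nxt.append(v)
            frontier = nxt
        return seen

    def sccs(edges):
        """Forward-backward SCC decomposition: intersecting a pivot's
        forward and backward reachable sets yields one SCC; the three
        leftover partitions contain no SCC that crosses them, so each
        can be processed independently (the source of parallelism)."""
        adj, radj = defaultdict(list), defaultdict(list)
        vertices = set()
        for u, v in edges:
            adj[u].append(v)
            radj[v].append(u)
            vertices.update((u, v))
        components = []
        remaining = [vertices]
        while remaining:
            part = remaining.pop()
            if not part:
                continue
            pivot = next(iter(part))
            # Restrict both adjacency maps to the current partition.
            fwd = reachable({u: [v for v in adj[u] if v in part] for u in part}, pivot)
            bwd = reachable({u: [v for v in radj[u] if v in part] for u in part}, pivot)
            scc = fwd & bwd
            components.append(scc)
            remaining.extend([fwd - scc, bwd - scc, part - fwd - bwd])
        return components
    ```

    In the vertex-centric setting, each BFS becomes a sequence of supersteps in which frontier vertices message their neighbors, and the three leftover partitions are explored concurrently rather than via a work list.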



  • Keywords


    Apache Giraph; Graph; Strongly Connected; Cost; Parallel; Graph Processing.


 


Article ID: 11354
 
DOI: 10.14419/ijet.v7i2.12.11354




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.