Community detection (or clustering) in large-scale graph is an important problem in graph mining. Communities reveal interesting characteristics of a network. Louvain is an efficient sequential algorithm but fails to scale emerging large-scale data. Developing distributed-memory parallel algorithms is challenging because of inter-process communication and load-balancing issues. In this work, we design a shared memory-based algorithm using OpenMP, which shows a 4-fold speedup but is limited to available physical cores. Our second algorithm is an MPI-based parallel algorithm that scales to a moderate number of processors. We also implement a hybrid algorithm combining both. Finally, we incorporate dynamic load-balancing in our final algorithm DPLAL (Distributed Parallel Louvain Algorithm with Load-balancing). DPLAL overcomes the performance bottleneck of the previous algorithms, shows around 12-fold speedup scaling to a larger number of processors. Overall, we present the challenges, our solutions, and the empirical performance of our algorithms for several large real-world networks.
Identifer | oai:union.ndltd.org:uno.edu/oai:scholarworks.uno.edu:td-3755 |
Date | 23 May 2019 |
Creators | Sattar, Naw Safrin |
Publisher | ScholarWorks@UNO |
Source Sets | University of New Orleans |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | University of New Orleans Theses and Dissertations |
Page generated in 0.0022 seconds