Agglomerative Clustering techniques work by recursively merging graph vertices into communities, to maximize a clustering quality metric. The metric of Modularity coined by Newman and Girvan, measures the cluster quality based on the premise that, a cluster has collections of vertices more strongly connected internally than would occur from random chance. Various fast and efficient algorithms for community detection based on modularity maximization have been developed for static graphs. However, since many (contemporary) networks are not static but rather evolve over time, the static approaches are rendered inappropriate for clustering of dynamic graphs. Modularity optimization in changing graphs is a relatively new field that entails the need to develop efficient algorithms for detection and maintenance of a community structure while minimizing the “Size of change” and computational effort. The objective of this work was to develop an efficient dynamic agglomerative clustering algorithm that attempts to maximize modularity while minimizing the “size of change” in the transitioning community structure. First we briefly discuss the previous memoryless dynamic reagglomeration approach with localized vertex freeing and illustrate its performance and limitations. Then
we describe the new backtracking algorithm followed by its performance results and observations. In experimental analysis of both typical and pathological cases, we evaluate and justify various backtracking and agglomeration strategies in context of
the graph structure and incoming stream topologies. Evaluation of the algorithm on social network datasets, including Facebook (SNAP) and PGP Giant Component networks shows significantly improved performance over its conventional static counterpart in terms of execution time, Modularity and Size of Change.
Identifer | oai:union.ndltd.org:GATECH/oai:smartech.gatech.edu:1853/55065 |
Date | 27 May 2016 |
Creators | Godbole, Pushkar J. |
Contributors | Bader, David |
Publisher | Georgia Institute of Technology |
Source Sets | Georgia Tech Electronic Thesis and Dissertation Archive |
Language | en_US |
Detected Language | English |
Type | Thesis |
Format | application/pdf |
Page generated in 0.002 seconds