Return to search

Efficient Community Detection for Large Scale Networks via Sub-sampling

Many real-world systems can be represented as network-graphs. Some of the networks have an inherent community structure based on interactions. The problem of identifying this grouping structure given a graph is termed as community detection problem which has certain existing algorithms. This thesis contributes by providing specific improvements to various community detection algorithms such as spectral clustering and extreme point algorithm. One of the main contributions is proposing a new sub-sampling method to make existing spectral clustering method scalable by reducing the computational complexity. Also, we have implemented extreme points algorithm for a general multiple communities detection case along with a sub-sampling based version to reduce the computational complexity. We have also developed spectral clustering algorithm for popularity-adjusted block model (PABM) model based graphs to make the algorithm exact thus improving its accuracy. / Master of Science / We live in an increasingly interconnected world, where agents constantly interact with each other. This general agent-interaction framework describes many important systems, such as social interpersonal systems, protein interaction systems, trade and financial systems, power grids, and the World Wide Web, to name a few. By denoting agents as nodes and their interconnections as links, any such system can be represented as a network. Such networks or graphs provide a powerful and universal representation for analyzing a wide variety of systems spanning a remarkable range of scientific disciplines. Networks act as conduits for many kinds of transmissions. For instance, they are influential in the dissemination of ideas, adoption of technologies, helping find jobs and spread of diseases. Thus networks play a critical role both in providing information and helping make decisions making them a crucial part of the Data and Decisions Destination Area. A well-known feature of many networks is community structure. Nodes in a network are often found to belong to groups or communities that exhibit similar behavior. The identification of this community structure, called community detection, is an important problem with many critical applications. For example, communities in a protein interaction network often correspond to functional groups. This thesis focuses on cutting-edge methods for community detection in networks. The main approach is efficient community detection via sub-sampling. This is applied to two different approaches. The first approach is optimization of a modularity function using a low-rank approximation for multiple communities. The second approach is a spectral clustering where we aim to formulate an algorithm for community detection by exploiting the eigenvectors of the network adjacency matrix.

Identiferoai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/81862
Date18 January 2018
CreatorsBellam, Venkata Pavan Kumar
ContributorsElectrical and Computer Engineering, Sengupta, Srijan, Huang, Jia-Bin, Abbott, A. Lynn
PublisherVirginia Tech
Source SetsVirginia Tech Theses and Dissertation
Detected LanguageEnglish
TypeThesis
FormatETD, application/pdf
RightsIn Copyright, http://rightsstatements.org/vocab/InC/1.0/

Page generated in 0.0021 seconds