Global ETD Search

1	Fast and scalable triangle counting in graph streams: the hybrid approach Singh, Paramvir 14 December 2020 (has links) Triangle counting is a major graph problem with several applications in social network analysis, anomaly detection, etc. A considerable amount of work has contributed to approximately computing the global triangle counts using several computational models. One of the most popular streaming models considered is Edge Streaming in which the edges arrive in the form of a graph stream. We categorize the existing literature into two categories: Fixed Memory (FM) approach, and Fixed Probability (FP) approach. As the size of the graphs grows, several challenges arise such as memory space limitations, and prohibitively long running time. Therefore, both FM and FP categories exhibit some limitations. FP algorithms fail to scale for massive graphs. We identified a limitation of FM category $i.e.$ FM algorithms have higher computational time than their FP variants. In this work, we present a new category called the Hybrid approach that overcomes the limitations of both FM and FP approaches. We present two new algorithms that belong to the hybrid category: Neighbourhood Hybrid Multisampling (NHMS) and Triest/ThinkD Hybrid Sampling (THS) for estimating the number of global triangles in graphs. These algorithms are highly scalable and have better running time than FM and FP variants. We experimentally show that both NHMS and THS outperform state-of-the-art algorithms in space-efficient environments. / Graduate Graph mining Triangle counting approximation algorithms Edge Streaming
2	Parallel Mining and Analysis of Triangles and Communities in Big Networks Arifuzzaman, S M. 19 August 2016 (has links) A network (graph) is a powerful abstraction for interactions among entities in a system. Examples include various social, biological, collaboration, citation, and co-purchase networks. Real-world networks are often characterized by an abundance of triangles and the existence of well-structured communities. Thus, counting triangles and detecting communities in networks have become important algorithmic problems in network mining and analysis. In the era of big data, the network data emerged from numerous scientific disciplines are very large. Online social networks such as Twitter and Facebook have millions to billions of users. Such massive networks often do not fit in the main memory of a single machine, and the existing sequential methods might take a prohibitively large runtime. This motivates the need for scalable parallel algorithms for mining and analysis. We design MPI-based distributed-memory parallel algorithms for counting triangles and detecting communities in big networks and present related analysis. The dissertation consists of four parts. In Part I, we devise parallel algorithms for counting and enumerating triangles. The first algorithm employs an overlapping partitioning scheme and novel load-balancing schemes leading to a fast algorithm. We also design a space-efficient algorithm using non-overlapping partitioning and an efficient communication scheme. This space efficiency allows the algorithm to work on even larger networks. We then present our third parallel algorithm based on dynamic load balancing. All these algorithms work on big networks, scale to a large number of processors, and demonstrate very good speedups. An important property, very related to triangles, of many real-world networks is high transitivity, which states that two nodes having common neighbors tend to become neighbors themselves. In Part II, we characterize networks by quantifying the number of common neighbors and demonstrate its relationship to community structure of networks. In Part III, we design parallel algorithms for detecting communities in big networks. We propose efficient load balancing and communication approaches, which lead to fast and scalable algorithms. Finally, in Part IV, we present scalable parallel algorithms for a useful graph preprocessing problem-- converting edge list to adjacency list. We present non-trivial parallelization with efficient HPC-based techniques leading to fast and space-efficient algorithms. / Ph. D. Network Mining Parallel Algorithm Triangle Counting Community Detection Big Data

Search results

Fast and scalable triangle counting in graph streams: the hybrid approach

Parallel Mining and Analysis of Triangles and Communities in Big Networks