31 |
The crossing number of a graph in the plane / Winterbach, Wynand 03 1900 (has links)
Thesis (MSc (Mathematical Sciences. Applied Mathematics))--University of Stellenbosch, 2005. / This thesis presents heuristics for obtaining upper bounds on the crossing numbers of small graphs in the plane, and a heuristic for obtaining lower bounds on the crossing numbers of large graphs in the plane. It is shown that the two-page book layout framework is effective for deriving general upper bounds, and that it may also be used to obtain exact results for the crossing numbers of graphs. The upper bound algorithm is based on the well-known optimization heuristics of tabu search, genetic algorithms and neural networks for obtaining two-page book layouts with few resultant edge crossings. The lower bound algorithm is based on the notion of embedding a graph into another graph and, to the best knowledge of the author, is the first known lower bound algorithm for the crossing number of a graph. It utilizes Dijkstra's shortest paths algorithm to embed one graph into another in such a fashion as to minimize edge and vertex congestion values. The upper bound algorithms developed in this thesis were applied to all non-planar complete multipartite graphs of orders 6-13. A catalogue of drawings of these graphs with low numbers of crossings is provided in the thesis. Lower bounds on the crossing numbers of these graphs were also computed, using lower bounds that are known for a number of complete multipartite graphs, as well as the fact that lower bounds on the crossing numbers of the subgraphs of a graph G are lower bounds on the crossing number of G. A reference implementation of the Garey-Johnson algorithm is supplied, and finally, it is shown that Szekely's algorithm for computing the independent-odd crossing number may be converted into a heuristic algorithm for deriving upper bounds on the plane crossing number of a graph. This thesis also provides a thorough survey of results known for the crossing number of a graph in the plane.
The survey focuses especially on algorithmic questions raised by researchers in the field of crossing number research.
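To illustrate the two-page framework underlying the upper bound heuristics: fix a vertex order along the spine and assign each edge to one of two pages; two edges on the same page cross exactly when their endpoints interleave along the spine. A minimal Python sketch (illustrative only, not code from the thesis):

```python
from itertools import combinations

def crossings(order, edges, pages):
    """Count edge crossings in a two-page book drawing.

    order: list of vertices along the spine
    edges: list of (u, v) pairs
    pages: pages[i] in {0, 1} is the page assigned to edges[i]
    """
    pos = {v: i for i, v in enumerate(order)}
    total = 0
    for (i, e), (j, f) in combinations(enumerate(edges), 2):
        if pages[i] != pages[j]:
            continue  # edges on different pages never cross
        a, b = sorted((pos[e[0]], pos[e[1]]))
        c, d = sorted((pos[f[0]], pos[f[1]]))
        # same-page edges cross iff their endpoints interleave on the spine
        if a < c < b < d or c < a < d < b:
            total += 1
    return total

# K4 with all edges on one page and spine order 0,1,2,3:
# edges (0,2) and (1,3) interleave, giving one crossing
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(crossings([0, 1, 2, 3], edges, [0] * 6))  # 1
```

Moving edge (1,3) to the second page removes the crossing, matching the fact that K4 is planar; the heuristics in the thesis search over such orders and page assignments.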
|
32 |
Centrality measures and analyzing dot-product graphs / Erdos, Dora 22 July 2016 (has links)
In this thesis we investigate two topics in data mining on graphs: in the first part we investigate the notion of centrality in graphs; in the second part we look at reconstructing graphs from aggregate information.
In many graph-related problems the goal is to rank nodes based on an importance score, generally referred to as node centrality. In Part I we start by giving a novel and more efficient algorithm for computing betweenness centrality. In many applications, not an individual node but a set of nodes is chosen to perform some task. We generalize the notion of centrality to groups of nodes.
While group centrality was first formally defined by Everett and Borgatti (1999), we are the first to pose it as a combinatorial optimization problem: find a group of k nodes with the largest centrality. We give an algorithm for solving this optimization problem for a general notion of centrality that subsumes various instantiations of centrality based on paths in the graph. We prove that this problem is NP-hard for specific centrality definitions, and we provide a universal algorithm that can be modified to optimize the specific measures. We also investigate the problem of increasing node centrality by adding or deleting edges in the graph. We conclude this part by solving the optimization problem for two specific applications: one minimizing redundancy in information propagation networks, and one optimizing the expected number of interceptions of a group in a random navigational network.
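The optimization problem has a simple greedy flavor that a sketch can convey. The following fragment is not the thesis algorithm: it greedily grows a group under group closeness (one of many path-based group centralities), with hypothetical function names:

```python
from collections import deque

def multi_source_dist(adj, sources):
    """BFS distances to the nearest node of `sources`."""
    dist = {s: 0 for s in sources}
    q = deque(sources)
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def group_closeness(adj, group):
    """Sum of 1/d(v, group) over nodes outside the group."""
    d = multi_source_dist(adj, group)
    return sum(1.0 / d[v] for v in adj if d.get(v, 0) > 0)

def greedy_group(adj, k):
    """Greedily add the node with the largest marginal gain, k times."""
    group = []
    for _ in range(k):
        best = max((v for v in adj if v not in group),
                   key=lambda v: group_closeness(adj, group + [v]))
        group.append(best)
    return group

# Star graph: the hub is the single most central node
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(greedy_group(star, 1))  # [0]
```

Greedy selection like this is the standard heuristic for NP-hard group-centrality maximization; the thesis's universal algorithm handles a broader family of measures.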
In the second part of the thesis we investigate what we can infer about a bipartite graph if only some aggregate information -- the number of common neighbors among each pair of nodes -- is given. First, we observe that the given data is equivalent to the dot products of the nodes' adjacency vectors. Based on this observation we develop an algorithm, built on the singular value decomposition (SVD), that is capable of almost perfectly reconstructing graphs from such neighborhood data. We investigate two versions of this problem, in which the dot products of nodes with themselves, i.e. the node degrees, are either known or hidden.
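The key observation -- common-neighbor counts are exactly dot products of adjacency rows -- and the fact that such a matrix always admits a low-rank factorization reproducing the data can be sketched with NumPy. This is only an illustration of the setup: a factor of the Gram matrix is determined only up to an orthogonal transform, so recovering the actual 0/1 graph (what the thesis algorithm does) takes more work than this fragment attempts:

```python
import numpy as np

# Bipartite adjacency: 4 left nodes x 3 right nodes
A = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1],
              [1, 1, 1]], dtype=float)

# The aggregate input: common-neighbor counts = dot products of adjacency rows
M = A @ A.T
assert M[0, 1] == np.dot(A[0], A[1])  # nodes 0 and 1 share one right neighbor

# Factor the positive semidefinite matrix M as F F^T via eigendecomposition;
# F matches A only up to an orthogonal transform on the right.
w, U = np.linalg.eigh(M)
w = np.clip(w, 0, None)          # clip tiny negative eigenvalues from roundoff
F = U @ np.diag(np.sqrt(w))
print(np.allclose(F @ F.T, M))   # True: F reproduces the dot-product data
```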
|
33 |
Preserving large cuts in fully dynamic graphs / Wasim, Omer 21 May 2020 (has links)
This thesis initiates the study of the MAX-CUT problem in fully dynamic graphs. Given a graph $G=(V,E)$, we present the first fully dynamic algorithms to maintain a $\frac{1}{2}$-approximate cut in sublinear update time under edge insertions and deletions to $G$. Our results include the following deterministic algorithms: i) an $O(\Delta)$ \textit{worst-case} update time algorithm, where $\Delta$ denotes the maximum degree of $G$, and ii) an $O(m^{1/2})$ amortized update time algorithm, where $m$ denotes the maximum number of edges in $G$ during any sequence of updates. We also give the following randomized algorithms when edge updates come from an oblivious adversary: i) a $\tilde{O}(n^{2/3})$ update time algorithm\footnote{Throughout this thesis, $\tilde{O}$ hides a $O(\text{polylog}(n))$ factor.} to maintain a $\frac{1}{2}$-approximate cut, and ii) a $\min\{\tilde{O}(n^{2/3}), \tilde{O}(\frac{n^{3/2+2c_0}}{m^{1/2}})\}$ worst-case update time algorithm which maintains a $(\frac{1}{2}-o(1))$-approximate cut for any constant $c_0>0$ with high probability. The latter algorithm is obtained by designing a fully dynamic algorithm to maintain a sparse subgraph with sublinear (in $n$) maximum degree which approximates all large cuts in $G$ with high probability. / Graduate
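The $\frac{1}{2}$-approximation guarantee rests on a local-optimality invariant: if every vertex has at least half of its incident edges crossing the cut, then summing cross-degrees shows the cut contains at least $m/2$ edges. A naive (static, not sublinear-time) Python sketch of that invariant, for illustration only:

```python
import random

def half_cut(adj):
    """Local search: flip any vertex with more same-side than cross
    neighbors. At a local optimum every vertex has cross-degree >= deg/2,
    so the cut contains at least half of all edges. Each flip strictly
    increases the cut size, so the loop terminates."""
    side = {v: random.randint(0, 1) for v in adj}
    changed = True
    while changed:
        changed = False
        for v in adj:
            same = sum(1 for u in adj[v] if side[u] == side[v])
            if same > len(adj[v]) - same:   # flipping v enlarges the cut
                side[v] ^= 1
                changed = True
    return side

def cut_size(adj, side):
    return sum(1 for v in adj for u in adj[v]
               if u < v and side[u] != side[v])

# Sanity check on a random graph: the cut keeps at least m/2 edges
random.seed(0)
n = 30
adj = {v: [] for v in range(n)}
edges = [(u, v) for u in range(n) for v in range(u + 1, n)
         if random.random() < 0.2]
for u, v in edges:
    adj[u].append(v); adj[v].append(u)
side = half_cut(adj)
print(cut_size(adj, side) >= len(edges) / 2)  # True
```

The dynamic algorithms in the thesis maintain (variants of) this invariant under updates without re-running the full local search.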
|
34 |
Approximate Partially Dynamic Directed Densest Subgraph / Richard Zou Li (15361858) 29 April 2023 (has links)
<p>The densest subgraph problem is an important problem with both theoretical and practical significance. We consider a variant of the problem, the directed densest subgraph problem, under the partially dynamic setting of edge insertions only. We give an algorithm maintaining a (1-ε)-approximate directed densest subgraph in O(log<sup>3</sup>n/ε<sup>6</sup>) amortized time per edge insertion, based on earlier work by Chekuri and Quanrud. This result partially improves on an earlier result by Sawlani and Wang, which guarantees O(log<sup>5</sup>n/ε<sup>7</sup>) worst-case time for edge insertions and deletions.</p>
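For intuition about the density objective, the undirected relative of this problem admits a simple 2-approximation by greedy peeling (Charikar's algorithm). This sketch illustrates that static, undirected baseline, not the partially dynamic directed algorithm of the thesis:

```python
import heapq

def densest_subgraph_peel(adj):
    """Charikar's greedy peeling: repeatedly delete a minimum-degree
    vertex and remember the best intermediate density |E|/|V|.
    This is a 2-approximation for the undirected densest subgraph."""
    deg = {v: len(adj[v]) for v in adj}
    m = sum(deg.values()) // 2
    alive = set(adj)
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    best = m / len(adj)
    while alive:
        d, v = heapq.heappop(heap)
        if v not in alive or d != deg[v]:
            continue                      # stale heap entry
        alive.remove(v)
        m -= deg[v]
        for u in adj[v]:
            if u in alive:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))
        if alive:
            best = max(best, m / len(alive))
    return best

# K4 (density 6/4 = 1.5) with four pendant leaves attached to node 0:
# the whole graph has density 10/8 = 1.25; peeling the leaves exposes K4.
adj = {0: [1, 2, 3, 4, 5, 6, 7], 1: [0, 2, 3], 2: [0, 1, 3],
       3: [0, 1, 2], 4: [0], 5: [0], 6: [0], 7: [0]}
print(densest_subgraph_peel(adj))  # 1.5
```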
|
35 |
Harnessing Simulated Data with Graphs / Maia, Henrique Teles January 2022 (has links)
Physically accurate simulations allow for unlimited exploration of arbitrarily crafted environments. From a scientific perspective, digital representations of the real world are useful because they make it easy to validate ideas. Virtual sandboxes allow observations to be collected at will, without intricate measurement setups or waiting on the manufacturing, shipping, and assembly of physical resources. Simulation techniques can also be used over and over again to test a problem without expending costly materials or producing any waste.
Remarkably, this freedom to both experiment and generate data becomes even more powerful when considering the rising adoption of data-driven techniques across engineering disciplines. These are systems that aggregate over available samples to model behavior, and thus are better informed when exposed to more data. Naturally, the ability to synthesize limitless data promises to make approaches that benefit from datasets all the more robust and desirable.
However, the ability to readily and endlessly produce synthetic examples also introduces several new challenges. Data must be collected in an adaptive format that can capture the complete diversity of states achievable in arbitrary simulated configurations while also remaining amenable to downstream applications. The quantity and variety of observations must also straddle a range that prevents overfitting yet is descriptive enough to produce a robust approach. Pipelines that naively measure virtual scenarios can easily be overwhelmed by trying to sample an infinite set of available configurations. Variations observed across multiple dimensions can quickly lead to a daunting expansion of states, all of which must be processed and solved. These and several other concerns must first be addressed in order to safely leverage the potential of boundless simulated data.
In response to these challenges, this thesis proposes to wield graphs in order to instill structure over digitally captured data, and curb the growth of variables. The paradigm of pairing data with graphs introduced in this dissertation serves to enforce consistency, localize operators, and crucially factor out any combinatorial explosion of states. Results demonstrate the effectiveness of this methodology in three distinct areas, each individually offering unique challenges and practical constraints, and together showcasing the generality of the approach. Namely, studies observing state-of-the-art contributions in design for additive manufacturing, side-channel security threats, and large-scale physics based contact simulations are collectively achieved by harnessing simulated datasets with graph algorithms.
|
36 |
NETWORK ALIGNMENT USING TOPOLOGICAL AND NODE EMBEDDING FEATURES / Aljohara Fahad Almulhim (19200211) 03 September 2024 (has links)
<p dir="ltr">In today's big data environment, development of robust knowledge discovery solutions depends on integration of data from various sources. For example, intelligence agencies fuse data from multiple sources to identify criminal activities; e-commerce platforms consolidate user activities on various platforms and devices to build better user profiles; scientists connect data from various modalities to develop new drugs and treatments. In all such activities, entities from different data sources need to be aligned---first, to ensure accurate analysis and, more importantly, to discover novel knowledge regarding these entities. If the data sources are networks, aligning entities from different sources leads to the task of network alignment, which is the focus of this thesis. The main objective of this task is to find an optimal one-to-one correspondence among nodes in two or more networks utilizing graph topology and node/edge attributes. </p><p dir="ltr">In existing works, diverse computational schemes have been adopted for solving the network alignment task; these schemes include finding eigen-decompositions of similarity matrices, solving quadratic assignment problems via sub-gradient optimization, and designing iterative greedy matching techniques. Contemporary works approach this problem using a deep learning framework by learning node representations to identify matches. Key challenges in node matching include computational complexity and scalability. Moreover, privacy concerns or unavailability often prevent the use of node attributes in real-world scenarios. In light of this, we aim to solve the problem by relying solely on the graph structure, without prior knowledge, external attributes, or guidance from landmark nodes. Topology-based matching is therefore a harder problem than other network matching tasks.</p><p dir="ltr">In this thesis, I propose two original works to solve the topology-based network alignment task. 
The first work, Graphlet-based Alignment (Graphlet-Align), employs a topological approach to network alignment. Graphlet-Align represents each node with a signature based on local graphlet counts and uses it as a feature for deriving node-to-node similarity across a pair of networks. By using these similarity values in a bipartite matching algorithm, Graphlet-Align obtains a preliminary alignment. It then uses higher-order information, extending to the k-hop neighborhood of a node, to further refine the alignment, achieving better accuracy. We validated Graphlet-Align's efficacy by applying it to various large real-world networks, achieving accuracy improvements ranging from $20\%$ to $72\%$ over state-of-the-art methods on both duplicated and noisy graphs.</p><p dir="ltr">Expanding on this paradigm of relying solely on topology for graph alignment, in my second work I develop a self-supervised learning framework known as Self-Supervised Topological Alignment (SST-Align). SST-Align uses the graphlet-based signature to create self-supervised node alignment labels, and then uses those labels to generate node embedding vectors for both networks in a joint space, from which the node alignment task can be effectively and accurately solved. It starts with an optimization process that applies average pooling on top of the extracted graphlet signature to construct an initial node assignment. Next, a self-supervised Siamese network architecture utilizes both the initial node assignment and graph convolutional networks to generate node embeddings through a contrastive loss. By applying kd-tree similarity search to the two networks' embeddings, we achieve the final node mapping. Extensive testing on real-world graph alignment datasets shows that our methodology is competitive with seven existing models in terms of node mapping accuracy. 
Additionally, we conduct an ablation study to evaluate the accuracy of the two stages, excluding the representation-learning part and comparing the mapping accuracy accordingly.</p><p dir="ltr">This thesis enhances the theoretical understanding of topological features in the analysis of graph data for the network alignment task, facilitating future advances in the field.</p>
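The signature-then-match pattern underlying Graphlet-Align can be illustrated with a toy stand-in: degree, triangle, and wedge counts in place of full graphlet vectors, and a Hungarian matching on signature distances. All names here are illustrative, not the thesis implementation:

```python
import numpy as np
from itertools import combinations
from scipy.optimize import linear_sum_assignment

def signature(adj, v):
    """Tiny stand-in for a graphlet signature: degree, triangle count,
    and wedge (open 2-path) count at v."""
    nbrs = adj[v]
    tri = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    deg = len(nbrs)
    wedge = deg * (deg - 1) // 2 - tri
    return np.array([deg, tri, wedge], dtype=float)

def align(adj1, adj2):
    """Match nodes across two graphs by minimum signature distance."""
    n1, n2 = sorted(adj1), sorted(adj2)
    cost = np.array([[np.linalg.norm(signature(adj1, u) - signature(adj2, v))
                      for v in n2] for u in n1])
    rows, cols = linear_sum_assignment(cost)
    return {n1[r]: n2[c] for r, c in zip(rows, cols)}, cost[rows, cols].sum()

# A triangle with a tail, and a relabeled copy of the same graph
g1 = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
g2 = {'a': ['b', 'c'], 'b': ['a', 'c'], 'c': ['a', 'b', 'd'], 'd': ['c']}
mapping, total = align(g1, g2)
print(total)  # 0.0: every node finds a partner with an identical signature
```

With isomorphic inputs the minimum-cost matching has total signature distance zero; real graphlet signatures are far more discriminative than this three-number toy.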
|
37 |
A high-performance framework for analyzing massive complex networks / Madduri, Kamesh 08 July 2008 (has links)
Graphs are a fundamental and widely-used abstraction for representing data. We can analytically study interesting aspects of real-world complex systems such as the Internet, social systems, transportation networks, and biological interaction data by modeling them as graphs. Graph-theoretic and combinatorial problems are also pervasive in scientific computing and engineering applications. In this dissertation, we address the problem of analyzing large-scale complex networks that represent interactions between hundreds of thousands to billions of entities. We present SNAP, a new high-performance computational framework for efficiently processing graph-theoretic queries on massive datasets.
Graph analysis is computationally very different from traditional scientific computing, and solving massive graph-theoretic problems on current high performance computing systems is challenging for several reasons. First, real-world graphs are often characterized by a low diameter and unbalanced degree distributions, and are difficult to partition on parallel systems. Second, parallel algorithms for solving graph-theoretic problems are typically memory intensive, and the memory accesses are fine-grained and highly irregular. The primary contributions of this dissertation are the design and implementation of novel parallel graph algorithms for traversal, shortest paths, and centrality computations, optimized for the small-world network topology, and high-performance multithreaded architectures and multicore servers. SNAP (Small-world Network Analysis and Partitioning) is a modular, open-source framework for the exploratory analysis and partitioning of large-scale networks. With SNAP, we demonstrate the capability to process massive graphs with billions of vertices and edges, and achieve up to two orders of magnitude speedup over state-of-the-art network analysis approaches. We also design a new parallel computing benchmark for characterizing the performance of graph-theoretic problems on high-end systems; study data representations for dynamic graph problems on parallel systems; and apply algorithms in SNAP to solve real-world problems in social network analysis and systems biology.
|
38 |
Mapping parallel graph algorithms to throughput-oriented architectures / McLaughlin, Adam 07 January 2016 (has links)
The stagnant performance of single-core processors, the increasing size of data sets, and the variety of structure in information have made the domain of parallel and high-performance computing especially crucial. Graphics Processing Units (GPUs) have recently become an exciting alternative to traditional CPU architectures for applications in this domain. Although GPUs are designed for rendering graphics, research has found that the GPU architecture is well-suited to algorithms that search and analyze unstructured, graph-based data, offering up to an order of magnitude greater memory bandwidth than their CPU counterparts.
This thesis focuses on GPU graph analysis from the perspective that algorithms should be efficient on as many classes of graphs as possible, rather than being specialized to a specific class, such as social networks or road networks. Using betweenness centrality, a popular analytic used to find prominent entities of a network, as a motivating example, we show how parallelism, distributed computing, hybrid and on-line algorithms, and dynamic algorithms can all contribute to substantial improvements in the performance and energy-efficiency of these computations. We further generalize this approach and provide an abstraction that can be applied to a whole class of graph algorithms that require many simultaneous breadth-first searches. Finally, to show that our findings can be applied in real-world scenarios, we apply these techniques to the problem of verifying that a multiprocessor complies with its memory consistency model.
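The betweenness centrality computation that motivates this work reduces to one breadth-first search per source plus a dependency accumulation pass (Brandes' algorithm); these per-source searches are exactly the "many simultaneous breadth-first searches" the abstraction generalizes. A sequential Python sketch of that structure (illustrative, not the GPU code from the thesis):

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm: per source, a BFS that counts shortest paths,
    then dependencies back-propagated along shortest-path DAG edges."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # forward phase: BFS counting shortest paths from s
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {s: 0}
        order, q = [], deque([s])
        while q:
            u = q.popleft(); order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
        # backward phase: accumulate dependencies in reverse BFS order
        delta = {v: 0.0 for v in adj}
        for u in reversed(order):
            for v in adj[u]:
                if dist.get(v, -1) == dist[u] + 1:
                    delta[u] += sigma[u] / sigma[v] * (1 + delta[v])
            if u != s:
                bc[u] += delta[u]
    # undirected convention: each unordered pair was counted twice
    return {v: bc[v] / 2 for v in bc}

# Path 0-1-2: only the middle vertex lies between the other two
path = {0: [1], 1: [0, 2], 2: [1]}
print(betweenness(path))  # {0: 0.0, 1: 1.0, 2: 0.0}
```

On a GPU, the outer loop over sources and the frontier expansions within each BFS are the natural axes of parallelism.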
|
39 |
On the maximum degree chromatic number of a graph / Nieuwoudt, Isabelle 12 1900 (has links)
ENGLISH ABSTRACT: Determining the (classical) chromatic number of a graph (i.e. finding the smallest number of colours with
which the vertices of a graph may be coloured so that no two adjacent vertices receive the same colour)
is a well-known combinatorial optimization problem and is widely encountered in scheduling problems.
Since the late 1960s the notion of the chromatic number has been generalized in several ways by relaxing
the restriction of independence of the colour classes. / Thesis (DPhil)--Stellenbosch University, 2007.
|
40 |
Méthodes combinatoires de reconstruction de réseaux phylogénétiques / Combinatorial Methods for Phylogenetic Network Reconstruction / Gambette, Philippe 30 November 2010 (has links)
Phylogenetic networks generalize the tree model of evolution by allowing edges between branches inside the tree to reflect genetic material exchanges between coexisting species. Many combinatorial approaches have been designed to reconstruct networks from data extracted from a set of contradictory gene trees. These approaches can be divided into several categories depending on the kind of input, i.e. triplets, quartets, clusters and splits, and on the kind of structure restrictions they impose on reconstructed networks. We particularly analyze the structure of one class of such restricted networks, namely level-k phylogenetic networks, and adapt this level parameter to the unrooted context. We also give new combinatorial methods to reconstruct phylogenetic networks from clusters - implemented in Dendroscope - or quartets. We study the limits of combinatorial methods (complexity explosion, noise and silence in the data, ambiguity in the reconstructed network), and the ways to tackle them, in particular with appropriate data preprocessing. Finally, we illustrate the results of these reconstruction methods on a dataset, and conclude on how to use them in a global methodology which integrates statistical aspects.
|