1

APPROXIMATION ALGORITHMS FOR MAXIMUM VERTEX-WEIGHTED MATCHING

Ahmed I Al Herz (8072036) 03 December 2019
We consider the maximum vertex-weighted matching problem (MVM), in which non-negative weights are assigned to the vertices of a graph, and the weight of a matching is the sum of the weights of the matched vertices. Vertex-weighted matchings arise in many applications, including internet advertising, facility scheduling, constraint satisfaction, the design of network switches, and the computation of sparse bases for the null space or the column space of a matrix. Let m be the number of edges, n the number of vertices, and D the maximum degree of a vertex in the graph. We design two exact algorithms for the MVM problem with time complexities of O(mn) and O(Dmn). The new exact algorithms use a maximum cardinality matching as an initial matching, after which the weight of the matching is increased using weight-increasing paths.

Although MVM problems can be solved exactly in polynomial time, exact MVM algorithms are still slow in practice for large graphs with millions and even billions of edges. Hence we investigate several approximation algorithms for MVM in this thesis. First we show that a maximum vertex-weighted matching can be approximated to within a ratio of k/(k + 1), which can be made arbitrarily close to one, where k is related to the length of the augmenting or weight-increasing paths searched by the algorithm. We identify two main approaches for designing approximation algorithms for MVM. The first approach is direct: vertices are sorted in non-increasing order of weights, and the algorithm then searches for augmenting paths of restricted length that reach a heaviest vertex; in this approach each vertex is processed once. The second approach repeatedly searches for augmenting and weight-increasing paths, again of restricted length, until none can be found; in this iterative approach a vertex may need to be processed multiple times. We design two approximation algorithms based on the direct approach, with approximation ratios of 1/2 and 2/3. The time complexity of the 1/2-approximation algorithm is O(m + n log n), and that of the 2/3-approximation algorithm is O(m log D). Employing the second approach, we design 1/2- and 2/3-approximation algorithms for MVM with time complexities of O(Dm) and O(D^2 m), respectively. We show that the iterative algorithm can be generalized to find a k/(k+1)-approximate MVM with a time complexity of O(D^k m). In addition, we design parallel 1/2- and 2/3-approximation algorithms for a shared-memory programming model, and introduce a new technique for locking augmenting paths to avoid deadlock and related problems.

MVM problems may be solved using algorithms for maximum edge-weighted matching (MEM) by assigning to each edge a weight equal to the sum of the vertex weights of its endpoints. However, our results show that this is one way to generate MEM problems that are difficult to solve: on such problems, exact MEM algorithms may require run times that are a factor of a thousand or more larger than those of an exact MVM algorithm. Our results demonstrate the competitiveness of the new exact algorithms by showing that they outperform exact MEM algorithms. Specifically, our fastest exact algorithm runs faster than the fastest MEM implementation by factors of 37 and 18 in the geometric mean, using two different sets of weights on our test problems; in some instances the factor can be higher than 500. Moreover, extensive experimental results show that the MVM approximation algorithms outperform an MEM approximation algorithm with the same approximation ratio with respect to both matching weight and run time, in both serial and parallel settings.
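The direct approach can be illustrated with a small sketch. The snippet below implements only its simplest instance, the 1/2-approximation: vertices are processed in non-increasing weight order, and each unmatched vertex is matched along an augmenting path of length one to its heaviest unmatched neighbor. This is a reconstruction from the description above, not the thesis's pseudocode; the 2/3- and k/(k+1)-approximation algorithms search longer restricted-length paths, which the sketch omits.

```python
def mvm_half_approx(adj, weight):
    """Direct 1/2-approximation for maximum vertex-weighted matching (sketch).

    adj: dict mapping each vertex to an iterable of its neighbors (every vertex
    appears as a key); weight: dict mapping each vertex to its non-negative weight.
    Runs in O(m + n log n) time: the sort costs O(n log n) and each adjacency
    list is scanned once.
    """
    mate = {v: None for v in adj}
    for v in sorted(adj, key=weight.get, reverse=True):
        if mate[v] is not None:
            continue
        # Augmenting path of length one: the heaviest currently unmatched neighbor.
        unmatched = [u for u in adj[v] if mate[u] is None]
        if unmatched:
            u = max(unmatched, key=weight.get)
            mate[v], mate[u] = u, v
    return mate
```

The reduction to edge-weighted matching mentioned in the last paragraph amounts to giving each edge (u, v) the weight weight[u] + weight[v] and handing the resulting instance to an MEM solver.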
2

ALGORITHMS FOR DEGREE-CONSTRAINED SUBGRAPHS AND APPLICATIONS

S M Ferdous (11804924) 19 December 2021
A degree-constrained subgraph construction (DCS) problem aims to find a spanning subgraph that is optimal with respect to an objective function, subject to degree constraints on the vertices. DCS generalizes many combinatorial optimization problems, such as matching and edge cover, and has many practical, real-world applications. This thesis focuses on DCS problems in which there are only upper or lower bounds on the degrees, known as b-matching and b-edge cover problems, respectively. We explore linear and submodular functions as the objective functions of the subgraph construction.

The contributions of this thesis involve both the design of new approximation algorithms for these DCS problems and their application to real-world contexts. We designed, developed, and implemented several approximation algorithms for DCS problems. Although some of these problems can be solved exactly in polynomial time, the exact algorithms are often expensive, tedious to implement, and offer little to no concurrency. In contrast, many of the approximation algorithms developed here run in nearly linear time, are simple to implement, and are concurrent. Using the local dominance framework, we developed the first parallel algorithm for submodular b-matching. For weighted b-edge cover, we improved the classic greedy algorithm using the lazy evaluation technique. We also propose and analyze several approximation algorithms using the primal-dual linear programming framework and reductions to matching. We evaluate the practical performance of these algorithms through extensive experiments.

The second contribution of the thesis is to apply the new algorithms in real-world settings. We employ submodular b-matching to generate a balanced assignment of tasks to processors for building Fock matrices in the NWChemEx quantum chemistry software. Our load-balanced assignment results in a four-fold speedup per iteration of the Fock matrix computation and scales to 14,000 cores of the Summit supercomputer at Oak Ridge National Laboratory. Using approximate b-edge cover, we propose the first shared-memory and distributed-memory parallel algorithms for the adaptive anonymity problem. Minimum weight b-edge cover and maximum weight b-matching are also shown to be applicable to constructing graphs from datasets for machine learning tasks, and we provide a mathematical optimization framework connecting the graph construction problem to the DCS problem.
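One concrete illustration of the lazy evaluation idea mentioned above is the sketch below, which applies it to a textbook greedy heuristic for minimum-weight b-edge cover: edges are chosen by effective weight (edge weight divided by the amount of vertex deficiency the edge still covers), and because effective weights only grow as vertices become covered, a priority queue with possibly stale keys can be re-checked lazily instead of being rebuilt. This is an illustrative reconstruction under those assumptions, not the thesis's implementation, and it omits the parallel, primal-dual, and submodular variants.

```python
import heapq

def lazy_greedy_b_edge_cover(edges, b):
    """Greedy b-edge cover with lazy evaluation of effective weights (sketch).

    edges: list of (weight, u, v) tuples; b: dict mapping each vertex to the
    number of incident cover edges it requires. Assumes a feasible cover exists.
    """
    deficiency = dict(b)                      # remaining demand per vertex
    remaining = sum(deficiency.values())      # total uncovered demand

    def gain(u, v):
        # Units of deficiency this edge would still remove (0, 1, or 2).
        return min(deficiency[u], 1) + min(deficiency[v], 1)

    heap = []
    for w, u, v in edges:
        g = gain(u, v)
        if g > 0:
            heapq.heappush(heap, (w / g, w, u, v))   # key = effective weight

    cover = []
    while heap and remaining > 0:
        key, w, u, v = heapq.heappop(heap)
        g = gain(u, v)
        if g == 0:
            continue                           # edge no longer covers anything
        if w / g > key:
            # Stale key: the effective weight grew since insertion, so re-insert
            # with the updated key instead of taking the edge now.
            heapq.heappush(heap, (w / g, w, u, v))
            continue
        cover.append((u, v, w))                # key is current: greedily take the edge
        for x in (u, v):
            if deficiency[x] > 0:
                deficiency[x] -= 1
                remaining -= 1
    return cover
```

The lazy re-insertion is sound here because covering a vertex can only increase, never decrease, an edge's effective weight, so a popped edge whose key is still accurate is guaranteed to be the current minimum.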
3

Automatic Reasoning Techniques for Non-Serializable Data-Intensive Applications

Gowtham Kaki (7022108) 14 August 2019
The performance bottlenecks in modern data-intensive applications have induced database implementors to forsake high-level abstractions and trade off simplicity and ease of reasoning for performance. Among the first casualties of this trade-off are the well-known ACID guarantees, which simplify reasoning about concurrent database transactions. ACID semantics have become increasingly obsolete in practice because serializable isolation, an integral aspect of ACID, is exorbitantly expensive. Databases, including the popular commercial offerings, default to weaker levels of isolation in which the effects of concurrent transactions are visible to each other. Such weak isolation guarantees, however, are extremely hard to reason about and have led to serious safety violations in real applications. The problem is further complicated in a distributed setting with asynchronous state replication, where high-availability and low-latency requirements compel large-scale web applications to embrace weaker forms of consistency (e.g., eventual consistency) in addition to weak isolation. Given the serious practical implications of safety violations in data-intensive applications, there is a pressing need to extend the state of the art in program verification to non-serializable data-intensive applications operating in a weakly consistent distributed setting.

This thesis sets out to do just that. It introduces new language abstractions, program logics, reasoning methods, and automated verification and synthesis techniques that collectively allow programmers to reason about non-serializable data-intensive applications in the same way as their serializable counterparts. The contributions made are broadly threefold. First, the thesis introduces a uniform formal model for reasoning about weakly isolated (non-serializable) transactions on a sequentially consistent (SC) relational database machine. A reasoning method that relates the semantics of weak isolation to the semantics of the database program is presented, and an automation technique, implemented in a tool called ACIDifier, is also described. The second contribution is a relaxation of the machine model from sequential consistency to a specifiable level of weak consistency, and a generalization of the data model from relational to schema-less or key-value. A specification language for expressing weak consistency semantics at the machine level is described, and a bounded verification technique, implemented in a tool called Q9, is presented that bridges the gap between consistency specifications and program semantics, allowing high-level safety properties to be verified under arbitrary consistency levels. The final contribution is a programming model, inspired by version control systems, that guarantees correct-by-construction replicated data types (RDTs) for building complex distributed applications with arbitrarily structured replicated state. A technique based on decomposing inductively defined data types into characteristic relations is presented, which is used to reason about the semantics of a data type under state replication and eventually to derive its correct-by-construction replicated variant automatically. An implementation of the programming model, called Quark, on top of content-addressable storage is described, and the practicality of the programming model is demonstrated with the help of several case studies.
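The decomposition into characteristic relations can be made concrete with the smallest possible example. A replicated set has a single characteristic relation, membership, and a three-way relational merge over it reconciles two branches against their lowest common ancestor. The sketch below shows that merge rule; it is an illustrative instance of the general idea, not the thesis's derivation for richer inductive types such as lists or trees, whose characteristic relations also capture ordering.

```python
def relational_merge(lca, left, right):
    """Three-way merge of a replicated set via its membership relation (sketch).

    An element survives if both branches kept it, or if either branch added it
    relative to the lowest common ancestor (lca); an element deleted on either
    branch is dropped.
    """
    return (lca & left & right) | (left - lca) | (right - lca)

# Example: the left branch adds 3 while the right branch deletes 1.
assert relational_merge({1, 2}, {1, 2, 3}, {2}) == {2, 3}
```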
4

3D OBJECT DETECTION USING VIRTUAL ENVIRONMENT ASSISTED DEEP NETWORK TRAINING

Ashley S Dale (8771429) 07 January 2021
An RGBZ synthetic dataset consisting of five object classes in a variety of virtual environments and orientations was combined with a small sample of real-world image data and used to train the Mask R-CNN (MR-CNN) architecture in a variety of configurations. When the MR-CNN architecture was initialized with MS COCO weights and the heads were trained with a mix of synthetic and real-world data, F1 scores improved in four of the five classes: the average maximum F1 score over all classes and all epochs for the networks trained with synthetic data is F1* = 0.91, compared to F1 = 0.89 for the networks trained exclusively with real data, and the standard deviation of the maximum mean F1 score for synthetically trained networks is σ*_F1 = 0.015, compared to σ_F1 = 0.020 for the networks trained exclusively with real data. Varied backgrounds in the synthetic data were shown to have negligible impact on F1 scores, opening the door to abstract backgrounds and minimizing the need for intensive synthetic data fabrication. When the MR-CNN architecture was initialized with MS COCO weights and depth data was included in the training data, the network was shown to rely heavily on the initial convolutional input to feed features into the network, the image depth channel was shown to influence mask generation, and the image color channels were shown to influence object classification. A set of latent variables for a subset of the synthetic dataset was generated with a Variational Autoencoder and then analyzed using Principal Component Analysis and Uniform Manifold Approximation and Projection (UMAP). The UMAP analysis showed no meaningful distinction between real-world and synthetic data, and a small bias towards clustering based on image background.
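A latent-space analysis of the kind described above can be reproduced with off-the-shelf tools. The sketch below projects VAE latent vectors with PCA and UMAP so that real and synthetic samples can be compared visually; the choice of scikit-learn and the umap-learn package is an assumption made for illustration, since the abstract does not name the software used.

```python
import numpy as np
from sklearn.decomposition import PCA
import umap  # umap-learn; an assumed library choice, not named in the abstract

def project_latents(latents):
    """Return 2-D PCA and UMAP embeddings of VAE latent vectors.

    latents: array of shape (n_samples, latent_dim). Coloring the returned
    points by a real/synthetic label (or by image background) shows whether
    those groups separate in latent space.
    """
    latents = np.asarray(latents, dtype=np.float32)
    pca_embedding = PCA(n_components=2).fit_transform(latents)
    umap_embedding = umap.UMAP(n_components=2, random_state=0).fit_transform(latents)
    return pca_embedding, umap_embedding
```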
