The minimum cost bipartite matching problem is a well-studied optimization problem in computer science and operations research, with wide-ranging applications in fields such as machine learning, economics, transportation, logistics and biology. A special instance of this problem is the computation of the p-Wasserstein distance which we define next. Given a complete bipartite graph with two disjoint sets of n points in d-dimensional Euclidean space and an integer p ≥ 1, let the cost of an edge be the p-th power of the Euclidean distance between its endpoints. The objective of this problem is to find a minimum-cost matching in this complete bipartite graph. The Hungarian algorithm is a classical method that solves this problem in O(n^3) time. There are many algorithms that have a run time better than that of the Hungarian algorithm if the graphs have non-negative integer edge costs bounded by C. Since the input points have real-valued coordinates and the Euclidean distances can be irrational, such algorithms only return an approximate matching. Thus, the Hungarian algorithm remains the fastest known algorithm to compute an exact matching. In this thesis, we implement a new algorithm in the divide and conquer framework that computes the exact p-Wasserstein distance and has a run time asymptotically better than the Hungarian algorithm for stochastic point sets. Inspired by the techniques used in the algorithm, we also design an alternate version of the Hungarian algorithm that uses a grid- based approach. Our experimental analysis shows that both of our algorithms significantly outperform the classical Hungarian algorithm. / Master of Science / Suppose we have two sets of equal number of items and a list of compatible pairs of items, where a pair is considered compatible if its items belong to different sets. A perfect matching is a subset of compatible pairs where each item is paired with exactly one other item. When trying to find a perfect matching, there may be multiple options, and minimizing the cost of the perfect matching is often desired. This is referred to as the minimum cost bipartite matching problem, which is extensively studied due to its importance in algorithmic theory and operations research. A special instance of this problem is the calculation of the p- Wasserstein distance. It has many practical applications in fields such as machine learning, economics, transportation, logistics and biology. The Hungarian algorithm is the only known algorithm that can compute the exact p-Wasserstein distance. Therefore, our focus is to develop exact algorithms for this problem that perform better than the Hungarian algorithm and can scale efficiently to large datasets.
Identifer | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/115662 |
Date | 06 July 2023 |
Creators | Gattani, Akshaykumar Gopalkrishna |
Contributors | Computer Science and Applications, Raghvendra, Sharath, Heath, Lenwood S., Ji, Bo |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Thesis |
Format | ETD, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |
Page generated in 0.0024 seconds