Spelling suggestions: "subject:"wasserstein"" "subject:"laserstein""
1 |
A Risk-Oriented Clustering Approach for Asset Categorization and Risk MeasurementLiu, Lu 18 July 2019 (has links)
When faced with market risk for investments and portfolios, people often calculate the
risk measure, which is a real number mapping to each random payoff. There are many ways to quantify the potential risk, among which the most important input is the features from future performance. Future distributions are unknown and thus always estimated from historical Profit and Loss (P&L) distributions. However, past data may not be appropriate for estimating the future; risk measures generated from single historical distributions can be subject to error. To overcome these shortcomings, one natural way implemented is to identify and categorize similar assets whose Profit and Loss distributions can be used as alternative scenarios.
In practice, one of the most common and intuitive categorizations is sector, based on industry. It is widely agreed that companies in the same sector share the same, or related, business types and operating characteristics. But in the field of risk management, sector-based categorization does not necessarily mean assets are grouped in terms of their risk profiles, and we show that risk measures in the same sector tend to have large variation. Although improved risk measures related to the distribution ambiguity has been discussed at length, we seek to develop a more risk-oriented categorization by providing a new clustering approach. Furthermore, our method can better inform us of the potential risk and the extreme worst-case scenario within the same category.
|
2 |
Algorithms for Optimal Transport and Wasserstein DistancesSchrieber, Jörn 14 February 2019 (has links)
No description available.
|
3 |
Weak Solutions to a Fractional Fokker-Planck Equation via Splitting and Wasserstein Gradient FlowBowles, Malcolm 22 August 2014 (has links)
In this thesis, we study a linear fractional Fokker-Planck equation that models non-local (`fractional') diffusion in the presence of a potential field. The non-locality is due to the appearance of the `fractional Laplacian' in the corresponding PDE, in place of the classical Laplacian which distinguishes the case of regular (Gaussian) diffusion.
Motivated by the observation that, in contrast to the classical Fokker-Planck equation (describing regular diffusion in the presence of a potential field), there is no natural gradient flow formulation for its fractional counterpart, we prove existence of weak solutions to this fractional Fokker-Planck equation by combining a splitting technique together with a Wasserstein gradient flow formulation. An explicit iterative construction is given, which we prove weakly converges to a weak solution of this PDE. / Graduate
|
4 |
Scalable Combinatorial Algorithms for Optimal Transport Based Similarity MetricsZhang, Kaiyi 22 November 2024 (has links)
Optimal Transport (OT), also known as Wasserstein distance, is a valuable metric for comparing probability distributions. Owing to its appealing statistical properties, researchers in various fields, such as machine learning, use OT within applications. However, computing both exact and approximate OT is computationally expensive and impractical for large datasets. Furthermore, OT is sensitive to small noise in the input distributions. In this document, we propose to use combinatorial methods to design scalable and noise-resistant solutions for OT.
We present four key contributions in this work. First, we introduce a novel combinatorial parallel algorithm for approximating OT, which achieves a parallel time complexity of $O(log n/varepsilon^2)$, where $n$ is the input size and $varepsilon$ is the addtitive error, Our algorithm outperforms the state-of-the-art in experiments. Second, we propose a new concept, OT-profile, representing the function of minimum partial optimal transport cost $sigma_alpha$ versus the transported mass $alpha$. This can be used to identify outliers in real-world data. The utility of OT-profile is demonstrated in outlier detection and PU-learning jobs and outperforms the state-of-the-art. Third, building upon the OT-profile, we propose a new OT-based metric for comparing distributions that is more robust to noise. This metric preserves desirable properties while reducing its sensitivity to noise for high $p$ values, providing a robust solution for real-world datasets. Lastly, we have developed a Python library that integrates our algorithms and methods into a user-friendly framework, making it easier for practitioners to adopt our methods. Our work enhances the computational efficiency and robustness of OT, making it practical for machine learning applicaitons. / Doctor of Philosophy / Imagine a task in which you are required to fill up a set of holes with piles of sand using the least amount of effort. The idea behind this is Optimal Transport (OT), also known as the Wasserstein distance. This distance is a popular math tool that measures the similarity (or distance) between two probability distributions. This tool has many applications in different areas, including machine learning and data analysis. For example, it is employed in generative models to measure the difference between generated and real images, or in analyzing real-word datasets containing noise.
However, the practice of OT faces two major problems. First, the computation becomes expensive with larger datasets, making it infeasible for large-scale applications. Therefore, it is important to make the computation of OT more efficient in processing a large volume of data. Second, OT is sensitive to outliers or noise in the input distributions, which can significantly influence the results.
In this thesis, we try to address these challenges through four main contributions: First, we present novel parallel combinatorial algorithms that reduce the OT computational burden. Our approach achieves a state-of-the-art time complexity and, at the same time, is easy to implement. Second, we introduce an innovative method for computing an 'OT-profile', which enables the to identify the outliers from input distributions. This approach improves the robustness of OT in real-world scenarios where data always contains noise and perturbation.
Third, building on the OT-profile concept, we propose a new robust OT-based metric for comparing distributions in the presence of noise. To facilitate the impact of these advancements, we have open-sourced a Python library that implements our methods in a user-friendly interface, making it easier for researchers and practitioners to apply our solutions to their data analysis tasks. This work not only advances the theoretical analysis of OT but also contributes to future practical applications. By addressing the two challenges in scalability and robustness, we made the application of OT more robust and easier with large-scale data.
|
5 |
Comportement asymptotique de processus avec sauts et applications pour des modèles avec branchement / Asymptotic behavior of jump processes and applications for branching modelsCloez, Bertrand 14 June 2013 (has links)
L'objectif de ce travail est d'étudier le comportement en temps long d'un modèle de particules avec une interaction de type branchement. Plus précisément, les particules se déplacent indépendamment suivant une dynamique markovienne jusqu'au temps de branchement, où elles donnent naissance à de nouvelles particules dont la position dépend de celle de leur mère et de son nombre d'enfants. Dans la première partie de ce mémoire nous omettons le branchement et nous étudions le comportement d'une seule lignée. Celle-ci est modélisée via un processus de Markov qui peut admettre des sauts, des parties diffusives ou déterministes par morceaux. Nous quantifions la convergence de ce processus hybride à l'aide de la courbure de Wasserstein, aussi nommée courbure grossière de Ricci. Cette notion de courbure, introduite récemment par Joulin, Ollivier, et Sammer correspond mieux à l'étude des processus avec sauts. Nous établissons une expression du gradient du semigroupe des processus de Markov stochastiquement monotone, qui nous permet d'expliciter facilement leur courbure. D'autres bornes fines de convergence en distance de Wasserstein et en variation totale sont aussi établies. Dans le même contexte, nous démontrons qu'un processus de Markov, qui change de dynamique suivant un processus discret, converge rapidement vers un équilibre, lorsque la moyenne des courbures des dynamiques sous-jacentes est strictement positive. Dans la deuxième partie de ce mémoire, nous étudions le comportement de toute la population de particules. Celui-ci se déduit du comportement d'une seule lignée grâce à une formule many-to-one, c'est-à-dire un changement de mesure de type Girsanov. Via cette transformation, nous démontrons une loi des grands nombres et établissons une limite macroscopique, pour comparer nos résultats aux résultats déjà connus en théorie des équations aux dérivées partielles. Nos résultats sont appliqués sur divers modèles ayant des applications en biologie et en informatique. Parmi ces modèles, nous étudierons le comportement en temps long de la plus grande particule dans un modèle simple de population structurée en taille / The aim of this work is to study the long time behavior of a branching particle model. More precisely, the particles move independently from each other following a Markov dynamics until the branching event. When one of these events occurs, the particle produces some random number of individuals whose position depends on the position of its mother and her number of offspring. In the first part of this thesis, we only study one particle line and we ignore the branching mechanism. So we are interested by the study of a Markov process which can jump, diffuse or be piecewise deterministic. The long time behavior of these hybrid processes is described with the notion of Wasserstein or coarse Ricci curvature. This notion of curvature, introduced by Joulin, Ollivier and Sammer, is more appropriate for the study of processes with jumps. We establish an expression of the gradient of the Markov semigroup of stochastically monotone processes which gives the curvature of these processes. Others sharp bounds of convergence, in Wasserstein distance and total variation distance, are also established. In the same way, we prove that if a Markov process evolves according to one of finitely many underlying Markovian dynamics, with a choice of dynamics that changes at the jump times of a second Markov process, then it is exponentially ergodic, under the assumption that the mean of the curvature of the underlying dynamics is positive. In the second part of the work, we study all the population. Its behaviour can be deduced to the study of the first part using a Girsavov-type transform which is called a many-to-one formula. Using this relation, we establish a law of large numbers and a macroscopic limit, in order to compare our results to the well know results on deterministic setting. Several examples, based on biology and computer science problems, illustrate our results, including the study of the largest individual in a size-structured population model
|
6 |
Comportamiento asintótico de los procesos de Markov deterministas por pedazosChristen, Alejandra January 2012 (has links)
Doctor en Ciencias de la Ingeniería, Mención Modelación Matemática / En esta tesis doctoral se abordan dos problemas relacionados con el comportamiento en tiempo largo de los procesos de Markov deterministas por pedazos (PDMP). En primer lugar se estudia el comportamiento asintótico de un PDMP general en relación con el comportamiento y propiedades de una cadena de Markov a tiempo discreto embuída. Este problema se desarrolla en el Capítulo 1. En segundo lugar, se considera un PDMP específico llamado Proceso del tamaño de ventana del TCP (sigla en inglés del protocolo de control de transmisión usado en internet).
El objetivo en este caso es encontrar tasas de convergencia explícitas al equilibrio. Este problema se estudia en el Capítulo 2.
Con respecto al primer problema, en el Cap´ıtulo 1 se relacionan las propiedades de recurrencia positiva y las medidas de probabilidad invariantes de un proceso PDMP general con las de una cadena espacio-tiempo discreta, formada por las posiciones post-salto del proceso y las longitudes de tiempo entre saltos. Esta cadena discreta se obtiene de manera simple a partir de las características locales que definen el proceso a tiempo continuo y contiene más información que la cadena discreta post-salto que ha sido habitualmente considerada. Utilizando esta cadena espacio-tiempo se puede definir un nuevo proceso a tiempo continuo asociado, formado por tres coordenadas: el proceso continuo propiamente dicho, la longitud de tiempo trancurrido desde el último tiempo de salto y la longitud de tiempo que falta para el siguiente tiempo de salto, en analogía con los procesos edad y vida residual de teoría de renovación. En este capítulo se describe completamente el equilibrio de este proceso asociado y se establece un resultado análogo de la waiting time paradox de teoría de renovación en el contexto de los PDMP.
Para el segundo problema, en el Capítulo 2 se obtienen tasas de convergencia exponencial al equilibrio en distancia Wasserstein y en la norma en variación total. Estos resultados se basan en algunos argumentos de acoplamiento nuevos y dan una respuesta a una pregunta importante sobre el protocolo de transmisión de internet TCP, que es el entender cómo la congestión del tamaño de ventana del TCP alcanza equilibrio en tiempo largo.
|
7 |
Hamiltonian systems and the calculus of differential forms on the Wasserstein spaceKim, Hwa Kil 01 June 2009 (has links)
This thesis consists of two parts. In the first part, we study stability properties of Hamiltonian systems on the Wasserstein space. Let H be a Hamiltonian satisfying conditions imposed in the work of Ambrosio and Gangbo. We regularize H via Moreau-Yosida approximation to get H[subscript Tau] and denote by μ[subscript Tau] a solution of system with the new Hamiltonian H[subscript Tau] . Suppose H[subscript Tau] converges to H as τ tends to zero. We show μ[subscript Tau] converges to μ and μ is a solution of a Hamiltonian system which is corresponding to the Hamiltonian H. At the end of first part, we give a sufficient condition for the uniqueness of Hamiltonian systems. In the second part, we develop a general theory of differential forms on the Wasserstein space. Our main result is to prove an analogue of Green's theorem for 1-forms and show that every closed 1-form on the Wasserstein space is exact. If the Wasserstein space were a manifold in the classical sense, this result wouldn't be worthy of mention. Hence, the first cohomology group, in the sense of de Rham, vanishes.
|
8 |
Convergência em distância Mallows ponderada com aplicações em somas parciais e processos empíricosSoares, Wembesom Mendes 06 July 2015 (has links)
Tese (doutorado)—Universidade de Brasília, Instituto de Ciências Exatas, 2015. / Submitted by Fernanda Percia França (fernandafranca@bce.unb.br) on 2015-12-21T13:35:23Z
No. of bitstreams: 1
2015_WembesomMendesSoares.pdf: 531592 bytes, checksum: 8e5a84b56c9fa404fbeabc136346a98e (MD5) / Approved for entry into archive by Patrícia Nunes da Silva(patricia@bce.unb.br) on 2016-01-08T11:03:16Z (GMT) No. of bitstreams: 1
2015_WembesomMendesSoares.pdf: 531592 bytes, checksum: 8e5a84b56c9fa404fbeabc136346a98e (MD5) / Made available in DSpace on 2016-01-08T11:03:16Z (GMT). No. of bitstreams: 1
2015_WembesomMendesSoares.pdf: 531592 bytes, checksum: 8e5a84b56c9fa404fbeabc136346a98e (MD5) / Nesta tese, usamos a conexão entre a distância Mallows, d_(F;G), e a convergência em distribuição para estender resultados assintóticos envolvendo somas parciais, mesmo quando d_(F;G) = 1. Para isso, produzimos vários resultados matemáticos envolvendo as distribuições ponderadas e a distância Mallows ponderada, d_;w(F;G). Provamos também a convergência em distância Mallows do processo empírico geral, um tipo particular de somas parciais baseadas numa distribuição empírica Fn, para uma variável aleatória Gaussiana e, por meio do processo quantil empírico, atestamos o mesmo modo de convergência da estatística nd22 ;w(Fn; F) para um funcional ponderado de Pontes Brownianas. ______________________________________________________________________________________________ ABSTRACT / The connection between Mallows distance, d_(F;G), and the convergence in distribution has been sucessfully used to establish asymptotic results for partial sums. We further explore this connection by making use of the weighted Mallows distance, d_;w(F;G), to extend the results for the extreme cases when d_(F;G) = 1. Our results are applied to heavy-tailed partial sums and to partial sums that arise in the context of empirical processes and their related functionals.
|
9 |
Measure Transport Approaches for Data Visualization and Learning / データの可視化と機械学習に対する測度変換によるアプローチSEGUY, Vivien Pierre François 23 July 2018 (has links)
京都大学 / 0048 / 新制・課程博士 / 博士(情報学) / 甲第21318号 / 情博第675号 / 新制||情||117(附属図書館) / 京都大学大学院情報学研究科知能情報学専攻 / (主査)教授 山本 章博, 教授 山下 信雄, 教授 田中 利幸, 上田 修功 / 学位規則第4条第1項該当 / Doctor of Informatics / Kyoto University / DFAM
|
10 |
Robust Exact Algorithms for the Euclidean Bipartite Matching ProblemGattani, Akshaykumar Gopalkrishna 06 July 2023 (has links)
The minimum cost bipartite matching problem is a well-studied optimization problem in computer science and operations research, with wide-ranging applications in fields such as machine learning, economics, transportation, logistics and biology. A special instance of this problem is the computation of the p-Wasserstein distance which we define next. Given a complete bipartite graph with two disjoint sets of n points in d-dimensional Euclidean space and an integer p ≥ 1, let the cost of an edge be the p-th power of the Euclidean distance between its endpoints. The objective of this problem is to find a minimum-cost matching in this complete bipartite graph. The Hungarian algorithm is a classical method that solves this problem in O(n^3) time. There are many algorithms that have a run time better than that of the Hungarian algorithm if the graphs have non-negative integer edge costs bounded by C. Since the input points have real-valued coordinates and the Euclidean distances can be irrational, such algorithms only return an approximate matching. Thus, the Hungarian algorithm remains the fastest known algorithm to compute an exact matching. In this thesis, we implement a new algorithm in the divide and conquer framework that computes the exact p-Wasserstein distance and has a run time asymptotically better than the Hungarian algorithm for stochastic point sets. Inspired by the techniques used in the algorithm, we also design an alternate version of the Hungarian algorithm that uses a grid- based approach. Our experimental analysis shows that both of our algorithms significantly outperform the classical Hungarian algorithm. / Master of Science / Suppose we have two sets of equal number of items and a list of compatible pairs of items, where a pair is considered compatible if its items belong to different sets. A perfect matching is a subset of compatible pairs where each item is paired with exactly one other item. When trying to find a perfect matching, there may be multiple options, and minimizing the cost of the perfect matching is often desired. This is referred to as the minimum cost bipartite matching problem, which is extensively studied due to its importance in algorithmic theory and operations research. A special instance of this problem is the calculation of the p- Wasserstein distance. It has many practical applications in fields such as machine learning, economics, transportation, logistics and biology. The Hungarian algorithm is the only known algorithm that can compute the exact p-Wasserstein distance. Therefore, our focus is to develop exact algorithms for this problem that perform better than the Hungarian algorithm and can scale efficiently to large datasets.
|
Page generated in 0.0435 seconds