Optimization tools for non-asymptotic statistics in exponential families

Le Priol, Rémi 04 1900
Exponential families are a ubiquitous class of models in statistics. On the one hand, they can model any data type; indeed, most common distributions are exponential families: Gaussian, categorical, Poisson, Gamma, Wishart, and Dirichlet. On the other hand, they sit at the core of generalized linear models (GLMs), a foundational class of models in machine learning. They are also supported by beautiful mathematics, thanks to their connection with convex duality and the Laplace transform; this beauty is largely responsible for the existence of this thesis. In this manuscript, we make three contributions at the intersection of optimization and statistics, all revolving around exponential families.

The first contribution adapts and improves a variance-reduced optimization algorithm called stochastic dual coordinate ascent (SDCA) to train a particular class of GLMs called conditional random fields (CRFs). CRFs are one of the cornerstones of structured prediction; they were notoriously hard to train until the advent of variance reduction techniques, and our improved version of SDCA performs favorably against the previous state of the art.

The second contribution focuses on causal discovery. Exponential families are widely used in graphical models, and in particular in causal graphical models. This contribution investigates a specific conjecture that has gained traction in previous work: causal models adapt faster to perturbations of the environment. Using results from optimization, we find strong support for this conjecture when the perturbation comes from an intervention on a cause, and evidence against it when the perturbation comes from an intervention on an effect. These findings call for a refinement of the conjecture, or for a more sophisticated notion of causal model.

The third contribution addresses a fundamental property of exponential families: their closed-form maximum likelihood estimator (MLE) and, for a natural choice of conjugate prior, maximum a posteriori (MAP) estimator. These two estimators are used almost everywhere, often unknowingly: how often are a mean and a variance computed for bell-shaped data without a thought for the underlying Gaussian model? Yet the literature to date lacks finite-sample convergence results for the Kullback-Leibler (KL) divergence between these estimators and the true distribution, even though KL divergence is the standard measure of discrepancy in information theory. Drawing on a parallel with optimization, we take some steps toward such a result and highlight directions for progress in both statistics and optimization.

These three contributions all put tools from optimization at the service of statistics in exponential families: improving the training speed of structured-prediction GLMs, characterizing the adaptation speed of causal models, and estimating the learning speed of ubiquitous models. By building bridges between statistics and optimization, this thesis takes a step toward a better understanding of the fundamentals of machine learning.
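As an illustration of the closed-form estimators discussed above, here is a minimal sketch, assuming a univariate Gaussian model (the function names and the NumPy-based setup are this editor's, not the thesis's): it computes the MLE from a sample and then the closed-form KL divergence between the fitted and the true distribution, the quantity whose finite-sample behavior the third contribution studies.

```python
import numpy as np

def gaussian_mle(x):
    """Closed-form MLE for a univariate Gaussian: sample mean and
    (biased) sample variance."""
    return x.mean(), x.var()

def kl_gaussian(mu1, var1, mu2, var2):
    """Closed-form KL( N(mu1, var1) || N(mu2, var2) )."""
    return 0.5 * (np.log(var2 / var1)
                  + (var1 + (mu1 - mu2) ** 2) / var2
                  - 1.0)

rng = np.random.default_rng(0)
mu_true, var_true = 1.0, 4.0
x = rng.normal(mu_true, np.sqrt(var_true), size=10_000)

mu_hat, var_hat = gaussian_mle(x)
# KL divergence between the fitted model and the truth; this is the
# finite-sample error that shrinks as the sample size grows.
gap = kl_gaussian(mu_hat, var_hat, mu_true, var_true)
```

With 10,000 samples the fitted model is already very close to the truth in KL divergence, which is exactly the kind of non-asymptotic statement the thesis seeks to quantify in general.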

Decentralized Algorithms for Wasserstein Barycenters

Dvinskikh, Darina 29 October 2021
In this thesis, we consider the Wasserstein barycenter problem for discrete probability measures, as well as the population Wasserstein barycenter problem given by a Fréchet mean, from both the computational and the statistical side.

The statistical focus is on estimating the number of sampled measures needed to compute an approximation of the Fréchet mean (barycenter) of probability distributions to a given accuracy. For empirical risk minimization approaches, the question of regularization is also studied, and a new regularizer is proposed that yields better complexity bounds than quadratic regularization.

The computational focus is on developing decentralized algorithms for computing Wasserstein barycenters: dual algorithms and saddle-point algorithms. The motivation for dual approaches is the closed form of the dual formulation of entropy-regularized Wasserstein distances and their derivatives, whereas the primal formulation has a closed-form expression only in special cases, e.g., for Gaussian measures. Moreover, the dual oracle, which returns the gradient of the dual representation of the entropy-regularized Wasserstein distance, is cheaper to evaluate than the primal oracle, which returns the gradient of the (entropy-regularized) Wasserstein distance; the number of dual oracle calls is also smaller, namely the square root of the number of primal oracle calls. Furthermore, in contrast to the primal objective, the dual objective has a Lipschitz-continuous gradient, owing to the strong convexity of regularized Wasserstein distances. We also study a saddle-point formulation of the (non-regularized) Wasserstein barycenter problem, which leads to a bilinear saddle-point problem. This approach likewise attains optimal complexity bounds and can easily be implemented in a decentralized setup.
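To make the entropy-regularized barycenter problem concrete, here is a minimal centralized sketch (it is not the thesis's decentralized algorithm): a fixed-support barycenter of discrete measures computed by iterative Bregman projections, the standard Sinkhorn-style scheme for this problem. All names and the grid setup are illustrative.

```python
import numpy as np

def barycenter_ibp(C, measures, weights, eps=0.01, n_iter=300):
    """Entropy-regularized Wasserstein barycenter on a fixed support,
    via iterative Bregman projections (Sinkhorn-style updates).

    C        : (n, n) ground cost matrix
    measures : list of probability vectors of length n
    weights  : barycenter weights, summing to 1
    eps      : entropic regularization strength
    """
    K = np.exp(-C / eps)                       # Gibbs kernel
    b = [np.ones(C.shape[1]) for _ in measures]
    for _ in range(n_iter):
        # match each transport plan's first marginal to its measure
        a = [p / (K @ bk) for p, bk in zip(measures, b)]
        # barycenter update: weighted geometric mean of second marginals
        log_q = sum(w * np.log(K.T @ ak) for w, ak in zip(weights, a))
        q = np.exp(log_q)
        # match each plan's second marginal to the current barycenter
        b = [q / (K.T @ ak) for ak in a]
    return q

# Toy example: two identical measures on a 1-D grid; their barycenter
# should recover (approximately) the measure itself.
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2            # squared-distance cost
p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
q = barycenter_ibp(C, [p, p], [0.5, 0.5])
```

A decentralized version, as developed in the thesis, would distribute the per-measure updates across agents and exchange only the barycenter iterate over a communication network; the sketch above keeps everything on one machine for clarity.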

ENSURING RESERVE DEPLOYMENT IN HYDROTHERMAL POWER SYSTEMS PLANNING

ARTHUR DE CASTRO BRIGATTO 03 November 2016
The current state-of-the-art method for medium- and long-term planning studies of hydrothermal power system operation is the Stochastic Dual Dynamic Programming (SDDP) algorithm. Notwithstanding the computational savings this method provides, it still relies on major system simplifications to achieve acceptable performance in practical applications. Simplifications in the planning stage, as opposed to the actual implementation, may induce time-inconsistent policies and, consequently, a sub-optimality gap. Time inconsistency in hydrothermal planning can be induced by, for instance, assuming a constant production coefficient for hydro plants, aggregating reservoirs, neglecting Kirchhoff's voltage law, or neglecting security criteria in planning models that are then enforced at the implementation stage. Unplanned reservoir depletion and inadequate spinning reserve deliverability, both observed in the Brazilian power system, may be induced by time inconsistency and can lead to higher operational costs; they also expose the system to a systemic risk of energy rationing or, ultimately, blackouts. In addition, the sub-optimality gap may distort energy markets. Hence, further investigation of the consequences of time inconsistency in hydrothermal planning is warranted.

Along these lines, this work extends previous work on time inconsistency to measure the effects of modeling simplifications within the SDDP framework for hydrothermal operation planning. The approach consists of using a simplified model to plan the system, by means of an assessment of the recourse (cost-to-go) function, and a detailed model for its operation (implementation of the policy). Case studies involving simplifications in transmission-line modeling and in security criteria are carried out. The focus is on the latter source, as it is harder to address due to the complexity involved in characterizing its effect. Incorporating security criteria in planning models, however, poses a major challenge to system operators, because the size of the model tends to grow exponentially as tighter security criteria are adopted.

Motivated by this, the main objective of this work is to propose a new framework that allows security criteria to be incorporated in planning models and thereby ensures reserve deliverability in planning policies. The problem formulation is a multiperiod stochastic extension of Adjustable Robust Optimization (ARO) models already proposed in the literature to address the dimensionality issue of incorporating n - K security criteria and their variants. The solution methodology involves a hybrid Robust-SDDP algorithm that achieves computational tractability by sharing active contingency states among periods and inflow scenarios. With the proposed approach it is possible to (i) address the optimal scheduling of energy and reserves in hydrothermal power systems while ensuring reserve deliverability under an n - K security criterion, and (ii) assess the cost and side effects of disregarding security criteria in the planning stage.
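The "plan with a simplified model, operate with a detailed one" gap can be illustrated with a deliberately tiny toy, entirely hypothetical and far simpler than the thesis's SDDP/ARO framework: a two-stage hydrothermal system in which the planner who ignores a thermal security cap front-loads hydro releases and then pays shortage penalties at implementation time. All numbers and names here are invented for illustration.

```python
# Toy two-stage hydrothermal system (hypothetical numbers).
V, D = 10, 10               # reservoir volume, demand per stage
C_TH, C_SHORT = 2.0, 10.0   # thermal cost and shortage penalty per unit

def stage_cost(hydro, cap):
    """Operate one stage: thermal fills demand up to its cap,
    any remainder is unserved energy at the penalty cost."""
    thermal = min(D - hydro, cap)
    shortage = D - hydro - thermal
    return C_TH * thermal + C_SHORT * shortage

def total_cost(h1, cap):
    """Cost of both stages given the first-stage hydro release h1."""
    h2 = min(V - h1, D)     # second stage releases what is left
    return stage_cost(h1, cap) + stage_cost(h2, cap)

def plan(cap):
    """Pick the first-stage release minimizing cost under the model's cap."""
    return min(range(V + 1), key=lambda h1: total_cost(h1, cap))

h_simpl = plan(cap=float("inf"))   # planning model ignores the security cap
h_exact = plan(cap=6)              # planning model enforces the cap

# Both plans are *implemented* against the detailed model (cap = 6);
# the difference is the time-inconsistency sub-optimality gap.
gap = total_cost(h_simpl, cap=6) - total_cost(h_exact, cap=6)
```

In this toy, every release schedule looks equally cheap to the simplified planner, so it may commit to one that leaves the capped thermal fleet unable to cover demand in a later stage, mirroring (in miniature) the reserve-deliverability failures the abstract describes.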
