Global ETD Search

21	PHYSICS INFORMED MACHINE LEARNING METHODS FOR UNCERTAINTY QUANTIFICATION Sharmila Karumuri (14226875) 17 May 2024
Uncertainty quantification (UQ) is needed throughout science and engineering. However, carrying out UQ for real-world problems is not straightforward and requires a large computational budget. The objective of this thesis is to develop computationally efficient approaches based on machine learning to carry out UQ. Specifically, we addressed two problems.
The first problem is that uncertainty propagation (UP) is difficult in systems governed by elliptic PDEs whose coefficients and boundary conditions contain spatially varying uncertain fields. Because these uncertainties are functional, the number of uncertain parameters is large. In these situations, UP requires solving the PDE a large number of times to obtain convergent statistics of the quantity governed by the PDE, and repeatedly calling a numerical solver is computationally burdensome. To address this, we proposed to learn a surrogate of the PDE solution in a data-free manner, using the physics available in the form of the PDE. We represented the solution of the PDE as a function of space and the uncertain parameters, parameterized by a deep neural network. We introduced a physics-informed loss function derived from variational principles to learn the parameters of the network. The accuracy of the learned surrogate is validated against the corresponding ground-truth estimate from the numerical solver. We demonstrated the merit of our approach by solving UP problems and inverse problems faster than with a standard numerical solver.
The second problem we focused on concerns inverse problems. The state-of-the-art approach poses the inverse problem as a Bayesian inference task and estimates the distribution of the input parameters conditioned on the observed data (the posterior). Markov chain Monte Carlo (MCMC) methods and variational inference methods provide ways to estimate the posterior. However, these inference techniques must be re-run whenever a new set of observed data is given, which creates a computational burden. To address this, we proposed to learn a Bayesian inverse map, i.e., the map from the observed data to the posterior. This map enables on-the-fly inference. We demonstrated our approach on various examples and validated the learned posteriors against the corresponding ground-truth posteriors obtained with MCMC.
Physics-informed machine learning; Physics-informed neural networks; High-dimensional uncertainty propagation; Curse of dimensionality; Energy functional; Inverse problems; On-the-fly inference; Bayesian inverse map
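As a rough illustration of the variational idea mentioned in the abstract above, the sketch below minimizes a discretized energy functional for the 1D elliptic problem −(a u′)′ = f with u(0) = u(1) = 0. It is not the thesis's method: nodal values on a grid stand in for the deep neural network, the coefficient is fixed rather than uncertain, and the names `energy` and `minimize_energy` are hypothetical.

```python
import numpy as np

def energy(u_inner, a, f, h):
    """Discrete energy J[u] = sum_cells 0.5*a*(u')^2*h - sum_nodes f*u*h, with u = 0 at both ends."""
    u = np.concatenate(([0.0], u_inner, [0.0]))  # impose the Dirichlet boundary values
    du = np.diff(u) / h                          # piecewise-constant slope on each cell
    return 0.5 * np.sum(a * du**2) * h - np.sum(f * u_inner) * h

def minimize_energy(a, f, h):
    """Minimize the quadratic energy by solving its stationarity condition K u = f*h."""
    n = f.size
    K = np.zeros((n, n))
    for j in range(n):
        K[j, j] = (a[j] + a[j + 1]) / h          # cells on both sides of interior node j
        if j + 1 < n:
            K[j, j + 1] = K[j + 1, j] = -a[j + 1] / h
    return np.linalg.solve(K, f * h)

n_cells = 50
h = 1.0 / n_cells
x = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]     # interior nodes
a = np.ones(n_cells)                             # assumed coefficient, one value per cell
f = np.ones(n_cells - 1)                         # assumed source term at interior nodes
u = minimize_energy(a, f, h)
```

For a = 1 and f = 1 the exact solution is u(x) = x(1 − x)/2, and the energy minimizer reproduces it at the grid nodes. No solver-generated training data enters the energy, which is the sense in which such a loss is data-free.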
22	Efficient estimation using the characteristic function : theory and applications with high frequency data Kotchoni, Rachidi 05 1900
This thesis addresses two distinct topics: the estimation of the volatility of financial asset prices from high-frequency data, and the estimation of the parameters of a random process from its characteristic function. Chapter 1 deals with the estimation of asset price volatility. We assume that the available high-frequency data are contaminated by microstructure noise. The properties attributed to the noise are decisive in the choice of the volatility estimator. In this chapter, we specify a new dynamic model for the microstructure noise that incorporates three important properties: (i) the noise may be autocorrelated; (ii) the maximal lag beyond which the autocorrelation is zero may be an increasing function of the daily observation frequency; (iii) the noise may have a component correlated with the efficient return. This last component is then said to be endogenous. This model differs from existing ones in that it implies that the first-order autocorrelation of the noise converges to 1 as the daily observation frequency tends to infinity. We use the resulting semi-parametric framework to derive a new estimator of integrated volatility, called the "shrinkage estimator". This estimator takes the form of an optimal linear combination of two estimators with different properties, optimality being defined in terms of variance minimization. Simulations indicate that the shrinkage estimator has a smaller variance than the better of the two initial estimators. Estimators are also proposed for the parameters of the microstructure model. 
We close this chapter with an empirical application based on Dow Jones Industrials stocks. The results indicate that it is relevant to account for the time dependence of the microstructure noise when estimating volatility. Chapters 2, 3 and 4 belong to the econometric literature on the generalized method of moments. Indeed, finance features models whose likelihood function is not known; examples include the stable distribution and diffusion models observed in discrete time. Inference methods based on the characteristic function can be considered in these cases. Typically, one specifies a moment condition based on the difference between the theoretical (conditional) characteristic function and its empirical counterpart. The challenge is to make the best use of the continuum of moment conditions thus specified, so as to attain the same efficiency as maximum likelihood in inference. This challenge was met by Carrasco and Florens (2000), who proposed the CGMM (continuum GMM) procedure. The objective function these authors propose is a quadratic form in a Hilbert space that involves the inverse of the covariance operator associated with the continuum of moment conditions. This inverse operator is Tikhonov-regularized to ensure its global existence and continuity. Carrasco and Florens (2000) showed that the estimator obtained by minimizing this quadratic form is asymptotically as efficient as the maximum likelihood estimator if the regularization parameter (α) tends to zero as the sample size tends to infinity. The nature of the CGMM objective function raises two important questions. The first is how to calibrate α in practice, and the second relates to the presence of multiple integrals in the expression of the objective function. 
The last three chapters of the thesis attempt to address these two issues. In Chapter 2, we propose a method for calibrating α based on minimizing the mean squared error (MSE) of the estimator. We follow an approach similar to that of Newey and Smith (2004) to compute a higher-order expansion of the MSE of the CGMM estimator, so that its dependence on α can be examined in finite samples. We then propose two methods for choosing α in practice. The first is based on the expansion of the MSE, and the second on Monte Carlo simulations. We show that the Monte Carlo method delivers a consistent estimator of the optimal α. Our simulations confirm the relevance of calibrating α in practice. Chapter 3 attempts to make the theory of Chapter 2 accessible for univariate or bivariate models. We begin by reviewing the consistency and asymptotic normality properties of the CGMM estimator. We then propose numerical recipes for implementation. Finally, we conduct Monte Carlo simulations based on the stable distribution. These simulations show that the CGMM is a reliable inference method. As an empirical application, we estimate an autoregressive variance Gamma model by CGMM. The estimation results confirm a well-known finding in finance: returns are positively correlated with expected risk and negatively correlated with volatility shocks. When implementing the CGMM, a major difficulty lies in the iterative numerical evaluation of the multiple integrals in the objective function. Quadrature methods are in principle among the most accurate that can be used in this context. Unfortunately, the number of quadrature points grows exponentially with the dimensionality (d) of the integrals. 
Using the CGMM becomes practically impossible in multivariate and non-Markovian models where d≥3. In Chapter 4, we propose an alternative procedure, called "resampling in the frequency domain", which builds univariate samples by taking a linear combination of the elements of the initial vector, the weights of the linear combination being drawn at random from a normalized subspace of ℝ^{d}. Each sample generated in this way is used to produce an estimator of the parameter of interest. The final estimator we propose is an optimal linear combination of all the estimators thus obtained. Finally, we present a simulation study and an empirical application based on autoregressive Gamma models. Overall, we make intensive use of the bootstrap, a technique whereby the statistical properties of an unknown distribution can be estimated from an estimate of that distribution. Our empirical results can therefore in principle be improved by drawing on the most recent knowledge in the field of the bootstrap. / In estimating the integrated volatility of financial assets using noisy high-frequency data, the time series properties assumed for the microstructure noise determine the proper choice of the volatility estimator. In the first chapter of this thesis, we propose a new model for the microstructure noise with three important features. First, our model assumes that the noise is L-dependent. Second, the memory lag L is allowed to increase with the sampling frequency. Third, the noise may include an endogenous part, that is, a piece that is correlated with the latent returns. The main difference between this microstructure model and existing ones is that it implies a first-order autocorrelation that converges to 1 as the sampling frequency goes to infinity. 
We use this semi-parametric model to derive a new shrinkage estimator for the integrated volatility. The proposed estimator makes an optimal signal-to-noise trade-off by combining a consistent estimator with an inconsistent one. Simulation results show that the shrinkage estimator performs better than the better of the two estimators it combines. We also propose some estimators for the parameters of the noise model. An empirical study based on stocks listed in the Dow Jones Industrials shows the relevance of accounting for possible time dependence in the noise process. Chapters 2, 3 and 4 pertain to the generalized method of moments based on the characteristic function. In fact, the likelihood functions of many financial econometrics models are not known in closed form. For example, this is the case for the stable distribution and for discretely observed continuous-time models. In these cases, one may estimate the parameter of interest by specifying a moment condition based on the difference between the theoretical (conditional) characteristic function and its empirical counterpart. The challenge is then to exploit the whole continuum of moment conditions thus defined to achieve maximum likelihood efficiency. This problem was solved by Carrasco and Florens (2000), who proposed the CGMM procedure. The objective function of the CGMM is a quadratic form on the Hilbert space defined by the moment function. That objective function depends on a Tikhonov-type regularized inverse of the covariance operator associated with the moment function. Carrasco and Florens (2000) have shown that the estimator obtained by minimizing the proposed objective function is asymptotically as efficient as the maximum likelihood estimator provided that the regularization parameter (α) converges to zero as the sample size goes to infinity. However, the nature of this objective function raises two important questions. First of all, how do we select α in practice? 
And secondly, how do we implement the CGMM when the multiplicity (d) of the integrals embedded in the objective function is large? These questions are tackled in the last three chapters of the thesis. In Chapter 2, we propose to choose α by minimizing the approximate mean square error (MSE) of the estimator. Following an approach similar to Newey and Smith (2004), we derive a higher-order expansion of the estimator from which we characterize the finite sample dependence of the MSE on α. We provide two data-driven methods for selecting the regularization parameter in practice. The first one relies on the higher-order expansion of the MSE whereas the second one uses only simulations. We show that our simulation technique delivers a consistent estimator of the optimal α. Our Monte Carlo simulations confirm the importance of the optimal selection of α. The goal of Chapter 3 is to illustrate how to efficiently implement the CGMM for d≤2. To start with, we review the consistency and asymptotic normality properties of the CGMM estimator. Next we suggest some numerical recipes for its implementation. Finally, we carry out a simulation study with the stable distribution that confirms the accuracy of the CGMM as an inference method. An empirical application based on the autoregressive variance Gamma model leads to a well-known conclusion: investors require a positive premium for bearing the expected risk while a negative premium is attached to the unexpected risk. In implementing the characteristic-function-based CGMM, a major difficulty lies in the evaluation of the multiple integrals embedded in the objective function. Numerical quadratures are among the most accurate methods that can be used in the present context. Unfortunately, the number of quadrature points grows exponentially with d. When the data generating process is Markov or dependent, an accurate implementation of the CGMM becomes practically infeasible when d≥3. 
In Chapter 4, we propose a strategy that consists in creating univariate samples by taking a linear combination of the elements of the original vector process. The weights of the linear combinations are drawn from a normalized set of ℝ^{d}. Each univariate index generated in this way is called a frequency-domain bootstrap sample and can be used to compute an estimator of the parameter of interest. Finally, all the possible estimators obtained in this fashion can be aggregated to obtain the final estimator. The optimal aggregation rule is discussed in the paper. The overall method is illustrated by a simulation study and an empirical application based on autoregressive Gamma models. This thesis makes extensive use of the bootstrap, a technique according to which the statistical properties of an unknown distribution can be estimated from an estimate of that distribution. It is thus possible to improve our simulations and empirical results by using state-of-the-art refinements of the bootstrap methodology.
Integrated volatility; Method of moments; Microstructure noise; Realized kernel; Shrinkage estimator; Continuum of moment conditions; Characteristic function; Curse of dimensionality; Stochastic expansion; Bootstrap
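The characteristic-function moment condition described in the abstract above can be sketched as follows. This is a simplified illustration under strong assumptions, not the CGMM of Carrasco and Florens: the model is an assumed N(0, σ²) distribution, the continuum of moment conditions is approximated on a finite grid of τ values, and the regularized inverse covariance operator is replaced by a fixed Gaussian weight. All function names are hypothetical.

```python
import numpy as np

def empirical_cf(x, tau):
    """Empirical characteristic function: sample mean of exp(i*tau*x) for each tau."""
    return np.exp(1j * np.outer(tau, x)).mean(axis=1)

def normal_cf(tau, sigma):
    """Theoretical characteristic function of N(0, sigma^2)."""
    return np.exp(-0.5 * (sigma * tau) ** 2)

def cf_objective(sigma, ecf, tau, weight):
    """Weighted squared norm of the moment function h(tau) = ecf(tau) - cf(tau; sigma)."""
    h = ecf - normal_cf(tau, sigma)
    return np.sum(weight * np.abs(h) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=5000)      # simulated data, true sigma = 2 (assumption)
tau = np.linspace(-2.0, 2.0, 81)         # finite grid standing in for the continuum
weight = np.exp(-tau**2)                 # fixed weight, NOT the optimal CGMM weighting
ecf = empirical_cf(x, tau)               # computed once, reused for every candidate sigma

# Plain grid search over sigma keeps the sketch dependency-free.
sigmas = np.arange(0.5, 4.0, 0.01)
values = [cf_objective(s, ecf, tau, weight) for s in sigmas]
sigma_hat = sigmas[int(np.argmin(values))]
```

Here the integral over τ is one-dimensional. With a tensor-product quadrature of m points per dimension, the same sum would need m**d evaluations (for example 20**3 = 8000 when d = 3), which is the exponential growth in d that the abstract refers to.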
23	Apprentissage machine efficace : théorie et pratique Delalleau, Olivier 03 1900
Despite steady progress in computing capacity, memory and the amount of available data, machine learning algorithms must use these resources efficiently. Minimizing cost is obviously an important factor, but another motivation is the search for learning mechanisms capable of reproducing the behavior of intelligent beings. This thesis addresses the problem of efficiency through several articles dealing with a variety of learning algorithms: the problem is viewed not only from the standpoint of computational efficiency (computation time and memory used), but also from that of statistical efficiency (number of examples required to accomplish a given task). A first contribution of this thesis is to highlight statistical inefficiencies in existing algorithms. We show that decision trees generalize poorly on certain types of tasks (chapter 3), as do classical graph-based semi-supervised learning algorithms (chapter 5), each being affected by a particular form of the curse of dimensionality. For a certain class of neural networks, called sum-product networks, we show that representing certain functions with networks that have a single hidden layer can be exponentially less efficient than with deep networks (chapter 4). Our analyses provide a better understanding of some intrinsic problems of these algorithms and point research in directions that might resolve them. We also identify computational inefficiencies in graph-based semi-supervised learning algorithms (chapter 5) and in the learning of mixtures of Gaussians in the presence of missing values (chapter 6). 
In both cases, we propose new algorithms able to handle significantly larger datasets. The last two chapters approach computational efficiency from a different angle. In chapter 7, we theoretically analyze an existing algorithm for efficient learning in restricted Boltzmann machines (contrastive divergence), in order to better understand the reasons for its success. Finally, in chapter 8 we present an application of machine learning to video games, in which computational efficiency is tied to software and hardware engineering considerations that are often ignored in research yet very important in practice. / Despite constant progress in terms of available computational power, memory and amount of data, machine learning algorithms need to be efficient in how they use them. Although minimizing cost is an obvious major concern, another motivation is to attempt to design algorithms that can learn as efficiently as intelligent species. This thesis tackles the problem of efficient learning through various papers dealing with a wide range of machine learning algorithms: this topic is seen both from the point of view of computational efficiency (processing power and memory required by the algorithms) and of statistical efficiency (number of samples necessary to solve a given learning task). The first contribution of this thesis is in shedding light on various statistical inefficiencies in existing algorithms. Indeed, we show that decision trees do not generalize well on tasks with some particular properties (chapter 3), and that a similar flaw affects typical graph-based semi-supervised learning algorithms (chapter 5). This flaw is a form of curse of dimensionality that is specific to each of these algorithms. 
For a subclass of neural networks, called sum-product networks, we prove that using networks with a single hidden layer can be exponentially less efficient than using deep networks (chapter 4). Our analyses help to better understand some inherent flaws found in these algorithms, and steer research towards approaches that may overcome them. We also exhibit computational inefficiencies in popular graph-based semi-supervised learning algorithms (chapter 5) as well as in the learning of mixtures of Gaussians with missing data (chapter 6). In both cases we propose new algorithms that make it possible to scale to much larger datasets. The last two chapters also deal with computational efficiency, but in different ways. Chapter 7 presents a new view on the contrastive divergence algorithm (which has been used for efficient training of restricted Boltzmann machines). It provides additional insight into the reasons why this algorithm has been so successful. Finally, in chapter 8 we describe an application of machine learning to video games, where computational efficiency is tied to software and hardware engineering constraints which, although often ignored in research papers, are ubiquitous in practice.
Computational efficiency; Statistical efficiency; Curse of dimensionality; Decision trees; Neural networks; Graph-based semi-supervised learning; Contrastive divergence; Mixtures of Gaussians; Matchmaking
25	Efficient estimation using the characteristic function : theory and applications with high frequency data Kotchoni, Rachidi 05 1900 (has links) Nous abordons deux sujets distincts dans cette thèse: l'estimation de la volatilité des prix d'actifs financiers à partir des données à haute fréquence, et l'estimation des paramétres d'un processus aléatoire à partir de sa fonction caractéristique. Le chapitre 1 s'intéresse à l'estimation de la volatilité des prix d'actifs. Nous supposons que les données à haute fréquence disponibles sont entachées de bruit de microstructure. Les propriétés que l'on prête au bruit sont déterminantes dans le choix de l'estimateur de la volatilité. Dans ce chapitre, nous spécifions un nouveau modèle dynamique pour le bruit de microstructure qui intègre trois propriétés importantes: (i) le bruit peut être autocorrélé, (ii) le retard maximal au delà duquel l'autocorrélation est nulle peut être une fonction croissante de la fréquence journalière d'observations; (iii) le bruit peut avoir une composante correlée avec le rendement efficient. Cette dernière composante est alors dite endogène. Ce modèle se différencie de ceux existant en ceci qu'il implique que l'autocorrélation d'ordre 1 du bruit converge vers 1 lorsque la fréquence journalière d'observation tend vers l'infini. Nous utilisons le cadre semi-paramétrique ainsi défini pour dériver un nouvel estimateur de la volatilité intégrée baptisée "estimateur shrinkage". Cet estimateur se présente sous la forme d'une combinaison linéaire optimale de deux estimateurs aux propriétés différentes, l'optimalité étant défini en termes de minimisation de la variance. Les simulations indiquent que l'estimateur shrinkage a une variance plus petite que le meilleur des deux estimateurs initiaux. Des estimateurs sont également proposés pour les paramètres du modèle de microstructure. Nous clôturons ce chapitre par une application empirique basée sur des actifs du Dow Jones Industrials. 
Les résultats indiquent qu'il est pertinent de tenir compte de la dépendance temporelle du bruit de microstructure dans le processus d'estimation de la volatilité. Les chapitres 2, 3 et 4 s'inscrivent dans la littérature économétrique qui traite de la méthode des moments généralisés. En effet, on rencontre en finance des modèles dont la fonction de vraisemblance n'est pas connue. On peut citer en guise d'exemple la loi stable ainsi que les modèles de diffusion observés en temps discrets. Les méthodes d'inférence basées sur la fonction caractéristique peuvent être envisagées dans ces cas. Typiquement, on spécifie une condition de moment basée sur la différence entre la fonction caractéristique (conditionnelle) théorique et sa contrepartie empirique. Le défit ici est d'exploiter au mieux le continuum de conditions de moment ainsi spécifié pour atteindre la même efficacité que le maximum de vraisemblance dans les inférences. Ce défit a été relevé par Carrasco et Florens (2000) qui ont proposé la procédure CGMM (continuum GMM). La fonction objectif que ces auteurs proposent est une forme quadratique hilbertienne qui fait intervenir l'opérateur inverse de covariance associé au continuum de condition de moments. Cet opérateur inverse est régularisé à la Tikhonov pour en assurer l'existence globale et la continuité. Carrasco et Florens (2000) ont montré que l'estimateur obtenu en minimisant cette forme quadratique est asymptotiquement aussi efficace que l'estimateur du maximum de vraisemblance si le paramètre de régularisation (α) tend vers zéro lorsque la taille de l'échatillon tend vers l'infini. La nature de la fonction objectif du CGMM soulève deux questions importantes. La première est celle de la calibration de α en pratique, et la seconde est liée à la présence d'intégrales multiples dans l'expression de la fonction objectif. C'est à ces deux problématiques qu'essayent de répondent les trois derniers chapitres de la présente thèse. 
Dans le chapitre 2, nous proposons une méthode de calibration de α basée sur la minimisation de l'erreur quadratique moyenne (EQM) de l'estimateur. Nous suivons une approche similaire à celle de Newey et Smith (2004) pour calculer un développement d'ordre supérieur de l'EQM de l'estimateur CGMM de sorte à pouvoir examiner sa dépendance en α en échantillon fini. Nous proposons ensuite deux méthodes pour choisir α en pratique. La première se base sur le développement de l'EQM, et la seconde se base sur des simulations Monte Carlo. Nous montrons que la méthode Monte Carlo délivre un estimateur convergent de α optimal. Nos simulations confirment la pertinence de la calibration de α en pratique. Le chapitre 3 essaye de vulgariser la théorie du chapitre 2 pour les modèles univariés ou bivariés. Nous commençons par passer en revue les propriétés de convergence et de normalité asymptotique de l'estimateur CGMM. Nous proposons ensuite des recettes numériques pour l'implémentation. Enfin, nous conduisons des simulations Monte Carlo basée sur la loi stable. Ces simulations démontrent que le CGMM est une méthode fiable d'inférence. En guise d'application empirique, nous estimons par CGMM un modèle de variance autorégressif Gamma. Les résultats d'estimation confirment un résultat bien connu en finance: le rendement est positivement corrélé au risque espéré et négativement corrélé au choc sur la volatilité. Lorsqu'on implémente le CGMM, une difficulté majeure réside dans l'évaluation numérique itérative des intégrales multiples présentes dans la fonction objectif. Les méthodes de quadrature sont en principe parmi les plus précises que l'on puisse utiliser dans le présent contexte. Malheureusement, le nombre de points de quadrature augmente exponentiellement en fonction de la dimensionalité (d) des intégrales. L'utilisation du CGMM devient pratiquement impossible dans les modèles multivariés et non markoviens où d≥3. 
Dans le chapitre 4, nous proposons une procédure alternative baptisée "rééchantillonnage dans le domaine fréquentiel" qui consiste à fabriquer des échantillons univariés en prenant une combinaison linéaire des éléments du vecteur initial, les poids de la combinaison linéaire étant tirés aléatoirement dans un sous-espace normalisé de ℝ^{d}. Chaque échantillon ainsi généré est utilisé pour produire un estimateur du paramètre d'intérêt. L'estimateur final que nous proposons est une combinaison linéaire optimale de tous les estimateurs ainsi obtenus. Finalement, nous proposons une étude par simulation et une application empirique basées sur des modèles autorégressifs Gamma. Dans l'ensemble, nous faisons une utilisation intensive du bootstrap, une technique selon laquelle les propriétés statistiques d'une distribution inconnue peuvent être estimées à partir d'un estimé de cette distribution. Nos résultats empiriques peuvent donc en principe être améliorés en faisant appel aux connaissances les plus récentes dans le domaine du bootstrap. / In estimating the integrated volatility of financial assets using noisy high frequency data, the time series properties assumed for the microstructure noise determine the proper choice of the volatility estimator. In the first chapter of the current thesis, we propose a new model for the microstructure noise with three important features. First of all, our model assumes that the noise is L-dependent. Secondly, the memory lag L is allowed to increase with the sampling frequency. And thirdly, the noise may include an endogenous part, that is, a piece that is correlated with the latent returns. The main difference between this microstructure model and existing ones is that it implies a first order autocorrelation that converges to 1 as the sampling frequency goes to infinity. We use this semi-parametric model to derive a new shrinkage estimator for the integrated volatility. 
The proposed estimator makes an optimal signal-to-noise trade-off by combining a consistent estimator with an inconsistent one. Simulation results show that the shrinkage estimator outperforms the better of the two estimators it combines. We also propose some estimators for the parameters of the noise model. An empirical study based on stocks listed in the Dow Jones Industrials shows the relevance of accounting for possible time dependence in the noise process. Chapters 2, 3 and 4 pertain to the generalized method of moments based on the characteristic function. In fact, the likelihood functions of many financial econometrics models are not known in closed form. For example, this is the case for the stable distribution and for discretely observed continuous-time models. In these cases, one may estimate the parameter of interest by specifying a moment condition based on the difference between the theoretical (conditional) characteristic function and its empirical counterpart. The challenge is then to exploit the whole continuum of moment conditions thus defined to achieve maximum likelihood efficiency. This problem has been solved in Carrasco and Florens (2000), who propose the CGMM procedure. The objective function of the CGMM is a quadratic form on the Hilbert space defined by the moment function. That objective function depends on a Tikhonov-type regularized inverse of the covariance operator associated with the moment function. Carrasco and Florens (2000) have shown that the estimator obtained by minimizing the proposed objective function is asymptotically as efficient as the maximum likelihood estimator provided that the regularization parameter (α) converges to zero as the sample size goes to infinity. However, the nature of this objective function raises two important questions. First of all, how do we select α in practice? And secondly, how do we implement the CGMM when the multiplicity (d) of the integrals embedded in the objective function is large? 
These questions are tackled in the last three chapters of the thesis. In Chapter 2, we propose to choose α by minimizing the approximate mean square error (MSE) of the estimator. Following an approach similar to Newey and Smith (2004), we derive a higher-order expansion of the estimator from which we characterize the finite sample dependence of the MSE on α. We provide two data-driven methods for selecting the regularization parameter in practice. The first one relies on the higher-order expansion of the MSE whereas the second one uses only simulations. We show that our simulation technique delivers a consistent estimator of the optimal α. Our Monte Carlo simulations confirm the importance of the optimal selection of α. The goal of Chapter 3 is to illustrate how to efficiently implement the CGMM for d≤2. To start with, we review the consistency and asymptotic normality properties of the CGMM estimator. Next we suggest some numerical recipes for its implementation. Finally, we carry out a simulation study with the stable distribution that confirms the accuracy of the CGMM as an inference method. An empirical application based on the autoregressive variance Gamma model leads to a well-known conclusion: investors require a positive premium for bearing the expected risk while a negative premium is attached to the unexpected risk. In implementing the characteristic-function-based CGMM, a major difficulty lies in the evaluation of the multiple integrals embedded in the objective function. Numerical quadratures are among the most accurate methods that can be used in the present context. Unfortunately, the number of quadrature points grows exponentially with d. When the data generating process is Markov or dependent, the accurate implementation of the CGMM becomes practically infeasible when d≥3. In Chapter 4, we propose a strategy that consists in creating univariate samples by taking a linear combination of the elements of the original vector process. 
The weights of the linear combinations are drawn from a normalized set of ℝ^{d}. Each univariate index generated in this way is called a frequency domain bootstrap sample that can be used to compute an estimator of the parameter of interest. Finally, all the possible estimators obtained in this fashion can be aggregated to obtain the final estimator. The optimal aggregation rule is discussed in the paper. The overall method is illustrated by a simulation study and an empirical application based on autoregressive Gamma models. This thesis makes an extensive use of the bootstrap, a technique according to which the statistical properties of an unknown distribution can be estimated from an estimate of that distribution. It is thus possible to improve our simulations and empirical results by using the state-of-the-art refinements of the bootstrap methodology. / The attached file is created with Scientific Workplace Latex Integrated volatility Volatilité intégré Method of moment Méthode des moments Microstructure noise Bruit de microstructure Realized Kernel Volatilité réalisée à Noyaux Shrinkage estimator Continuum of moment conditions Continuum de conditions de moments Characteristic function Fonction caracteristique Curse of dimensionality Malédiction de la dimensionalité Stochastic expansion Expansion stochastique Bootstrap Bootstrap
26	High-Dimensional Data Representations and Metrics for Machine Learning and Data Mining / Reprezentacije i metrike za mašinsko učenje i analizu podataka velikih dimenzija Radovanović Miloš 11 February 2011 (has links) <p>In the current information age, massive amounts of data are gathered, at a rate prohibiting their effective structuring, analysis, and conversion into useful knowledge. This information overload is manifested both in large numbers of data objects recorded in data sets, and large numbers of attributes, also known as high dimensionality. This dissertation deals with problems originating from high dimensionality of data representation, referred to as the “curse of dimensionality,” in the context of machine learning, data mining, and information retrieval. The described research follows two angles: studying the behavior of (dis)similarity metrics with increasing dimensionality, and exploring feature-selection methods, primarily with regard to document representation schemes for text classification. The main results of the dissertation, relevant to the first research angle, include theoretical insights into the concentration behavior of cosine similarity, and a detailed analysis of the phenomenon of hubness, which refers to the tendency of some points in a data set to become hubs by being included in unexpectedly many <em>k</em>-nearest neighbor lists of other points. The mechanisms behind the phenomenon are studied in detail, both from a theoretical and empirical perspective, linking hubness with the (intrinsic) dimensionality of data, describing its interaction with the cluster structure of data and the information provided by class labels, and demonstrating the interplay of the phenomenon and well known algorithms for classification, semi-supervised learning, clustering, and outlier detection, with special consideration being given to time-series classification and information retrieval. 
Results pertaining to the second research angle include quantification of the interaction between various transformations of high-dimensional document representations, and feature selection, in the context of text classification.</p> / <p>U tekućem „informatičkom dobu“, masivne količine podataka se sakupljaju brzinom koja ne dozvoljava njihovo efektivno strukturiranje, analizu, i pretvaranje u korisno znanje. Ovo zasićenje informacijama se manifestuje kako kroz veliki broj objekata uključenih u skupove podataka, tako i kroz veliki broj atributa, takođe poznat kao velika dimenzionalnost. Disertacija se bavi problemima koji proizilaze iz velike dimenzionalnosti reprezentacije podataka, često nazivanim „prokletstvom dimenzionalnosti“, u kontekstu mašinskog učenja, data mining-a i information retrieval-a. Opisana istraživanja prate dva pravca: izučavanje ponašanja metrika (ne)sličnosti u odnosu na rastuću dimenzionalnost, i proučavanje metoda odabira atributa, prvenstveno u interakciji sa tehnikama reprezentacije dokumenata za klasifikaciju teksta. Centralni rezultati disertacije, relevantni za prvi pravac istraživanja, uključuju teorijske uvide u fenomen koncentracije kosinusne mere sličnosti, i detaljnu analizu fenomena habovitosti koji se odnosi na tendenciju nekih tačaka u skupu podataka da postanu habovi tako što bivaju uvrštene u neočekivano mnogo lista k najbližih suseda ostalih tačaka. Mehanizmi koji pokreću fenomen detaljno su proučeni, kako iz teorijske tako i iz empirijske perspektive. 
Habovitost je povezana sa (latentnom) dimenzionalnošću podataka, opisana je njena interakcija sa strukturom klastera u podacima i informacijama koje pružaju oznake klasa, i demonstriran je njen efekat na poznate algoritme za klasifikaciju, semi-supervizirano učenje, klastering i detekciju outlier-a, sa posebnim osvrtom na klasifikaciju vremenskih serija i information retrieval. Rezultati koji se odnose na drugi pravac istraživanja uključuju kvantifikaciju interakcije između različitih transformacija višedimenzionalnih reprezentacija dokumenata i odabira atributa, u kontekstu klasifikacije teksta.</p>
27	Compression et inférence des opérateurs intégraux : applications à la restauration d’images dégradées par des flous variables / Approximation and estimation of integral operators : applications to the restoration of images degraded by spatially varying blurs Escande, Paul 26 September 2016 (has links) Le problème de restauration d'images dégradées par des flous variables connaît un attrait croissant et touche plusieurs domaines tels que l'astronomie, la vision par ordinateur et la microscopie à feuille de lumière où les images sont de taille un milliard de pixels. Les flous variables peuvent être modélisés par des opérateurs intégraux qui associent à une image nette u, une image floue Hu. Une fois discrétisé pour être appliqué sur des images de N pixels, l'opérateur H peut être vu comme une matrice de taille N x N. Pour les applications visées, la matrice est stockée en mémoire avec un exaoctet. On voit apparaître ici les difficultés liées à ce problème de restauration des images qui sont i) le stockage de ce grand volume de données, ii) les coûts de calculs prohibitifs des produits matrice-vecteur. Ce problème souffre du fléau de la dimension. D'autre part, dans beaucoup d'applications, l'opérateur de flou n'est pas ou que partiellement connu. Il y a donc deux problèmes complémentaires mais étroitement liés qui sont l'approximation et l'estimation des opérateurs de flou. Cette thèse a consisté à développer des nouveaux modèles et méthodes numériques permettant de traiter ces problèmes. / The restoration of images degraded by spatially varying blurs is a problem of increasing importance. It is encountered in many applications such as astronomy, computer vision and fluorescence microscopy, where images can contain a billion pixels. Variable blurs can be modelled by linear integral operators H that map a sharp image u to its blurred version Hu. After discretization of the image on a grid of N pixels, H can be viewed as a matrix of size N x N. 
For the targeted applications, storing this matrix would require on the order of an exabyte of memory. This simple observation illustrates the difficulties associated with this problem: i) the storage of a huge amount of data, ii) the prohibitive computation costs of matrix-vector products. This problem suffers from the curse of dimensionality. In addition, in many applications, the operator is unknown or only partially known. There are therefore two distinct but closely related problems, the approximation and the estimation of blurring operators, which have to be addressed jointly. Most of the work of this thesis is dedicated to the development of new models and computational methods to address those issues. Opérateurs intégraux Flou variable Parcimonie Approximation Estimation Fléau de la dimension Restauration Décomposition multi-Échelle Défloutage Déconvolution Problème inverse Grande dimension Interpolation de données éparpillées Produit-Convolution Algorithmes rapides Bruit multiplicatif structuré Mesure de similarité Microscopie Astronomie Integral operators Spatially varying blur Sparsity Approximation Estimation Curse of dimensionality Restoration Multi-Scale approximation Deblurring Deconvolution Inverse problem High-Dimension Scattered data interpolation Product-Convolution Fast algorithms Structured multiplicative noise Similarity measure Microscopy Astronomy 510
