11 |
Trajectory ensemble methods for understanding complex stochastic systemsMey, Antonia S. J. S. January 2013 (has links)
This thesis investigates the equilibrium and dynamic properties of stochastic systems of varying complexity. The dynamic properties of lattice models -- the 1-d Ising model and a 3-d protein model -- and equilibrium properties of continuous models -- particles in various potentials -- are presented. Dynamics are studied according to a large deviation formalism, by looking at non-equilibrium ensembles of trajectories, classified according to a dynamical order parameter. The phase structure of the ensembles of trajectories is deduced from the properties of large-deviation functions, representing dynamical free-energies. The 1-d Ising model is studied with Glauber dynamics uncovering the dynamical second-order transition at critical values of the counting field 's', confirming the analytical predictions by Jack and Solich. Next, the dynamics in an external magnetic field are studied, allowing the construction of a dynamic phase diagram in the space of temperature, s-field and magnetic field. The dynamic phase diagram is reminiscent of that of the 2-d Ising model. In contrast, Kawasaki dynamics give rise to a dynamical phase structure similar to the one observed in kinetically constrained models. The dynamics of a lattice protein model, represented by a self avoiding walk with three different Hamiltonians, are studied. For the uniform Go Hamiltonian all dynamics occurs between non-native and native trajectories, whereas for heterogeneous Hamiltonians and Full interaction Hamiltonians a first-order dynamical transition to sets of trapping trajectories is observed in the s-ensemble. The model is studied exhaustively for a particular sequence, constructing a qualitative phase diagram, from which a more general dynamic behaviour is extrapolated. Lastly, an estimator for equilibrium expectations, represented by a transition matrix in an extended space between temperatures and a set of discrete states obtained through the discretisation of a continuous space, is proposed. It is then demonstrated that this estimator outperforms conventional multi-temperature ensemble estimates by up to three orders of magnitude, by considering three models of increasing complexity: diffusive particles in a double-well potential, a multidimensional folding potential and a molecular dynamics simulations of alanine dipeptide.
|
12 |
Frequency and time domain analysis of networks of interacting processes : what can be achieved from records of short durationAllehiany, Faiza Mohammad January 2012 (has links)
Abstract Recently, there has been increasing interest in investigating the interrelationships among the component stochastic processes of a dynamical system. The applications of these studies are to be found in various fields such as Economics, Neuroscience, Engineering and Neurobiology. Also the determination of the direction of the information flow is one of the important subjects studied widely. These investigations have usually been implemented in the time and frequency domains. Consequently, several mathematical and statistical procedures have been developed to facilitate these analyses. The aim of this thesis is to discuss the relationships between stochastic processes of a relatively short time duration. Specifically, the research concerns the analysis of the electrical activity of the dysfunctional brain, where the available data belong to a right-handed focal epileptic patient. EEG signals are recorded directly from the scalp using numbered electrodes according to the International 10/20 system introduced by Jasper [1958]. The analysis is only performed for processes of the left hemisphere as they represent the dominant hemisphere. Moreover, since each region of the brain is responsible for a special function, we have chosen five processes to represent the five main lobes of the brain; the frontal lobe, the central region, the parietal lobe, the occipital lobe and the temporal lobe. The analyses of these signals are carried out using four spectral density estimation procedures, namely the multivariate autoregressive model of order 2; the average of periodograms of adjacent segments of the single record; the smoothed periodogram approach for the entire record; and the multi-taper method. Thereafter comparisons among the results of these methods are made. The strength of the correlation between signals is measured by coherence and partial coherence functions. Also, the Granger causality concept is implemented for these data in the form of determining the direction of the information flow between these signals using the partial directed coherence (PDC) proposed by Baccala and Sameshima [2001] using the statistical level of significance suggested by Schelter et al.[2005]. The structure of the causal influences produced by the PDC shows that there are statistically significant reciprocal causal effects between processes representing the brain's region, the frontal lobe, the central area, the parietal lobe and the temporal lobe. However, there are two uni-directed causal influence relations, one is between the central area and the occipital lobe and the second one is between the occipital and temporal lobes. The indirect causal influences are detected between these processes throughout the process representing the temporal lobe. Generally, the values of the PDC in the anterior-posterior direction are larger than the values of the PDC in the opposite direction. Also, the causal influences of each process on the temporal lobe process is larger than the causal influences in the opposite direction. The spectral analyses show that the estimated power spectra and coherences of these signals are approximately peak in the delta wave band of frequency [1, 4) Hz. The significant non-zero estimated coherences are captured between the brain's lobes except for the occipital lobe which is uncorrelated with any of the other lobes. The depth of non-zero significant estimated coherences is given by partial coherence, which measures the strength of the estimated coherence between any two processes after removing the linear influence of one or more other processes. For the current data, we found that the depth of correlations depends on the spectral estimation method adopted. For example, the depth of correlation is of order 2 for the method of averaging across periodograms of adjacent segments of the single record and the method of smoothed periodogram of the entire single record and is of order one for the multi-taper method. However, the depth of correlations is unknown for the multivariate autoregressive model of order 2. The comparisons made between the results of the four spectral estimation methods mentioned previously, indicated that MVAR is not sensitive to rapid changes occurring in the signal such as the effect of the notch filter at 60Hz and a calibration signal at 47Hz, while the other three methods exhibited good sensitivity to these changes with different strengths of responses. Furthermore, the smoothed periodogram and the multi-taper methods persistently detect the notch filter effect at 60Hz in the ordinary estimated coherence curves, while the method of averaging across periodograms of adjacent segments of the single record does not.
|
13 |
Théorèmes limites pour des marches aléatoires markoviennes conditionnées à rester positives / Limit theorems for Markov walk conditioned to stay positiveLauvergnat, Ronan 08 September 2017 (has links)
On considère une marche aléatoire réelle dont les accroissements sont construits à partir d’une chaîne de Markov définie sur un espace abstrait. Sous des hypothèses de centrage de la marche et de décroissance rapide de la dépendance de la chaîne de Markov par rapport à son passé (de type trou spectral), on se propose d’étudier le premier instant pour lequel une telle marche markovienne passe dans les négatifs. Plus précisément, on établit que le comportement asymptotique de la probabilité de survie est inversement proportionnel à la racine carrée du temps. On étend également à nos modèles markoviens le résultat des marches aléatoires aux accroissements indépendants suivant : la loi asymptotique de la marche aléatoire renormalisée et conditionnée à rester positive est la loi de Rayleigh. Dans un deuxième temps, on restreint notre modèle aux cas où la chaîne de Markov définissant les accroissements de la marche aléatoire est à valeurs dans un espace d’états fini. Sous cette hypothèse et lorsque que la marche est dite non-lattice, on complète nos résultats par des théorèmes locaux pour la marche aléatoire conjointement avec le fait qu’elle soit restée positive. Enfin on applique ces développements aux processus de branchement soumis à un environnement aléatoire, lui-même défini à partir d’une chaîne de Markov à valeurs dans un espace d’états fini. On établit le comportement asymptotique de la probabilité de survie du processus dans le cas critique et les trois cas sous-critiques (fort, intermédiaire et faible) / We consider a real random walk whose increments are constructed by a Markov chain definedon an abstract space. We suppose that the random walk is centred and that the dependence of the Markov walk in its past decreases exponentially fast (due to the spectral gap property). We study the first time when the random walk exits the positive half-line and prove that the asymptotic behaviour of the survey probability is inversely proportional to the square root of the time. We extend also to our Markovian model the following result of random walks with independent increments: the asymptotic law of the random walk renormalized and conditioned to stay positive is the Rayleigh law. Subsequently, we restrict our model to the cases when the Markov chain defining the increments of the random walk takes its values on a finite state space. Under this assumption and the condition that the walk is non-lattice, we complete our results giving local theorems for the random walk conditioned to stay positive. Finally, we apply these developments to branching processes under a random environment defined by a Markov chain taking its values on a finite state space. We give the asymptotic behaviour of the survey probability of the process in the critical case and the three subcritical cases (strongly, intermediate and weakly).
|
14 |
Pricing outside barrier options when the monitoring of the barrier starts at a hitting timeMofokeng, Jacob Moletsane 02 1900 (has links)
This dissertation studies the pricing of Outside barrier call options, when their activation starts at a
hitting time. The pricing of Outside barrier options when their activation starts at time zero, and the
pricing of standard barrier options when their activation starts at a hitting time of a pre speci ed
barrier level, have been studied previously (see [21], [24]).
The new work that this dissertation will do is to price Outside barrier call options, where they will be
activated when the triggering asset crosses or hits a pre speci ed barrier level, and also the pricing of
Outside barrier call options where they will be activated when the triggering asset crosses or hits a
sequence of two pre specifed barrier levels. Closed form solutions are derived using Girsanov's theorem
and the re
ection principle. Existing results are derived from the new results, and properties of the new
results are illustrated numerically and discussed. / Mathematical Sciences / M. Sc. (Applied Mathematics)
|
15 |
Stochastic agent-based modelling for reality : dynamic discrete choice analysis with interactionTakama, Takeshi January 2005 (has links)
This D.Phil. thesis develops a new agent-based simulation model to improve the results of analysis, which solely uses discrete choice modelling, as well as to analyse the effects of a road user charging scheme for the Upper Derwent Valley in the Peak District National Park. The advantages of discrete choice analysis are well known. However, results with these conventional conventional approaches, which conduct analysis solely with discrete choice models, can be biased if interaction and learning effects are significant. The Minority Game, in which agents try to choose the option of the minority side, is an appropriate tool to deal with these problems. The situation in the Upper Derwent Valley can be explained with economic game theories and the Minority Game. The two approaches mutually help to analyse the situation in the Upper Derwent Valley leading to the development of a stochastic Minority Game. The stochastic Minority Game was tested with an online game (questionnaire), which was played 3,886 times by response in all around the world. The practical part of this thesis examines the components of the stochastic Minority Game with the data collected around the Upper Derwent Valley. The main data was collected using a stated preference survey. Overall, 700 questionnaires were distributed and 323 of them were returned (i.e. a return rate of 46.1 %). In the practical part, the agent-based model has four sub modules: 1) Multinomial mixed logit model for mode choice, 2) Binary logit model for parking location choice, 3) Markov queue model for parking network, and 4) the Minority Game for parking congestion and learning. This simulation model produces comprehensive outputs including mode choices, congestion levels, and user utilities. The results show that the road user charging scheme reduces car demand in the Upper Derwent Valley and ensures a reduction in congestion at the parking areas. The model also shows that an exemption will increase the utilities of elderly visitors without substantially sacrificing those of younger visitors. In conclusion, the simulation model demonstrated that oversimplification in conventional approaches solely using discrete choice models gave significant biases when real world problems were analysed.
|
16 |
Pricing outside barrier options when the monitoring of the barrier starts at a hitting timeMofokeng, Jacob Moletsane 02 1900 (has links)
This dissertation studies the pricing of Outside barrier call options, when their activation starts at a
hitting time. The pricing of Outside barrier options when their activation starts at time zero, and the
pricing of standard barrier options when their activation starts at a hitting time of a pre speci ed
barrier level, have been studied previously (see [21], [24]).
The new work that this dissertation will do is to price Outside barrier call options, where they will be
activated when the triggering asset crosses or hits a pre speci ed barrier level, and also the pricing of
Outside barrier call options where they will be activated when the triggering asset crosses or hits a
sequence of two pre specifed barrier levels. Closed form solutions are derived using Girsanov's theorem
and the re
ection principle. Existing results are derived from the new results, and properties of the new
results are illustrated numerically and discussed. / Mathematical Sciences / M. Sc. (Applied Mathematics)
|
17 |
Calcul fonctionnel non-anticipatif et applications en finance / Pathwise functional calculus and applications to continuous-time financeRiga, Candia 26 June 2015 (has links)
Cette thèse développe une approche trajectorielle pour la modélisation des marchés financiers en temps continu, sans faire appel à des hypothèses probabilistes ou à des modèles stochastiques. À l'aide du calcul fonctionnel non-anticipatif, nous identifions une classe spéciale de stratégies de trading que nous prouvons être auto-finançantes, selon une notion trajectorielle introduite dans cette thèse, et dont le gain peut être calculé trajectoire par trajectoire comme limite de sommes de Riemann. Avec ces outils, nous proposons un cadre analytique pour analyser la performance et la robustesse de stratégies de couverture dynamique de produits dérivés path-dependent sur en ensemble de scénarios. Ce cadre ne demande aucune hypothèse probabiliste sur la dynamique du processus sous-jacent. Il généralise donc les résultats précédents sur la robustesse de stratégies de couverture dans des modèles de diffusion. Nous obtenons une formule explicite pour l'erreur de couverture dans chaque scénario et nous fournissons des conditions suffisantes qui impliquent la robustesse de la couverture delta-neutre. Nous montrons que la robustesse peut être obtenue dans un ensemble ample de modèles de prix de martingale exponentielle de carré intégrable, avec une condition de convexité verticale sur le payoff. Nous remarquons que les discontinuités de la trajectoire de prix détériorent la performance de la couverture. Le dernier chapitre, indépendant du reste de la thèse, est une étude en collaboration avec Andrea Pascucci et Stefano Pagliarani, où nous proposons une nouvelle méthode pour l'approximation analytique dans des modèles à volatilité locale avec des sauts de type Lévy. / This thesis develops a mathematical framework for the analysis of continuous-time trading strategies which, in contrast to the classical setting of continuous-time finance, does not rely on stochastic integrals or other probabilistic notions.Using the `non-anticipative functional calculus', we first develop a pathwise definition of the gain process for a large class of continuous-time trading strategies which includes delta-hedging strategies, as well as a pathwise definition of the self-financing condition. Using these concepts, we propose a framework for analyzing the performance and robustness of delta-hedging strategies for path-dependent derivatives across a given set of scenarios. Our setting allows for general path-dependent payoffs and does not require any probabilistic assumption on the dynamics of the underlying asset, thereby extending previous results on robustness of hedging strategies in the setting of diffusion models. We obtain a pathwise formula for the hedging error for a general path-dependent derivative and provide sufficient conditions ensuring the robustness of the delta-hedge. We show in particular that robust hedges may be obtained in a large class of continuous exponential martingale models under a vertical convexity condition on the payoff functional. We also show that discontinuities in the underlying asset always deteriorate the hedging performance. These results are applied to the case of Asian options and barrier options. The last chapter, independent of the rest of the thesis, proposes a novel method, jointly developed with Andrea Pascucci and Stefano Pagliarani, for analytical approximations in local volatility models with Lévy jumps.
|
18 |
Algorithms for the resolution of stochastic control problems in high dimension by using probabilistic and max-plus methods / Algorithmes de résolution de problèmes de contrôle stochastique en grande dimension par une association de méthodes probabilistes et max-plus.Fodjo, Eric 13 July 2018 (has links)
Les problèmes de contrôle stochastique optimal à horizon fini forment une classe de problèmes de contrôle optimal où interviennent des processus stochastiques considérés sur un intervalle de temps borné. Tout comme beaucoup de problème de contrôle optimal, ces problèmes sont résolus en utilisant le principe de la programmation dynamique qui induit une équation aux dérivées partielles (EDP) appelée équation d'Hamilton-Jacobi-Bellman. Les méthodes basées sur la discrétisation de l’espace sous forme de grille, les méthodes probabilistes ou plus récemment les méthodes max-plus peuvent alors être utilisées pour résoudre cette équation. Cependant, le premier type de méthode est mis en défaut quand un espace à dimension grande est considéré à cause de la malédiction de la dimension tandis que le deuxième type de méthode ne permettait jusqu'ici que de résoudre des problèmes où la non linéarité de l'équation aux dérivées partielles par rapport à la Hessienne n'est pas trop forte. Quant au troisième type de méthode, il entraine une explosion de la complexité de la fonction valeur. Nous introduisons dans cette thèse deux nouveaux schémas probabilistes permettant d'agrandir la classe des problèmes pouvant être résolus par les méthodes probabilistes. L'une est adaptée aux EDP à coefficients bornés tandis que l'autre peut être appliqué aux EDP à coefficients bornés ou non bornés. Nous prouvons la convergence des deux schémas probabilistes et obtenons des estimées de l'erreur de convergence dans le cas d'EDP à coefficients bornés. Nous donnons également quelques résultats sur le comportement du deuxième schéma dans le cas d'EDP à coefficients non bornés. Ensuite, nous introduisons une méthode complètement nouvelle pour résoudre les problèmes de contrôle stochastique optimal à horizon fini que nous appelons la méthode max-plus probabiliste. Elle permet d'utiliser le caractère non linéaire des méthodes max-plus dans un contexte probabiliste tout en contrôlant la complexité de la fonction valeur. Une application au calcul du prix de sur-réplication d'une option dans un modèle de corrélation incertaine est donnée dans le cas d’un espace à dimension 2 et 5. / Stochastic optimal control problems with finite horizon are a class of optimal control problems where intervene stochastic processes in a bounded time. As many optimal control problems, they are often solved using a dynamic programming approach which results in a second order Partial Differential Equation (PDE) called the Hamilton-Jacobi-Bellman equation. Grid-based methods, probabilistic methods or more recently max-plus methods can be used then to solve this PDE. However, the first type of methods default in a space of high dimension because of the curse of dimensionality while the second type of methods allowed till now to solve only problems where the nonlinearity of the PDE with respect to the second order derivatives is not very high. As for the third type of method, it results in an explosion of the complexity of the value function. We introduce two new probabilistic schemes in order to enlarge the class of problems that can be solved with probabilistic methods. One is adapted to PDE with bounded coefficients while the other can be applied to PDE with bounded or unbounded coefficients. We prove the convergence of the two probabilistic scheme and obtain error estimates in the case of a PDE with bounded coefficients. We also give some results about the behavior of the second probabilistic scheme in the case of a PDE with unbounded coefficients. After that, we introduce a completely new type of method to solve stochastic optimal control problems with finite horizon that we call the max-plus probabilistic method. It allows to add the non linearity feature of max-plus methods to a probabilistic method while controlling the complexity of the value function. An application to the computation of the optimal super replication price of an option in an uncertain correlation model is given in a 5 dimensional space.
|
19 |
Discrétisation de processus à des temps d’arrêt et Quantification d'incertitude pour des algorithmes stochastiques / Discretization of processes at stopping times and Uncertainty quantification of stochastic approximation limitsStazhynski, Uladzislau 12 December 2018 (has links)
Cette thèse contient deux parties qui étudient deux sujets différents. Les Chapitres 1-4 sont consacrés aux problèmes de discrétisation de processus à des temps d’arrêt. Dans le Chapitre 1 on étudie l'erreur de discrétisation optimale pour des intégrales stochastiques par rapport à une semimartingale brownienne multidimensionnelle continue. Dans ce cadre on établit une borne inférieure trajectorielle pour la variation quadratique renormalisée de l'erreur. On fournit une suite de temps d’arrêt qui donne une discrétisation asymptotiquement optimale. Cette suite est définie comme temps de sortie d'ellipsoïdes aléatoires par la semimartingale. Par rapport aux résultats précédents on permet une classe de semimartingales assez large. On démontre qui la borne inférieure est exacte. Dans le Chapitre 2 on étudie la version adaptative au modèle de la discrétisation optimale d’intégrales stochastique. Dans le Chapitre 1 la construction de la stratégie optimale utilise la connaissance du coefficient de diffusion de la semimartingale considérée. Dans ce travail on établit une stratégie de discrétisation asymptotiquement optimale qui est adaptative au modèle et n'utilise pas aucune information sur le modèle. On démontre l'optimalité pour une classe de grilles de discrétisation assez générale basée sur les technique de noyau pour l'estimation adaptative. Dans le Chapitre 3 on étudie la convergence en loi des erreurs de discrétisation renormalisées de processus d’Itô pour une classe concrète et assez générale de grilles de discrétisation données par des temps d’arrêt. Les travaux précédents sur le sujet considèrent seulement le cas de dimension 1. En plus ils concentrent sur des cas particuliers des grilles, ou démontrent des résultats sous des hypothèses abstraites. Dans notre travail on donne explicitement la distribution limite sous une forme claire et simple, les résultats sont démontré dans le cas multidimensionnel pour le processus et pour l'erreur de discrétisation. Dans le Chapitre 4 on étudie le problème d'estimation paramétrique pour des processus de diffusion basée sur des observations à temps d’arrêt. Les travaux précédents sur le sujet considèrent que des temps d'observation déterministes, fortement prévisibles ou aléatoires indépendants du processus. Sous des hypothèses faibles on construit une suite d'estimateurs consistante pour une classe large de grilles d'observation données par des temps d’arrêt. On effectue une analyse asymptotique de l'erreur d'estimation. En outre, dans le cas du paramètre de dimension 1, pour toute suite d'estimateurs qui vérifie un TCL sans biais, on démontre une borne inférieure uniforme pour la variance asymptotique; on montre que cette borne est exacte. Les Chapitres 5-6 sont consacrés au problème de quantification d'incertitude pour des limites d'approximation stochastique. Dans le Chapitre 5 on analyse la quantification d'incertitude pour des limites d'approximation stochastique (SA). Dans notre cadre la limite est définie comme un zéro d'une fonction donnée par une espérance. Cette espérance est prise par rapport à une variable aléatoire pour laquelle le modèle est supposé de dépendre d'un paramètre incertain. On considère la limite de SA comme une fonction de cette paramètre. On introduit un algorithme qui s'appelle USA (Uncertainty for SA). C'est une procédure en dimension croissante pour calculer les coefficients de base d'expansion de chaos de cette fonction dans une base d'un espace de Hilbert bien choisi. La convergence de USA dans cet espace de Hilbert est démontré. Dans le Chapitre 6 on analyse le taux de convergence dans L2 de l'algorithme USA développé dans le Chapitre 5. L'analyse est non trivial à cause de la dimension infinie de la procédure. Le taux obtenu dépend du modèle et des paramètres utilisés dans l'algorithme USA. Sa connaissance permet d'optimiser la vitesse de croissance de la dimension dans USA. / This thesis consists of two parts which study two separate subjects. Chapters 1-4 are devoted to the problem of processes discretization at stopping times. In Chapter 1 we study the optimal discretization error of stochastic integrals, driven by a multidimensional continuous Brownian semimartingale. In this setting we establish a path wise lower bound for the renormalized quadratic variation of the error and we provide a sequence of discretization stopping times, which is asymptotically optimal. The latter is defined as hitting times of random ellipsoids by the semimartingale at hand. In comparison with previous available results, we allow a quite large class of semimartingales and we prove that the asymptotic lower bound is attainable. In Chapter 2 we study the model-adaptive optimal discretization error of stochastic integrals. In Chapter 1 the construction of the optimal strategy involved the knowledge about the diffusion coefficient of the semimartingale under study. In this work we provide a model-adaptive asymptotically optimal discretization strategy that does not require any prior knowledge about the model. In Chapter 3 we study the convergence in distribution of renormalized discretization errors of Ito processes for a concrete general class of random discretization grids given by stopping times. Previous works on the subject only treat the case of dimension 1. Moreover they either focus on particular cases of grids, or provide results under quite abstract assumptions with implicitly specified limit distribution. At the contrast we provide explicitly the limit distribution in a tractable form in terms of the underlying model. The results hold both for multidimensional processes and general multidimensional error terms. In Chapter 4 we study the problem of parametric inference for diffusions based on observations at random stopping times. We work in the asymptotic framework of high frequency data over a fixed horizon. Previous works on the subject consider only deterministic, strongly predictable or random, independent of the process, observation times, and do not cover our setting. Under mild assumptions we construct a consistent sequence of estimators, for a large class of stopping time observation grids. Further we carry out the asymptotic analysis of the estimation error and establish a Central Limit Theorem (CLT) with a mixed Gaussian limit. In addition, in the case of a 1-dimensional parameter, for any sequence of estimators verifying CLT conditions without bias, we prove a uniform a.s. lower bound on the asymptotic variance, and show that this bound is sharp. In Chapters 5-6 we study the problem of uncertainty quantification for stochastic approximation limits. In Chapter 5 we analyze the uncertainty quantification for the limit of a Stochastic Approximation (SA) algorithm. In our setup, this limit is defined as the zero of a function given by an expectation. The expectation is taken w.r.t. a random variable for which the model is assumed to depend on an uncertain parameter. We consider the SA limit as a function of this parameter. We introduce the so-called Uncertainty for SA (USA) algorithm, an SA algorithm in increasing dimension for computing the basis coefficients of a chaos expansion of this function on an orthogonal basis of a suitable Hilbert space. The almost-sure and Lp convergences of USA, in the Hilbert space, are established under mild, tractable conditions. In Chapter 6 we analyse the L2-convergence rate of the USA algorithm designed in Chapter 5.The analysis is non-trivial due to infinite dimensionality of the procedure. Moreover, our setting is not covered by the previous works on infinite dimensional SA. The obtained rate depends non-trivially on the model and the design parameters of the algorithm. Its knowledge enables optimization of the dimension growth speed in the USA algorithm, which is the key factor of its efficient performance.
|
20 |
Non-Convex Optimization for Latent Data Models : Algorithms, Analysis and Applications / Optimisation Non Convexe pour Modèles à Données Latentes : Algorithmes, Analyse et ApplicationsKarimi, Belhal 19 September 2019 (has links)
De nombreux problèmes en Apprentissage Statistique consistent à minimiser une fonction non convexe et non lisse définie sur un espace euclidien. Par exemple, les problèmes de maximisation de la vraisemblance et la minimisation du risque empirique en font partie.Les algorithmes d'optimisation utilisés pour résoudre ce genre de problèmes ont été largement étudié pour des fonctions convexes et grandement utilisés en pratique.Cependant, l'accrudescence du nombre d'observation dans l'évaluation de ce risque empirique ajoutée à l'utilisation de fonctions de perte de plus en plus sophistiquées représentent des obstacles.Ces obstacles requièrent d'améliorer les algorithmes existants avec des mis à jour moins coûteuses, idéalement indépendantes du nombre d'observations, et d'en garantir le comportement théorique sous des hypothèses moins restrictives, telles que la non convexité de la fonction à optimiser.Dans ce manuscrit de thèse, nous nous intéressons à la minimisation de fonctions objectives pour des modèles à données latentes, ie, lorsque les données sont partiellement observées ce qui inclut le sens conventionnel des données manquantes mais est un terme plus général que cela.Dans une première partie, nous considérons la minimisation d'une fonction (possiblement) non convexe et non lisse en utilisant des mises à jour incrémentales et en ligne. Nous proposons et analysons plusieurs algorithmes à travers quelques applications.Dans une seconde partie, nous nous concentrons sur le problème de maximisation de vraisemblance non convexe en ayant recourt à l'algorithme EM et ses variantes stochastiques. Nous en analysons plusieurs versions rapides et moins coûteuses et nous proposons deux nouveaux algorithmes du type EM dans le but d'accélérer la convergence des paramètres estimés. / Many problems in machine learning pertain to tackling the minimization of a possibly non-convex and non-smooth function defined on a Many problems in machine learning pertain to tackling the minimization of a possibly non-convex and non-smooth function defined on a Euclidean space.Examples include topic models, neural networks or sparse logistic regression.Optimization methods, used to solve those problems, have been widely studied in the literature for convex objective functions and are extensively used in practice.However, recent breakthroughs in statistical modeling, such as deep learning, coupled with an explosion of data samples, require improvements of non-convex optimization procedure for large datasets.This thesis is an attempt to address those two challenges by developing algorithms with cheaper updates, ideally independent of the number of samples, and improving the theoretical understanding of non-convex optimization that remains rather limited.In this manuscript, we are interested in the minimization of such objective functions for latent data models, ie, when the data is partially observed which includes the conventional sense of missing data but is much broader than that.In the first part, we consider the minimization of a (possibly) non-convex and non-smooth objective function using incremental and online updates.To that end, we propose several algorithms exploiting the latent structure to efficiently optimize the objective and illustrate our findings with numerous applications.In the second part, we focus on the maximization of non-convex likelihood using the EM algorithm and its stochastic variants.We analyze several faster and cheaper algorithms and propose two new variants aiming at speeding the convergence of the estimated parameters.
|
Page generated in 0.0243 seconds