1 |
Intra-hour wind power variability assessment using the conditional range metric: quantification, forecasting and applications. Boutsika, Thekla, 09 September 2013
The research presented herein concentrates on the quantification, assessment and forecasting of intra-hour wind power variability. Wind power is intrinsically variable and, due to the increase in wind power penetration levels, the level of intra-hour wind power variability is expected to increase as well. Existing metrics used in wind integration studies fail to efficiently capture intra-hour wind power variation. As a result, this can lead to an underestimation of intra-hour wind power variability with adverse effects on power systems, especially their reliability and economics. One major research focus in this dissertation is to develop a novel variability metric which can effectively quantify intra-hour wind power variability. The proposed metric, termed the conditional range metric (CRM), quantifies wind power variability using the range of wind power output over a time period. The metric is termed conditional because the range of wind power output is conditioned on the time interval length k and on the average wind power production l_j over the given time interval. Using statistical analysis and optimization approaches, a computational algorithm to obtain a unique p-th quantile of the conditional range metric is given, turning the proposed conditional range metric into a probabilistic intra-hour wind power variability metric. The probabilistic conditional range metric CRM_{k, l_j, p} assists power system operators and wind farm owners in decision making under uncertainty, since decisions involving wind power variability can be made based on the willingness to accept a certain level of risk α = 1 - p. An extensive performance analysis of the conditional range metric on real-world wind power and wind speed data reveals how certain variables affect intra-hour wind power variability. Wind power variability over a time frame is found to increase with increasing time frame size and decreasing wind farm size, and is highest at mid-range production levels. Moreover, wind turbines connected to the grid through converters exhibit lower wind power variability than simple induction generators of the same size, while wind power variability is also found to decrease slightly with increasing wind turbine size. These results can lead to improvements of existing wind power management techniques or to the definition of new ones. Moreover, the comparison of the conditional range metric to the commonly used step-change statistics reveals that, on average, the conditional range metric can accommodate intra-hour wind power variations for an additional 15% of hours within a given year, significantly benefiting power system reliability. The other major research focus in this dissertation is on providing intra-hour wind power variability forecasts. Wind power variability forecasts use p-th CRM quantile estimates to construct probabilistic intervals within which future wind power output will lie, conditioned on the forecasted average wind power production. One static and two time-adaptive methods are used to obtain p-th CRM quantile estimates. All methods produce quantile estimates of acceptable reliability, with average expected deviations from nominal proportions close to 1%. Wind power variability forecasts can serve as joint-chance constraints in stochastic optimization problems, which opens the door to numerous applications of the conditional range metric.
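As a rough illustration of how such a range-based metric could be computed, the sketch below bins sliding windows of a wind power series by their average production level and takes the empirical p-th quantile of the range within each bin. The window length, the number of production bins, the toy data and the function name crm_quantile are illustrative assumptions, not the dissertation's implementation.

```python
import numpy as np

def crm_quantile(power, k, p=0.95, n_bins=10, capacity=1.0):
    """Empirical p-th quantile of the power range over sliding windows of
    length k, conditioned on the window's average production level (binned).
    power    : 1-D array of wind power output (e.g. 1-minute samples)
    k        : window length in samples (the intra-hour time frame)
    p        : quantile level (accepted risk alpha = 1 - p)
    n_bins   : number of average-production bins l_j
    capacity : installed capacity used to define the production bins
    """
    power = np.asarray(power, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(power, k)
    ranges = windows.max(axis=1) - windows.min(axis=1)    # range per window
    means = windows.mean(axis=1)                          # average production per window
    edges = np.linspace(0.0, capacity, n_bins + 1)
    labels = np.clip(np.digitize(means, edges) - 1, 0, n_bins - 1)
    return {j: np.quantile(ranges[labels == j], p)
            for j in range(n_bins) if np.any(labels == j)}

# Toy example: 1-minute data, 10-minute windows, 95th percentile per production bin
rng = np.random.default_rng(0)
toy_power = np.clip(0.5 + np.cumsum(rng.normal(0, 0.01, 10_000)), 0, 1)
print(crm_quantile(toy_power, k=10, p=0.95))
```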
A practical example application uses the conditional range metric to estimate the size of an energy storage system (ESS). Using a probabilistic forecast of wind power hourly averages and historical data on intra-hour wind power variability, the proposed methodology estimates the size of an ESS which minimizes deviations from the forecasted hourly average. The methodology is evaluated using real-world wind power data. When the estimated ESS capacities are compared to the ESS capacities obtained from the actual data, they exhibit coverage rates which are very close to the nominal ones, with an average absolute deviation of less than 1.5%.
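For intuition only, a toy sizing rule in the spirit of the idea above might bound the storage rating by a fraction of the conditional range; the dissertation's optimisation-based methodology is not reproduced here, and the figures and the function name are made up.

```python
def ess_power_rating(crm_range_mw, margin=0.5):
    """Toy rule of thumb, not the dissertation's method: if intra-hour output
    stays within a conditional range `crm_range_mw` around the hourly average
    with probability p, a device able to absorb or inject up to
    margin * crm_range_mw can, in this idealised view, hold the combined
    output at the forecasted hourly average for such hours."""
    return margin * crm_range_mw

# e.g. a 95th-percentile intra-hour range of 12 MW suggests roughly a +/- 6 MW rating
print(ess_power_rating(12.0))
```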
|
2 |
Sequential Analysis of Quantiles and Probability Distributions by Replicated Simulations. Eickhoff, Mirko, January 2007
Discrete event simulation is well known to be a powerful approach for investigating the behaviour of complex dynamic stochastic systems, especially when the system is not analytically tractable. The estimation of mean values has traditionally been the main goal of simulation output analysis, even though it provides limited information about the analysed system's performance. Because of its complexity, quantile analysis is applied less frequently, despite its ability to provide much deeper insight into the system of interest. A set of quantiles can be used to approximate a cumulative distribution function, providing fuller information about a given performance characteristic of the simulated system. This thesis employs the distributed computing power of multiple computers by proposing new methods for the sequential and automated analysis of quantile-based performance measures of such dynamic systems. These new methods estimate steady-state quantiles from replicated simulations run on clusters of workstations acting as simulation engines. A general contribution to the problem of the length of the initial transient is made by considering steady state in terms of the underlying probability distribution. Our research focuses on sequential and automated methods that guarantee a satisfactory level of confidence in the final results. The correctness of the proposed methods has been studied exhaustively by means of sequential coverage analysis. Quantile estimates are used to investigate underlying probability distributions. We demonstrate that synchronous replications greatly assist this kind of analysis.
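A minimal sketch of the replication idea, assuming a user-supplied simulator: one post-warm-up observation is taken from each independent replication, giving approximately i.i.d. draws from the steady-state distribution, from which a quantile estimate and a distribution-free confidence interval follow. The toy queueing simulator, the warm-up length and the confidence construction are illustrative, not the thesis's sequential procedure.

```python
import numpy as np
from scipy import stats

def steady_state_quantile(run_replication, q=0.9, n_rep=200, warmup=500, seed=0):
    """Estimate a steady-state q-quantile by replication: one post-warm-up
    observation is taken from each independent replication, giving
    (approximately) i.i.d. draws from the steady-state distribution."""
    rng = np.random.default_rng(seed)
    obs = np.sort([run_replication(rng, warmup + 1)[-1] for _ in range(n_rep)])
    q_hat = np.quantile(obs, q)
    # Distribution-free confidence interval from binomial order statistics
    lo = int(stats.binom.ppf(0.025, n_rep, q))
    hi = int(stats.binom.ppf(0.975, n_rep, q))
    return q_hat, (obs[max(lo - 1, 0)], obs[min(hi, n_rep - 1)])

def toy_queue(rng, length, rho=0.8):
    """Illustrative M/M/1-type waiting-time process (Lindley recursion)."""
    w = np.zeros(length)
    for t in range(1, length):
        w[t] = max(0.0, w[t - 1] + rng.exponential(rho) - rng.exponential(1.0))
    return w

print(steady_state_quantile(toy_queue, q=0.9))
```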
|
3 |
Essays on Machine Learning in Risk Management, Option Pricing, and Insurance Economics. Fritzsch, Simon, 05 July 2022
Dealing with uncertainty is at the heart of financial risk management and asset pricing. This cumulative dissertation consists of four independent research papers that study various aspects of uncertainty, ranging from estimation and model risk, through the volatility risk premium, to the measurement of unobservable variables.
In the first paper, a non-parametric estimator of conditional quantiles is proposed that builds on methods from the machine learning literature. The so-called leveraging estimator is discussed in detail and analyzed in an extensive simulation study. Subsequently, the estimator is used to quantify the estimation risk of Value-at-Risk and Expected Shortfall models. The results suggest that there are significant differences in the estimation risk of various GARCH-type models, while, in general, estimation risk is higher for the Expected Shortfall than for the Value-at-Risk.
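The leveraging estimator itself is not reproduced here; the sketch below shows only the quantization idea it builds on: estimate a conditional quantile of Y given X = x from the responses whose covariates fall in the same quantization cell as x. K-means is used as a stand-in for optimal quantization, and the data, parameters and function name are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def quantized_conditional_quantile(x_train, y_train, x0, p=0.05, n_cells=20, seed=0):
    """Conditional p-quantile of Y given X = x0 via quantization of the
    covariate space: the empirical quantile of the responses whose covariates
    fall in the same (k-means) cell as x0. A simplified stand-in for the
    optimal-quantization step, without the leveraging refinement."""
    km = KMeans(n_clusters=n_cells, n_init=10, random_state=seed).fit(x_train)
    cell = km.predict(np.atleast_2d(x0))[0]
    return np.quantile(y_train[km.labels_ == cell], p)

# Toy example: a 5% conditional quantile (a VaR-like quantity) of returns,
# given yesterday's squared return
rng = np.random.default_rng(1)
vol = 0.01 + 0.1 * np.abs(rng.normal(size=5000))
ret = vol * rng.normal(size=5000)
X, Y = (ret[:-1] ** 2).reshape(-1, 1), ret[1:]
print(quantized_conditional_quantile(X, Y, x0=[0.0004], p=0.05))
```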
In the second paper, the leveraging estimator is applied to realized and implied volatility estimates of US stock options to empirically test if the volatility risk premium is priced in the cross-section of option returns. A trading strategy that is long (short) in a portfolio with low (high) implied volatility conditional on the realized volatility yields average monthly returns that are economically and statistically significant.
The third paper investigates the model risk of multivariate Value-at-Risk and Expected Shortfall models in a comprehensive empirical study on copula GARCH models. The paper finds that model risk is economically significant, that it is especially high during periods of financial turmoil, and that it is mainly driven by the choice of the copula.
In the fourth paper, the relation between digitalization and the market value of US insurers is analyzed. To this end, a text-based measure of digitalization building on Latent Dirichlet Allocation is proposed. It is shown that a rise in digitalization efforts is associated with an increase in market valuations.
1 Introduction
1.1 Motivation
1.2 Conditional quantile estimation via leveraging optimal quantization
1.3 Cross-section of option returns and the volatility risk premium
1.4 Marginals versus copulas: Which account for more model risk in multivariate risk forecasting?
1.5 Estimating the relation between digitalization and the market value of insurers
2 Conditional Quantile Estimation via Leveraging Optimal Quantization
2.1 Introduction
2.2 Optimal quantization
2.3 Conditional quantiles through leveraging optimal quantization
2.4 The hyperparameters N, λ, and γ
2.5 Simulation study
2.6 Empirical application
2.7 Conclusion
3 Cross-Section of Option Returns and the Volatility Risk Premium
3.1 Introduction
3.2 Capturing the volatility risk premium
3.3 Empirical study
3.4 Robustness checks
3.5 Conclusion
4 Marginals Versus Copulas: Which Account for More Model Risk in Multivariate Risk Forecasting?
4.1 Introduction
4.2 Market risk models and model risk
4.3 Data
4.4 Analysis of model risk
4.5 Model risk for models in the model confidence set
4.6 Model risk and backtesting
4.7 Conclusion
5 Estimating the Relation Between Digitalization and the Market Value of Insurers
5.1 Introduction
5.2 Measuring digitalization using LDA
5.3 Financial data & empirical strategy
5.4 Estimation results
5.5 Conclusion
|
4 |
Estimation of extrapolation limits based on extreme value distributions: application to environmental data. Albert, Clément, 17 December 2018
This thesis is set within the framework of extreme value statistics and makes three main contributions. Extreme quantile estimation proceeds in two steps: first, a quantile approximation based on extreme value theory is proposed; second, the unknown quantities of this approximation are estimated from the largest observations in the dataset and plugged in, yielding an extreme quantile estimator. This decomposition produces two errors of different natures: a systematic model error, referred to as the approximation or extrapolation error, and a random estimation error. The first contribution of this thesis is a theoretical study of this extrapolation error, which is still poorly understood. The study is carried out for two different kinds of estimators, both particular cases of the generalized Pareto approximation: the Exponential Tail estimator, dedicated to the Gumbel maximum domain of attraction, and the Weissman estimator, dedicated to the Fréchet one. It is shown that the extrapolation error can be interpreted as the remainder of a first-order Taylor expansion. Necessary and sufficient conditions are then provided such that this error tends to zero as the sample size increases. Interestingly, in the case of the Exponential Tail estimator, these conditions lead to a subdivision of the Gumbel maximum domain of attraction into three distinct subsets. In contrast, the extrapolation error associated with the Weissman estimator behaves uniformly over the whole Fréchet maximum domain of attraction. First-order equivalents of the extrapolation error are then derived and their accuracy is illustrated numerically. The second contribution is a new extreme quantile estimator. The problem is addressed in the framework of the so-called "log-generalized Weibull tail" model, in which the logarithm of the inverse cumulative hazard rate is assumed to be of extended regular variation. After a discussion of the consequences of this assumption, a new estimator of extreme quantiles based on this model is proposed. Its asymptotic normality is established and its practical behaviour is illustrated on both real and simulated data. The third contribution of this thesis is a set of tools allowing the extrapolation limits of a dataset to be quantified in practice. To this end, estimators of the extrapolation errors associated with the Exponential Tail and Weissman approximations are proposed. After assessing the performance of these estimators on simulated data, the extrapolation limits associated with two real datasets of daily environmental measurements are estimated. Depending on the climatic hazard considered, these limits are shown to be more or less stringent.
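For reference, the Weissman estimator mentioned above has a compact textbook form: with the Hill estimator of the tail index computed from the k largest observations, a quantile of order p far beyond the sample is obtained by extrapolating from the (k+1)-th largest value. The sketch below is that textbook version, not the thesis's code; the choice of k and the toy data are illustrative.

```python
import numpy as np

def weissman_quantile(sample, p, k):
    """Weissman extreme-quantile estimator (Fréchet domain): extrapolate beyond
    the sample using the k largest order statistics and the Hill estimator of
    the tail index gamma."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    threshold = x[n - k - 1]                               # (k+1)-th largest observation
    gamma_hill = np.mean(np.log(x[n - k:]) - np.log(threshold))
    return threshold * (k / (n * p)) ** gamma_hill

# Sanity check on standard Pareto data with tail index 1/2: true quantile is p**(-1/2)
rng = np.random.default_rng(0)
data = rng.pareto(2.0, size=5000) + 1.0
p = 1e-4                                                   # far beyond 1/n = 2e-4
print(weissman_quantile(data, p, k=200), p ** (-0.5))
```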
|
5 |
Contribution to statistical inference for the gamma and inverse Gaussian distributions using the empirical moment generating function. Καλλιώρας, Αθανάσιος Γ., 01 September 2008
The subject of the present dissertation is the investigation of statistical inference procedures for fitting and testing the gamma distribution and the inverse Gaussian distribution on data with positive skewness. These distributions are widely used in reliability analysis and lifetime models, as well as in other applications.
In the beginning, we describe alternative methods of statistical inference for the two- and three-parameter families of gamma and inverse Gaussian distributions. Then, we examine methods for estimating the parameters of the two-parameter gamma distribution using the empirical moment generating function. Estimation procedures, such as the method of mixed moments and the method of generalized least squares, are applied and compared with maximum likelihood through Monte Carlo simulations. We also investigate goodness-of-fit tests for the two-parameter gamma distribution. These tests include the classical tests and a test based on the empirical moment generating function. Using Monte Carlo simulations, we compare the actual significance level of the tests and their power against skewed-to-the-right alternatives. We apply the goodness-of-fit tests for gamma distributions to real-life data which have been examined earlier by other researchers. For the three-parameter gamma distribution we apply only the test based on the empirical moment generating function, since there are no classical tests using the empirical distribution function.
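As a hedged illustration of estimation based on the empirical moment generating function, the sketch below matches the empirical MGF M_n(t) = (1/n) Σ exp(t X_i), evaluated on a grid of negative t values, to the two-parameter gamma MGF (1 - scale·t)^(-shape) by least squares. It is a simplified stand-in for the mixed-moments and generalized least squares estimators studied in the dissertation; the grid, starting values and function names are arbitrary choices.

```python
import numpy as np
from scipy.optimize import least_squares

def emgf(x, t):
    """Empirical moment generating function M_n(t) = mean(exp(t * X))."""
    return np.exp(np.outer(t, x)).mean(axis=1)

def fit_gamma_emgf(x, n_grid=20):
    """Least-squares fit of the two-parameter gamma MGF (1 - scale*t)^(-shape)
    to the empirical MGF on a grid of negative t values (where the MGF always
    exists). A simplified stand-in for the mixed-moments / generalized
    least-squares estimators; grid and starting values are arbitrary."""
    x = np.asarray(x, dtype=float)
    t = -np.linspace(0.05, 1.0, n_grid) / x.mean()
    m_hat = emgf(x, t)
    resid = lambda th: (1.0 - th[1] * t) ** (-th[0]) - m_hat
    start = (x.mean() ** 2 / x.var(), x.var() / x.mean())   # moment estimates
    fit = least_squares(resid, x0=start, bounds=([1e-8, 1e-8], [np.inf, np.inf]))
    return dict(zip(("shape", "scale"), fit.x))

rng = np.random.default_rng(0)
print(fit_gamma_emgf(rng.gamma(shape=2.5, scale=1.7, size=2000)))
```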
Finally, we estimate quantiles of the inverse Gaussian distribution. We first estimate quantiles for the three-parameter distribution and then apply two procedures that estimate quantiles for the two-parameter distribution. Quantile estimation for each family of distributions uses two procedures for the intermediate estimation of the distribution parameters. The procedures are compared with respect to the normalized mean square error and the relative bias using simulations.
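For the quantile estimation step, a quick way to reproduce the flavour of such a comparison is to fit the inverse Gaussian with and without a fixed location (threshold) parameter and read off upper quantiles. The sketch uses scipy's generic maximum likelihood fit rather than the intermediate estimation procedures compared in the dissertation, and the simulated data are illustrative.

```python
import numpy as np
from scipy import stats

# Simulated positively skewed data
data = stats.invgauss.rvs(mu=0.8, loc=0.0, scale=3.0, size=2000, random_state=0)

# Two-parameter fit (location/threshold fixed at zero) and three-parameter fit
mu2, loc2, scale2 = stats.invgauss.fit(data, floc=0.0)
mu3, loc3, scale3 = stats.invgauss.fit(data)

for p in (0.90, 0.95, 0.99):
    print(p,
          stats.invgauss.ppf(p, mu2, loc=loc2, scale=scale2),
          stats.invgauss.ppf(p, mu3, loc=loc3, scale=scale3))
```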
|
6 |
Optimization Algorithms for Deterministic, Stochastic and Reinforcement Learning Settings. Joseph, Ajin George, January 2017
Optimization is a very important field with diverse applications in the physical, social and biological sciences and in various areas of engineering. It appears widely in machine learning, information retrieval, regression, estimation, operations research and a wide variety of computing domains. The subject is studied deeply both theoretically and experimentally, and several algorithms are available in the literature. These algorithms, which can be executed (sequentially or concurrently) on a computing machine, explore the space of input parameters to seek high-quality solutions to the optimization problem, with the search mostly guided by certain structural properties of the objective function. In certain situations, the setting might additionally demand the absolute optimum or solutions close to it, which makes the task even more challenging.
In this thesis, we propose an optimization algorithm which is "gradient-free", i.e., it does not employ any knowledge of the gradient or higher-order derivatives of the objective function, but rather utilizes the objective function values themselves to steer the search. The proposed algorithm is particularly effective in a black-box setting, where a closed-form expression of the objective function is unavailable and the gradient or higher-order derivatives are hard to compute or estimate. Our algorithm is inspired by the well-known cross entropy (CE) method. The CE method is a model-based search method for solving continuous/discrete multi-extremal optimization problems in which the objective function has minimal structure. The proposed method searches the statistical manifold of parameters identifying the probability distribution/model defined over the input space, seeking the degenerate distribution concentrated on the global optima (assumed to be finite in number). In the early part of the thesis, we propose a novel stochastic approximation version of the CE method for the unconstrained optimization problem, where the objective function is real-valued and deterministic. The basis of the algorithm is a stochastic process of model parameters which depends probabilistically on the past history: all previous samples obtained in the process up to the current instant are reused through discounted averaging. This approach can save on the overall computational and storage cost. Our algorithm is incremental in nature and possesses attractive features such as stability, computational and storage efficiency, and better accuracy. We further investigate, both theoretically and empirically, the asymptotic behaviour of the algorithm and find that the proposed algorithm exhibits global optimum convergence for a particular class of objective functions.
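For context, a compact version of the classic cross-entropy method that this work builds on is sketched below: candidates are sampled from a Gaussian model, the elite fraction with the best objective values is kept, and the model parameters are moved towards the elites. The fixed smoothing step stands in for the dissertation's stochastic-approximation update with discounted reuse of past samples; all settings and the toy objective are illustrative.

```python
import numpy as np

def cross_entropy_minimize(f, dim, n_samples=200, elite_frac=0.1, n_iter=100,
                           smoothing=0.7, seed=0):
    """Classic Gaussian cross-entropy minimisation: sample candidates from
    N(mu, diag(sigma^2)), keep the elite fraction with the lowest objective
    values, and move the sampling distribution towards the elites."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), 5.0 * np.ones(dim)
    n_elite = max(1, int(elite_frac * n_samples))
    for _ in range(n_iter):
        x = rng.normal(mu, sigma, size=(n_samples, dim))
        elite = x[np.argsort([f(xi) for xi in x])[:n_elite]]
        # Fixed smoothed update; the dissertation's variant instead uses a
        # stochastic-approximation update that reuses all past samples
        # through discounted averaging.
        mu = smoothing * elite.mean(axis=0) + (1 - smoothing) * mu
        sigma = smoothing * elite.std(axis=0) + (1 - smoothing) * sigma
    return mu

# Toy multi-extremal objective (Rastrigin), global minimum at the origin
rastrigin = lambda x: 10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))
print(cross_entropy_minimize(rastrigin, dim=2))
```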
Further, we extend the algorithm to solve the simulation/stochastic optimization problem. In stochastic optimization, the objective function possesses a stochastic characteristic, and the underlying probability distribution is in most cases hard to comprehend and quantify. This yields a more challenging optimization problem, whose difficulty stems primarily from the hardness of computing the objective function values for various input parameters with absolute certainty. In this case, one can only hope to obtain noise-corrupted objective function values for various input parameters. Settings of this kind can be found in scenarios where the objective function is evaluated using a continuously evolving dynamical system or through a simulation. We propose a multi-timescale stochastic approximation algorithm, where we integrate an additional timescale to accommodate the noisy measurements and asymptotically attenuate the effects of the measurement noise. We find that if the objective function and the noise involved in the measurements are well behaved and the timescales are compatible, then our algorithm can generate high-quality solutions.
In the later part of the thesis, we propose algorithms for reinforcement learning/Markov decision processes (MDPs) using the optimization techniques developed in the earlier chapters. An MDP can be considered a generalized framework for modelling planning under uncertainty. We provide a novel algorithm for the prediction problem in reinforcement learning, i.e., estimating the value function of a given stationary policy of a model-free MDP (with large state and action spaces) using a linear function approximation architecture. Here, the value function is defined as the long-run average of the discounted transition costs. The resource requirements of the proposed method, in terms of computational and storage cost, scale quadratically in the size of the feature set. The algorithm is an adaptation of the multi-timescale variant of the CE method proposed in the earlier part of the thesis for simulation optimization. We also provide both theoretical and empirical evidence to corroborate the credibility and effectiveness of the approach.
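The dissertation's prediction algorithm is cross-entropy based; as a point of reference for the same task (estimating the value function of a fixed policy with a linear architecture from a single trajectory), the sketch below shows the standard TD(0) baseline on a toy chain. It is not the proposed method, and the chain, features and step size are illustrative.

```python
import numpy as np

def td0_linear(trajectory, features, dim, gamma=0.95, alpha=0.01):
    """TD(0) with linear function approximation, V(s) ~ features(s) @ w,
    learned from a single trajectory of (state, cost, next_state) tuples.
    Shown only as a standard baseline for the prediction problem; the
    dissertation's own algorithm is a cross-entropy-based alternative."""
    w = np.zeros(dim)
    for s, cost, s_next in trajectory:
        phi, phi_next = features(s), features(s_next)
        td_error = cost + gamma * (phi_next @ w) - phi @ w
        w += alpha * td_error * phi
    return w

# Toy 5-state random walk with one-hot features; a cost of 1 is incurred
# whenever the chain is in state 0.
n_states = 5
features = lambda s: np.eye(n_states)[s]
rng = np.random.default_rng(0)
traj, s = [], 2
for _ in range(20_000):
    s_next = (s + rng.choice([-1, 1])) % n_states
    traj.append((s, float(s == 0), s_next))
    s = s_next
print(td0_linear(traj, features, n_states))
```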
In the final part of the thesis, we consider a modified version of the control problem in a model-free MDP with large state and action spaces. The control problem most commonly addressed in the literature is to find an optimal policy which maximizes the value function, i.e., the long-run average of the discounted transition payoffs. Contemporary methods also presume access to a generative model/simulator of the MDP, with the hidden premise that observations of the system behaviour, in the form of sample trajectories, can easily be obtained from the model. In this thesis, we consider a modified version in which the cost function to be optimized is a real-valued performance function (possibly non-convex) of the value function. Additionally, the optimal policy has to be sought without presuming access to the generative model. We propose a stochastic approximation algorithm for this particular control problem. The only information presupposed to be available to the algorithm is a sample trajectory generated using an a priori chosen behaviour policy. The algorithm is data (sample trajectory) efficient, stable and robust, as well as computationally and storage efficient. We provide a proof of convergence of our algorithm to a high-performing policy relative to the behaviour policy.
|
7 |
Bootstrap variance estimator for a quantile estimator in the finite population context [Estimateur bootstrap de la variance d'un estimateur de quantile en contexte de population finie]. McNealis, Vanessa, 12 1900
This thesis introduces smoothed pseudo-population bootstrap methods for the purposes of variance estimation and the construction of confidence intervals for finite-population quantiles. In an i.i.d. context, Hall et al. (1989) have shown that resampling from a smoothed estimate of the distribution function, instead of the usual empirical distribution function, can improve the convergence rate of the bootstrap variance estimator of a sample quantile. We extend the smoothed bootstrap to the survey sampling framework by implementing it in pseudo-population bootstrap methods. Given a kernel function and a bandwidth, this consists of smoothing the pseudo-population from which bootstrap samples are drawn using the original sampling design. Two designs are discussed, namely simple random sampling without replacement and Poisson sampling. The implementation of the proposed algorithms requires the specification of the bandwidth. To do so, we develop a plug-in selection method along with grid-search selection methods based on bootstrap estimates of two performance metrics. We present the results of a simulation study which provide empirical evidence that the smoothed approach is more efficient than the standard approach for estimating the variance of a quantile estimator, together with mixed results regarding confidence intervals.
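A minimal sketch of the smoothed pseudo-population bootstrap under simple random sampling without replacement, under several simplifying assumptions (the population size is a multiple of the sample size, a Gaussian kernel, and a rule-of-thumb bandwidth rather than the selection methods developed in the thesis): the sample is replicated into a pseudo-population, Gaussian noise with bandwidth h is added as the smoothing step, and bootstrap samples are redrawn by SRSWOR to estimate the variance of the quantile estimator.

```python
import numpy as np

def smoothed_pseudo_pop_bootstrap(sample, N, q=0.5, h=None, n_boot=1000, seed=0):
    """Bootstrap variance of a finite-population quantile estimator under
    SRSWOR via a smoothed pseudo-population (simplified: N is assumed to be a
    multiple of n, and a rule-of-thumb bandwidth replaces the thesis's
    bandwidth selection methods)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    pseudo_pop = np.repeat(sample, N // n)             # replicate units to size ~N
    if h is None:
        h = 1.06 * sample.std() * n ** (-1 / 5)        # rough Gaussian-kernel bandwidth
    smoothed_pop = pseudo_pop + rng.normal(0.0, h, pseudo_pop.size)   # smoothing step
    boot_q = np.array([np.quantile(rng.choice(smoothed_pop, size=n, replace=False), q)
                       for _ in range(n_boot)])        # redraw by SRSWOR, re-estimate
    return boot_q.var(ddof=1)

rng = np.random.default_rng(1)
population = rng.lognormal(mean=3.0, sigma=0.6, size=10_000)
srs = rng.choice(population, size=500, replace=False)
print(smoothed_pseudo_pop_bootstrap(srs, N=10_000, q=0.5))
```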
|