241 |
Considerações sobre a relação entre distribuições de cauda pesada e conflitos de informação em inferencia bayesiana / Considerations on the relation between heavy tailed distributions and conflict of information in bayesian inference
Santos Junior, James Dean Oliveira dos 13 March 2007
Orientadores: Veronica Andrea Gonzales-Lopez, Laura Leticia Ramos Rifo / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Matematica, Estatistica e Computação Cientifica / Made available in DSpace on 2018-08-08T04:30:52Z (GMT).
Previous issue date: 2006 / Resumo: Em inferência bayesiana lidamos com informações provenientes dos dados e com informações a priori. Eventualmente, um ou mais outliers podem causar um conflito entre as fontes de informação. Basicamente, resolver um conflito entre as fontes de informações implica em encontrar um conjunto de restrições tais que uma das fontes domine, em certo sentido, as demais. Têm-se utilizado na literatura distribuições amplamente aceitas como sendo de cauda pesada para este fim. Neste trabalho, mostramos as relações existentes entre alguns resultados da teoria de conflitos e as distribuições de caudas pesadas. Também mostramos como podemos resolver conflitos no caso locação utilizando modelos subexponenciais e como utilizar a medida credence para resolver problemas no caso escala / Abstract: In Bayesian inference we deal with information coming from the data and with prior information. Occasionally, one or more outliers can cause a conflict between the sources of information. Basically, resolving a conflict between the sources of information means finding a set of restrictions such that one of the sources dominates, in a certain sense, the others. Distributions widely accepted as heavy-tailed have been used in the literature for this purpose. In this work, we show the relations between some results of the theory of conflicts and heavy-tailed distributions. We also show how conflicts can be resolved in the location case using subexponential models, and how to use the credence measure to resolve problems in the scale case / Mestrado / Inferencia Bayesiana / Mestre em Estatística
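The location-case statement can be illustrated numerically. The sketch below is not taken from the dissertation: it assumes a single observation y with a unit-variance normal likelihood, and compares a standard normal prior with a standard Cauchy prior (a heavy-tailed choice made here purely for illustration). The function names and the integration grid are hypothetical.

```python
import math


def normal_prior(theta):
    """Unnormalised standard normal density (light tails)."""
    return math.exp(-0.5 * theta * theta)


def cauchy_prior(theta):
    """Unnormalised standard Cauchy density (heavy tails)."""
    return 1.0 / (1.0 + theta * theta)


def posterior_mean(y, prior, lo=-50.0, hi=50.0, n=20001):
    """Posterior mean of theta for y ~ N(theta, 1), via a Riemann sum on a grid."""
    step = (hi - lo) / (n - 1)
    num = 0.0
    den = 0.0
    for i in range(n):
        theta = lo + i * step
        # Likelihood times prior, both unnormalised (constants cancel in the ratio).
        weight = math.exp(-0.5 * (y - theta) ** 2) * prior(theta)
        num += theta * weight
        den += weight
    return num / den


if __name__ == "__main__":
    for y in (2.0, 5.0, 10.0):
        print(y, posterior_mean(y, normal_prior), posterior_mean(y, cauchy_prior))
```

Under these assumptions the normal prior always pulls the posterior mean to y/2, whereas with the Cauchy prior the posterior mean moves toward y as the observation becomes more extreme, so the data end up dominating the prior.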
|
242 |
Aplicações de cópulas em modelos de riscos múltiplos dependentes e em modelos de misturas de distribuições / Applications of copula to polyhazard models with dependence and mixture models
Tsai, Rodrigo, 1974- 30 November 2029
Orientador: Luiz Koodi Hotta / Tese (doutorado) - Universidade Estadual de Campinas, Instituto de Matematica, Estatistica e Computação Cientifica / Made available in DSpace on 2018-08-21T13:55:30Z (GMT).
Previous issue date: 2012 / Resumo: Nesse trabalho discutimos aplicações de cópulas a modelos de riscos múltiplos com dependência e modelos de misturas de distribuições. Numa primeira parte analisamos a inclusão de dependência entre os fatores de risco do modelo de riscos múltiplos. Os modelos de riscos múltiplos são uma família de modelos flexíveis para representar dados de tempos de vida. Suas maiores vantagens sobre os modelos de risco simples incluem a habilidade de representar funções de taxa de falha com formas não usuais e a facilidade de incluir covariáveis. O objetivo principal dessa parte é modelar a dependência existente entre as causas latentes de falha do modelo de riscos múltiplos por meio de funções de cópulas. A escolha da função de cópulas bem como das funções de distribuição dos tempos latentes de falha resultam numa classe flexível de distribuições de sobrevivência que é capaz de representar funções de taxa de falha de formas multimodais, forma de banheira e contendo efeitos locais dados pela concorrência dos riscos. A identificação e estimação do modelo proposto também são discutidas. Ao eliminar a restrição de suporte positivo para as variáveis latentes, o método pode ser utilizado para gerar uma família rica de distribuições univariadas contendo assimetrias e múltiplas modas. Na segunda parte propomos um modelo de mistura de distribuições generalizado utilizando cópulas. O parâmetro da cópula é útil para definir formas de assimetria e ponderar com maior ou menor peso determinadas regiões do suporte das distribuições componentes para compor a mistura. Os pesos das distribuições componentes variam no suporte da distribuição e não são restritos à soma unitária. A modelagem resultante acrescenta uma maior flexibilidade aos modelos de misturas na representação de dados com densidades de várias formas multimodais e assimétricas.
O modelo tem como casos particulares o modelo de mistura tradicional, o modelo de riscos múltiplos e o modelo de fração de cura. Os modelos são aplicados a dados simulados e reais da literatura. Foram utilizados os métodos de estimação de máxima verossimilhança e os critérios de ajuste de Akaike e Bayesiano para a seleção dos modelos. Os modelos representaram bem os conjuntos de dados analisados em comparação com metodologias propostas na literatura / Abstract: In this work, we discuss the application of copula to polyhazard and mixture models. First we analyse the inclusion of dependence among failure causes in the polyhazard models. The polyhazard models constitute a family of flexible models to represent lifetime data. Their main advantages over single hazard models include the ability to represent hazard rate functions with unusual shapes and the ease of including covariates. The main purpose in this first part is to model the dependence that exists among the latent causes of failure in the polyhazard model by copula functions. The choice of the copula function as well as the latent failure distributions produces a flexible class of survival distributions that is able to model hazard functions with unusual shapes such as bathtub or multimodal curves, while also modelling local effects given by the competing risks. The model identification and estimation are also discussed. Dropping the restriction of positive support for the latent variables, the method can be used to generate a rich family of univariate distributions with asymmetries and multiple modes. In the second part a generalized mixture model using copula functions is proposed. To assemble the mixture model, the parameter of the copula function is used to define asymmetry shapes and to attribute more or less weight to chosen regions of the component distributions. The weights of the component distributions vary on the support of the distribution and are not restricted to the unitary sum. 
The resulting model increases the flexibility of the mixture models to represent data with densities with several multimodal and asymmetric shapes. Special cases of the model are the traditional mixture models, the polyhazard model, and the cure fraction model. Simulated and empirical data from the literature are analysed by the proposed models. The estimation was done by maximum likelihood methods and the selection of the models used the Akaike and Bayesian criteria. The proposed models exhibited very good fit to the data sets in comparison to other methodologies presented in the literature / Doutorado / Estatistica / Doutor em Estatística
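As a rough illustration of the first part, the following sketch builds the survival function of the minimum of two latent failure times, first under independence and then with dependence introduced through a copula applied to the marginal survival functions. The Weibull latent distributions, the Clayton copula, the parameter values and all function names are assumptions chosen for this example; the thesis does not state which families it uses.

```python
import math


def weibull_survival(t, shape, scale):
    """Survival function of a Weibull latent failure time."""
    return math.exp(-((t / scale) ** shape))


def clayton_copula(u, v, theta):
    """Clayton copula, theta > 0 (positive dependence)."""
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)


def polyhazard_survival(t, components, theta=None):
    """Survival of T = min(T1, T2) for two latent Weibull times.

    theta=None means independent latent times; otherwise the joint survival
    is a Clayton copula evaluated at the two marginal survival values.
    """
    (k1, s1), (k2, s2) = components
    a = weibull_survival(t, k1, s1)
    b = weibull_survival(t, k2, s2)
    return a * b if theta is None else clayton_copula(a, b, theta)


def hazard(t, components, theta=None, dt=1e-6):
    """Hazard rate -d/dt log S(t), by a central finite difference."""
    upper = math.log(polyhazard_survival(t + dt, components, theta))
    lower = math.log(polyhazard_survival(t - dt, components, theta))
    return -(upper - lower) / (2.0 * dt)


# (shape, scale) pairs: one decreasing and one increasing latent hazard.
COMPONENTS = [(0.5, 1.0), (3.0, 1.0)]

if __name__ == "__main__":
    for t in (0.05, 0.5, 2.0):
        print(t, hazard(t, COMPONENTS), hazard(t, COMPONENTS, theta=2.0))
```

With independent latent times the hazard is the sum of the two Weibull hazards, which here is high near zero, dips, and rises again (a bathtub shape). Switching on the copula changes the curve without touching the latent marginals.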
|
243 |
Contributions to nonparametric and semiparametric inference based on statistical depth / Contributions à l'inférence nonparamétrique et semiparamétrique fondée sur la profondeur statistique
Van Bever, Germain 06 September 2013
The general objective of this thesis is to introduce new concepts, or to extend certain existing statistical procedures, related to the notion of statistical depth. Depth was originally introduced to generalize the notion of the median and to provide a natural center-outward ordering in a multivariate setting; since then it has demonstrated many merits, both in terms of robustness and of usefulness in numerous inferential procedures. The results proposed in this work develop along three axes.

To begin with, the thesis addresses supervised classification. Depth has already been used successfully in this context. Until now, however, the tools developed remained limited to elliptical distributions, a severe restriction for methods based on depth functions, most of which are nonparametric in essence. The first part of the thesis therefore proposes a new depth-based classification method, which is shown to be essentially universally consistent. In particular, the proposed discrimination rule builds on the ideas used in nearest-neighbor classification, but introduces depth-based neighborhoods that are better able to capture the behavior of the underlying populations.

These neighborhoods of an arbitrary point, and above all the information they provide on the local behavior of the distribution at that point, are reused in the second part of this work. Several authors have recognized certain limitations of depth functions, owing to their global nature and to the difficulty of using them to study multimodal distributions or distributions with convex support. A new definition of local depth is therefore developed and studied. Its usefulness in various inference problems is also explored.

Finally, the thesis considers the shape parameter of elliptical distributions. This important parameter is used in many statistical procedures (principal component analysis and canonical correlation analysis, among others), and no depth function for it existed until now. Shape depth is therefore defined and its properties are studied. In particular, it is shown that the general framework of parametric depth is not sufficient because of the presence of the scale nuisance parameter (whose influence is non-null). An inferential application is presented in the context of hypothesis testing. / Doctorat en Sciences / info:eu-repo/semantics/nonPublished
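To make the idea of depth as a generalized median with a center-outward ordering concrete, here is a minimal sketch of halfspace (Tukey) depth restricted to the univariate case. The thesis works in the multivariate setting; the one-dimensional restriction and the function names below are simplifications assumed for this example.

```python
def halfspace_depth_1d(x, sample):
    """Univariate halfspace depth: the smaller of the two empirical tail masses at x."""
    n = len(sample)
    below = sum(1 for s in sample if s <= x)
    above = sum(1 for s in sample if s >= x)
    return min(below, above) / n


def center_outward_order(sample):
    """Sample points sorted from deepest (most central) to shallowest."""
    return sorted(sample, key=lambda s: -halfspace_depth_1d(s, sample))


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print([halfspace_depth_1d(x, data) for x in data])
    print(center_outward_order(data))
```

The deepest point is the median, and depth decreases monotonically toward the extremes, which is the ordering that depth-based neighborhoods exploit.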
|
244 |
On the modeling of asset returns and calibration of European option pricing models
Robbertse, Johannes Lodewickes 07 July 2008
Prof. F. Lombard
|
245 |
Extremal Queueing Theory
Chen, Yan January 2022
Queueing theory has often been applied to study communication and service systems such as call centers, hospital emergency departments and ride-sharing platforms. Unfortunately, queueing systems are complicated to analyze, largely because the arrival and service processes that mainly determine a queueing system are uncertain and must be represented as stochastic processes that are themselves difficult to analyze. In practice, service providers may be able to capture the main characteristics of a system only partially, from partial data and limited domain knowledge. An effective engineering response is to develop tractable approximations of the queueing characteristics of interest that depend only on critical partial information. In this thesis, we contribute to developing high-quality approximations by studying tight bounds for the transient and the steady-state mean waiting time given partial information.
We focus on single-server and multi-server queues with unlimited waiting room, the first-come-first-served service discipline, and independent sequences of independent and identically distributed inter-arrival times and service times. We assume some partial information is known, e.g., the first two moments of the inter-arrival and service-time distributions. For the single-server GI/GI/1 model, we first study tight upper bounds for the mean and higher moments of the steady-state waiting time given the first two moments of the inter-arrival time and service-time distributions. We apply the theory of Tchebycheff systems to obtain sufficient conditions for classical two-point distributions to yield the extreme values. For the tight upper bound of the transient mean waiting time, we formulate the problem as a non-convex non-linear program, derive the gradient of the transient mean waiting time over distributions with finite support, and apply classical non-linear programming theory to characterize stationary points. We then develop and apply a stochastic variant of the conditional gradient algorithm to find a stationary point for any given service-time distribution. We also establish necessary conditions and sufficient conditions for stationary points to be three-point distributions or special two-point distributions.
Our studies indicate that the tight upper bound for the steady-state mean waiting time is attained asymptotically by two-point distributions as the upper mass point of the service-time distribution increases and the probability decreases, while one mass of the inter-arrival time distribution is fixed at 0. We then develop effective numerical and simulation algorithms to compute the tight upper bound. The algorithms are aided by reductions of the special queues with extremal inter-arrival time and extremal service-time distributions to D/GI/1 and GI/D/1 models. Combining these reductions yields an overall representation in terms of a D/RS(D)/1 discrete-time model involving a geometric random sum of deterministic random variables, where the two deterministic random variables have different values, so that the extremal waiting times need not have a lattice distribution. We finally evaluate the tight upper bound to show that it offers a significant improvement over established bounds.
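For orientation, one well-known two-moment bound of the kind discussed here is Kingman's inequality for the GI/GI/1 queue, E[W] <= (Var(U) + Var(V)) / (2 E[U] (1 - rho)), with U an inter-arrival time, V a service time and rho = E[V]/E[U]. The sketch below evaluates it and compares it with a simulation based on the Lindley recursion, using an inter-arrival distribution with one mass point at 0. It is an illustration only, not the thesis's algorithm; the function names, the deterministic service time and the parameter values are assumptions.

```python
import random


def kingman_upper_bound(mean_u, var_u, mean_v, var_v):
    """Kingman's upper bound on the steady-state mean wait in GI/GI/1."""
    rho = mean_v / mean_u
    if rho >= 1.0:
        raise ValueError("need traffic intensity rho < 1")
    return (var_u + var_v) / (2.0 * mean_u * (1.0 - rho))


def two_point_with_mass_at_zero(mean, scv):
    """Two-point law on {0, b} with given mean and squared coefficient of variation.

    Returns (b, p) where p = P(X = b).
    """
    p = 1.0 / (1.0 + scv)
    b = mean * (1.0 + scv)
    return b, p


def simulate_mean_wait(n, sample_u, sample_v, seed=0):
    """Average of the first n waiting times from W_{k+1} = max(0, W_k + V_k - U_{k+1})."""
    rng = random.Random(seed)
    w = 0.0
    total = 0.0
    for _ in range(n):
        total += w
        w = max(0.0, w + sample_v(rng) - sample_u(rng))
    return total / n


if __name__ == "__main__":
    b, p = two_point_with_mass_at_zero(mean=1.0, scv=1.0)
    estimate = simulate_mean_wait(
        200000,
        sample_u=lambda rng: b if rng.random() < p else 0.0,
        sample_v=lambda rng: 0.5,  # deterministic service time, an arbitrary choice
    )
    print(estimate, kingman_upper_bound(1.0, 1.0, 0.5, 0.0))
```

In this configuration the simulated mean wait stays clearly below Kingman's bound, which shows the sort of gap that a tight bound is meant to close.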
In order to understand queueing performance given only partial information, we propose determining intervals of likely performance measures given that limited information. We illustrate this approach for the steady-state waiting time distribution in the GI/GI/K queue given the first two moments of the inter-arrival time and service time distributions plus additional information about these underlying distributions, including support bounds, higher moments, and Laplace transform values. As a theoretical basis, we apply the theory of Tchebycheff systems to determine extremal models (yielding tight upper and lower bounds) on the asymptotic decay rate of the steady-state waiting-time tail probability, as in the Kingman-Lundberg bound and large deviations asymptotics. We then can use these extremal models to indicate likely intervals of other performance measures. We illustrate by constructing such intervals of likely mean waiting times. Without extra information, the extremal models involve two-point distributions, which yield a wide range for the mean. Adding constraints on the third moment and a transform value produces three-point extremal distributions, which significantly reduce the range, yielding practical levels of accuracy.
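The asymptotic decay rate mentioned above is the positive root theta* of E[exp(theta (V - U))] = 1, with V a service time and U an inter-arrival time; the waiting-time tail then decays roughly like exp(-theta* x). A minimal sketch for discrete (for instance two-point) distributions follows. The bracketing-and-bisection scheme, the function names and the example distributions are assumptions made for illustration, not the thesis's procedure.

```python
import math


def log_mgf(values, probs, theta):
    """Log moment generating function of a discrete distribution."""
    return math.log(sum(p * math.exp(theta * x) for x, p in zip(values, probs)))


def decay_rate(service, interarrival, tol=1e-12):
    """Positive root theta* of E[exp(theta * (V - U))] = 1 for discrete V and U.

    service, interarrival: (values, probs) pairs. Requires E[V] < E[U] and
    P(V > U) > 0, so that a positive root exists.
    """
    def g(theta):
        return log_mgf(*service, theta) + log_mgf(*interarrival, -theta)

    hi = 1.0
    while g(hi) <= 0.0:   # grow until g is positive
        hi *= 2.0
    lo = hi
    while g(lo) >= 0.0:   # shrink until g is negative (g < 0 just to the right of 0)
        lo /= 2.0
    while hi - lo > tol:  # plain bisection on the bracket [lo, hi]
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    service = ([0.4, 1.2], [0.5, 0.5])       # mean 0.8
    interarrival = ([0.0, 2.0], [0.5, 0.5])  # mean 1.0, one mass point at 0
    print(decay_rate(service, interarrival))
```

Because the function g is convex with g(0) = 0 and negative slope at 0 in a stable queue, the positive root is unique, which is why a simple bracket and bisection suffice here.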
|
246 |
Computational Inversion with Wasserstein Distances and Neural Network Induced Loss Functions
Ding, Wen January 2022
This thesis presents a systematic computational investigation of loss functions in solving inverse problems of partial differential equations. The primary efforts are spent on understanding optimization-based computational inversion with loss functions defined with the Wasserstein metrics and with deep learning models. The scientific contributions of the thesis can be summarized in two directions.
In the first part of this thesis, we investigate the general impacts of different Wasserstein metrics and the properties of the approximate solutions to inverse problems obtained by minimizing loss functions based on such metrics. We contrast the results to those of classical computational inversion with loss functions based on the 𝐿² and 𝐻⁻¹ metrics. We identify critical parameters, both in the metrics and the inverse problems to be solved, that control the performance of the reconstruction algorithms. We highlight the frequency disparity in the reconstructions with the Wasserstein metrics as well as its consequences, for instance, the pre-conditioning effect, the robustness against high-frequency noise, and the loss of resolution when the data used contain random noise. We examine the impact of mass unbalance and conduct a comparative study on the differences and important factors of various unbalanced Wasserstein metrics.
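For intuition about how a Wasserstein misfit differs from a pointwise one, consider the one-dimensional case, where the p-Wasserstein distance between two empirical measures with the same number of equally weighted atoms reduces to matching sorted samples. The sketch below is an illustration under that assumption, not the thesis's solver; the function name is hypothetical.

```python
def wasserstein_1d(xs, ys, p=2):
    """W_p between two empirical measures with equally many, equally weighted atoms."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    xs, ys = sorted(xs), sorted(ys)
    # In 1D the optimal transport plan matches order statistics.
    cost = sum(abs(a - b) ** p for a, b in zip(xs, ys)) / len(xs)
    return cost ** (1.0 / p)


if __name__ == "__main__":
    base = [0.1, 0.5, 0.9, 0.3]
    for shift in (0.5, 1.0, 2.0, 4.0):
        print(shift, wasserstein_1d(base, [x + shift for x in base]))
```

For a pure translation by s the distance equals |s|, so the misfit keeps growing with the shift instead of saturating once the two signals stop overlapping, as a pointwise 𝐿² misfit does.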
In the second part of the thesis, we propose loss functions formed on a novel offline-online computational strategy for coupling classical least-square computational inversion with modern deep learning approaches for full waveform inversion (FWI) to achieve advantages that can not be achieved with only one component. In a nutshell, we develop an offline learning strategy to construct a robust approximation to the inverse operator and utilize it to produce a viable initial guess and design a new loss function for the online inversion with a new dataset. We demonstrate through both theoretical analysis and numerical simulations that our neural network induced loss functions developed by the coupling strategy improve the loss landscape as well as computational efficiency of FWI with reliable offline training on moderate computational resources in terms of both the size of the training dataset and the computational cost needed.
|
247 |
Non-asymptotic bounds for prediction problems and density estimation
Minsker, Stanislav 05 July 2012
This dissertation investigates the learning scenarios where a high-dimensional parameter has to be estimated from a given sample of fixed size, often smaller than the dimension of the problem. The first part answers some open questions for the binary classification problem in the framework of active learning.
Given a random couple (X,Y) with unknown distribution P, the goal of binary classification is to predict a label Y based on the observation X. A prediction rule is constructed from a sequence of observations sampled from P. The concept of active learning can be informally characterized as follows: on every iteration, the algorithm is allowed to request a label Y for any instance X which it considers to be the most informative. The contribution of this work consists of two parts: first, we provide the minimax lower bounds for the performance of active learning methods. Second, we propose an active learning algorithm which attains nearly optimal rates over a broad class of underlying distributions and is adaptive with respect to the unknown parameters of the problem.
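The informal description of active learning can be illustrated with the simplest textbook case: learning a threshold on [0, 1] from noise-free labels, where the learner always queries the midpoint of the region that is still uncertain. This toy example is not the algorithm proposed in the dissertation; the noise-free oracle, the function name and the parameter values are assumptions.

```python
def active_threshold(oracle, lo=0.0, hi=1.0, eps=1e-3):
    """Locate a classification threshold by querying labels at chosen points.

    oracle(x) returns 1 if x is at or above the unknown threshold, else 0.
    Returns (estimate, number_of_label_requests).
    """
    queries = 0
    while hi - lo > eps:
        mid = 0.5 * (lo + hi)   # the most informative point: the middle of the uncertain region
        if oracle(mid) == 1:
            hi = mid
        else:
            lo = mid
        queries += 1
    return 0.5 * (lo + hi), queries


if __name__ == "__main__":
    estimate, used = active_threshold(lambda x: 1 if x >= 0.37 else 0)
    print(estimate, used)
```

In this idealized setting about log2(1/eps) label requests are enough, whereas labels at uniformly sampled points would need on the order of 1/eps of them for the same accuracy.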
The second part of this thesis is related to sparse recovery in the framework of dictionary learning. Let (X,Y) be a random couple with unknown distribution P. Given a collection of functions H, the goal of dictionary learning is to construct a prediction rule for Y given by a linear combination of the elements of H. The problem is sparse if there exists a good prediction rule that depends on a small number of functions from H. We propose an estimator of the unknown optimal prediction rule based on a penalized empirical risk minimization algorithm. We show that the proposed estimator is able to take advantage of the possible sparse structure of the problem by providing probabilistic bounds for its performance.
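A common concrete instance of penalized empirical risk minimization over a dictionary is least squares with an l1 penalty (the lasso). The abstract does not say which penalty or solver the dissertation uses, so the l1 penalty, the coordinate-descent solver and all names below are assumptions made for illustration.

```python
import math


def soft_threshold(x, lam):
    """Shrink x toward zero by lam; exactly zero when |x| <= lam."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def lasso_coordinate_descent(columns, y, lam, sweeps=100):
    """Minimise (1/2n) * ||y - sum_j b_j h_j||^2 + lam * sum_j |b_j|.

    columns[j] holds the values of dictionary function h_j at the n sample points.
    """
    n = len(y)
    coef = [0.0] * len(columns)
    residual = list(y)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue
            # Correlation of h_j with the residual that excludes h_j's own contribution.
            rho = sum(c * (r + c * coef[j]) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, lam) / norm
            delta = new - coef[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                coef[j] = new
    return coef


if __name__ == "__main__":
    dictionary = [[1, 1, -1, -1], [1, -1, 1, -1]]
    response = [3.1, 2.9, -2.9, -3.1]  # 3.0 * h_0 + 0.1 * h_1
    print(lasso_coordinate_descent(dictionary, response, lam=0.5))
```

In the example the weak second dictionary element is set exactly to zero while the strong one is kept (shrunk by the penalty), which is the sparsity-exploiting behavior the paragraph describes.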
|
248 |
Estimating the maximum probability of categorical classes with applications to biological diversity measurements
Huynh, Huy 05 July 2012
The study of biological diversity has seen a tremendous growth over the past few decades. Among the commonly used indices capturing both the richness and evenness of a community, the Berger-Parker index, which relates to the maximum proportion of all species, is particularly effective. However, when the number of individuals and species grows without bound this index changes, and it is important to develop statistical tools to measure this change. In this thesis, we introduce two estimators for this maximum: the multinomial maximum and the length of the longest increasing subsequence. In both cases, the limiting distribution of the estimators, as the number of individuals and species simultaneously grows without bound, is obtained. Then, constructing the 95% confidence intervals for the maximum proportion helps improve the comparison of the Berger-Parker index among communities. Finally, we compare the two approaches by examining their associated bias corrected estimators and apply our results to environmental data.
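The two quantities named above are easy to compute on a sample. The sketch below gives the plug-in Berger-Parker index (largest observed proportion) and the length of the longest non-decreasing subsequence of a sequence of species labels. How the thesis normalizes the subsequence length and how it orders the labels is not stated in the abstract, so dividing by the sample size is an assumption, as are the function names.

```python
from bisect import bisect_right
from collections import Counter


def berger_parker(counts):
    """Plug-in Berger-Parker index: the largest observed species proportion."""
    return max(counts) / sum(counts)


def longest_nondecreasing_subsequence(seq):
    """Length of the longest non-decreasing subsequence (patience sorting, O(n log n))."""
    tails = []  # tails[k] = smallest possible last element of such a subsequence of length k + 1
    for x in seq:
        i = bisect_right(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


if __name__ == "__main__":
    labels = [1, 3, 2, 2, 3, 1, 2, 2, 1, 2]  # species labels in order of observation
    counts = list(Counter(labels).values())
    print(berger_parker(counts))
    print(longest_nondecreasing_subsequence(labels) / len(labels))
```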
|
249 |
Contribuciones a la dependencia y dimensionalidad en cópulas
Díaz, Walter 18 January 2013
El concepto de dependencia aparece por todas partes en nuestra tierra y sus habitantes de manera profunda. Son innumerables los ejemplos de fenómenos interdependientes en la naturaleza, así como en aspectos médicos, sociales, políticos, económicos, entre otros. Más aún, la dependencia es obviamente no determinística, sino de naturaleza estocástica. Es por lo anterior que resulta sorprendente que conceptos y medidas de dependencia no hayan recibido suficiente atención en la literatura estadística. Al menos hasta 1966, cuando el trabajo pionero de E.L. Lehmann probó el lema de Hoeffding. Desde entonces, se han publicado algunas generalizaciones de este. Nosotros hemos obtenido una generalización multivariante para funciones de variación acotada que agrupa a las planteadas anteriormente, al establecer la relación entre los planteamientos presentados por Quesada-Molina (1992) y Cuadras (2002b) y extendiendo este último al caso multivariante.
Uno de los conceptos importantes en la interpretación estadística está relacionado con la dimensión. Es por eso que hemos definido la dimensionalidad geométrica de una distribución conjunta H en función del cardinal del conjunto de correlaciones canónicas de H, si H se puede representar mediante una expansión diagonal. La dimensionalidad geométrica ha sido obtenida para algunas de las familias de cópulas más conocidas. Para determinar la dimensionalidad de algunas de las cópulas, se utilizaron métodos numéricos. De acuerdo con la dimensionalidad, hemos clasificado a las cópulas en cuatro grupos: las de dimensión cero, finita, numerable o continua. En la mayoría de las cópulas se encontró que poseen dimensión numerable.
Con el uso de dos funciones que satisfacen ciertas condiciones de regularidad, se ha obtenido una extensión generalizada para la cópula Gumbel-Barnett, a la que hemos deducido sus principales propiedades y medidas de dependencia para algunas funciones en particular.
La cópula FGM es una de las cópulas con más aplicabilidad en campos como el análisis financiero, y a la que se le han obtenido un gran número de generalizaciones para el caso simétrico. Nosotros hemos obtenido dos nuevas generalizaciones. La primera fue obtenida al adicionar dos distribuciones auxiliares y la segunda generalización es para el caso asimétrico. En esta última caben algunas de las generalizaciones existentes. Para ambos casos se han deducido los rangos admisibles de los parámetros de asociación, las principales propiedades y las medidas de dependencia.
Demostramos que si se conocen las funciones canónicas de una función de distribución, es posible aproximarla a otra función de distribución a través de combinaciones lineales de las funciones canónicas. Como ejemplo, consideramos la cópula FGM en dos dimensiones, en el sentido geométrico, debido a que se conocen sus funciones canónicas, y hemos comprobado numéricamente que su aproximación a otras cópulas con dimensión numerable es aceptablemente bueno. / Contributions to Dependence and Dimensionality in copulas
The concept of dependence is present everywhere on our Earth and among its inhabitants in a profound way. There are countless examples of interdependent phenomena in nature, as well as in medical, social, political and economic matters, among others. Moreover, dependence is obviously not deterministic but stochastic in nature. For this reason, it is surprising that concepts and measures of dependence did not receive enough attention in the statistical literature, at least until 1966, when the pioneering work of E.L. Lehmann proved Hoeffding's lemma. Since then, some generalizations of the lemma have been published. We have obtained a multivariate generalization for functions of bounded variation that encompasses the earlier ones, by establishing the relation between the approaches presented by Quesada-Molina (1992) and Cuadras (2002b) and extending the latter to the multivariate case.
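For reference, Hoeffding's lemma in its classical bivariate form, together with the bounded-variation extension as it is commonly written, can be stated as follows. Here H is the joint distribution function of (X, Y) with marginals F and G, the required moments are assumed finite, and α and β denote functions of bounded variation; these symbols other than H are notation assumed for this display, not taken from the abstract.

```latex
\operatorname{Cov}(X, Y)
  = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
      \bigl[ H(x, y) - F(x)\,G(y) \bigr] \, dx \, dy ,
\qquad
\operatorname{Cov}\bigl(\alpha(X), \beta(Y)\bigr)
  = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
      \bigl[ H(x, y) - F(x)\,G(y) \bigr] \, d\alpha(x) \, d\beta(y) .
```

Taking α and β to be the identity recovers the classical statement from the extended one.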
One of the important concepts in statistical interpretation deals with dimensionality, which is why we have defined the geometric dimensionality of a joint distribution H as a function of the cardinal of the set of canonical correlations of H, if H can be represented by a diagonal expansion. The geometrical dimensionality has been obtained for some of the best known families of copulas. To determine the dimensionality of some copulas, numerical methods were used. According to the dimensionality, we have classified the copulas into four groups: the zero-, finite-, countable- or continuous-dimensional. Most of the copulas were found to possess countable dimension.
With the use of two functions that satisfy certain regularity conditions, we have obtained a generalized extension of the Gumbel-Barnett copula, for which we have derived the main properties and the measures of dependence for some particular functions.
The FGM copula is one of the most widely applicable copulas in fields such as financial analysis, and a large number of generalizations of it have been obtained for the symmetric case. We have obtained two new generalizations: the first by adding two auxiliary distributions, and the second for the asymmetric case; the latter includes some of the existing generalizations. For both cases, the admissible ranges of the association parameters, the main properties and the dependence measures have been derived.
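As a point of reference for the base case, the standard (ungeneralized) FGM copula is C(u, v) = uv[1 + θ(1 - u)(1 - v)] with association parameter θ in [-1, 1], and its Spearman correlation is θ/3. The sketch below checks that value numerically from the general formula ρ = 12 ∫∫ C(u, v) du dv - 3. The function names and the midpoint-rule grid are choices made for this example, and none of the thesis's generalizations are implemented here.

```python
def fgm_copula(u, v, theta):
    """Farlie-Gumbel-Morgenstern copula with association parameter theta in [-1, 1]."""
    if not -1.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [-1, 1]")
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def spearman_rho(copula, n=200):
    """Spearman's rho = 12 * integral of C over the unit square - 3 (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            total += copula(u, v)
    return 12.0 * total * h * h - 3.0


if __name__ == "__main__":
    for theta in (-1.0, 0.0, 0.6, 1.0):
        print(theta, spearman_rho(lambda u, v: fgm_copula(u, v, theta)))
```

Since θ is confined to [-1, 1], Spearman's rho of the plain FGM copula never exceeds 1/3 in absolute value, which is one motivation often given for seeking generalizations with wider admissible ranges.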
We show that if the canonical functions of a distribution function are known, it is possible to approximate it to another distribution function through linear combinations of the canonical functions. As an example, we consider the FGM copula in two dimensions, in the geometric sense, because its canonical functions are known, and we have verified numerically that its approximation to other copulas with countable dimension is acceptably good.
|
250 |
Generalizing list scheduling for stochastic soft real-time parallel applications
Dandass, Yoginder Singh. January 2003
Thesis (Ph. D.)--Mississippi State University. Department of Computer Science and Engineering. / Title from title screen. Includes bibliographical references.
|