Global ETD Search

1	Bayesian Analysis of Transposon Mutagenesis Data DeJesus, Michael A. 2012 May 1900 Determining which genes are essential for the growth of a bacterial organism is an important question, because the answer aids the discovery of drugs that inhibit critical biological functions of a pathogen. To evaluate essentiality, biologists often use transposon mutagenesis to disrupt genomic regions within an organism, revealing which genes are able to withstand disruption and are therefore not required for growth. The development of next-generation sequencing technology augments transposon mutagenesis by providing high-resolution sequence data that identify the exact location of transposon insertions in the genome. Although this high-resolution information has already been used to assess essentiality at a genome-wide scale, no formal statistical model capable of quantifying significance has been developed. This thesis presents a formal Bayesian framework for analyzing sequence information obtained from transposon mutagenesis experiments. Our method uses a Gumbel distribution to assess the statistical significance of gaps in transposon coverage that are indicative of essential regions, and a Metropolis-Hastings sampling procedure to obtain posterior estimates of the probability of essentiality for each gene. We apply our method to libraries of M. tuberculosis transposon mutants to identify genes essential for growth in vitro, and show concordance with previous essentiality results based on hybridization. Furthermore, we show how our method is capable of identifying essential domains within genes, by detecting significant sub-regions of open reading frames unable to withstand disruption. We show that several genes involved in PG biosynthesis have essential domains. Bioinformatics Bayesian Analysis Gumbel Metropolis Hastings Sampling
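As an illustration of the gap-significance idea in the abstract above, the longest run of consecutive sites without an insertion can be approximated by a Gumbel distribution. The sketch below uses the standard longest-run approximation for independent Bernoulli trials; the function name, the number of sites and the per-site non-insertion probability are hypothetical, and this is not the implementation used in the thesis.

```python
import math

def gap_tail_probability(gap_length, n_sites, p_empty):
    """Approximate P(longest run of empty sites >= gap_length).

    Assumes n_sites independent sites, each lacking an insertion with
    probability p_empty, and the Gumbel approximation for the longest run.
    """
    # Gumbel scale and location implied by the longest-run approximation.
    scale = 1.0 / math.log(1.0 / p_empty)
    location = math.log(n_sites * (1.0 - p_empty)) * scale
    # Expected number of runs at least this long; equals n*(1-p)*p**k.
    rate = math.exp(-(gap_length - location) / scale)
    # 1 - exp(-rate), computed stably for very small rates.
    return -math.expm1(-rate)
```

Under these assumptions, a gap much longer than the location parameter gets a small tail probability, which is the sense in which a long run of non-insertions is "significant".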
2	Diagnósticos de Influência em Modelo de Regressão de Valor Extremo em Censura Tipo I / Influence Diagnostics in the Extreme Value Regression Model under Type I Censoring Andrade, Maria Aparecida Silva de 01 March 2016 Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / Conselho Nacional de Pesquisa e Desenvolvimento Científico e Tecnológico - CNPq / In this work, we analyze the problem of evaluating the influence of observations in the extreme value regression model (Gumbel regression) under type I censoring. This model is very important in the analysis of lifetime data. First, we obtain the log-likelihood function, the score function and Fisher's information matrix. We then discuss some influence methods, such as global influence and local influence. In the local influence analysis, we derive expressions for the normal curvatures under various perturbation schemes. We conclude by obtaining a closed-form expression for the generalized leverage.
Distribuição Gumbel Censura Tipo I Regressão Gumbel Diagnóstico de Influência Alavancagem Generalizada Gumbel distribution Type I Censoring Gumbel Regression Influence Diagnostics Generalized Leverage
3	Optimal Monitoring Methods for Univariate and Multivariate EWMA Control Charts Huh, Ick 11 1900 Due to the rapid development of technology, quality control charts have attracted more attention from manufacturing industries as a way to monitor quality characteristics of interest more effectively. Among many control charts, my research work has focused on the multivariate exponentially weighted moving average (MEWMA) and the univariate exponentially weighted moving average (EWMA) control charts, using the Markov chain method. The performance of the chart is measured by the optimal average run length (ARL). My Ph.D. thesis is composed of the following three contributions. My first research work is about differential smoothing. The MEWMA control chart proposed by Lowry et al. (1992) has become one of the most widely used charts to monitor multivariate processes. Its simplicity, combined with its high sensitivity to small and moderate process mean jumps, is at the core of its appeal. Lowry et al. (1992) advocated equal smoothing of each quality variable unless there is an a priori reason to weigh quality characteristics differently. However, there are situations where differential smoothing may be justified. For instance: (a) departures in process mean may be different across quality variables, (b) some variables may evolve over time at a much different pace than other variables, and (c) the level of correlation between variables could vary substantially. For these reasons, I focus on and assess the performance of the differentially smoothed MEWMA chart. The case of two quality variables (BEWMA) is discussed in detail. A bivariate Markov chain method that uses conditional distributions is developed for average run length (ARL) calculations. The proposed chart is shown to perform at least as well as the chart of Lowry et al. (1992), and noticeably better in most other mean jump directions. 
Comparisons with the recently introduced double-smoothed BEWMA chart and the univariate charts for the independent case show that the proposed differentially smoothed BEWMA chart has superior performance. My second research work is about monitoring skewed multivariate processes. Recently, Xie et al. (2011) studied monitoring bivariate exponential quality measurements using the standard MEWMA chart originally developed to monitor multivariate normal quality data. The focus of my work is on situations where, marginally, the quality measurements may follow not only exponential distributions but also other skewed distributions such as Gamma or Weibull, in any combination. The joint distribution is specified using the Gumbel copula function thus allowing for varying degrees of correlation among the quality measurements. In addition to the standard MEWMA chart, charts based on the largest or smallest of the measurements and on the joint cumulative distribution function or the joint survivor function, are studied in detail. The focus is on the case of two quality measurements, i.e., on skewed bivariate processes. The proposed charts avoid an undesirable feature encountered by Xie et al. (2011) for the standard MEWMA chart where in some cases the off-target average run length turns out to be larger than the on-target one. Using the optimal average run length, our extensive numerical results show that the proposed methods provide an overall good detection performance in most directions. Simulations were performed to obtain the optimal ARL results but the Markov chain method using the empirical CDF of the statistics involved verified the accuracy of the ARL results. In addition, an examination of the effect of correlation on chart performance was undertaken numerically. The methods are easily extendable to more than two variables. Final study is about a new ARL criterion for univariate processes studied in detail in this thesis. 
The traditional ARL is calculated assuming a given fixed process mean jump and a given time point where the jump occurs, usually taken to be from the very beginning in most chart performance studies. However, Ryu et al. (2010) demonstrated that the assumption of a fixed mean shift might lead to poor performance of control charts when the actual size of the mean shift is significantly different and therefore suggested a new ARL-based performance measure, called expected weighted run length (EWRL), by assuming that the size of the mean shift is not specified but rather it follows a probability distribution. The EWRL becomes the expected value of the weighted ordinary ARL with respect to this distribution. My methods generalize this criterion by allowing the time at which the mean shift occurs to also vary according to a probability distribution. This leads to a joint distribution for the size of the mean shift and the time the shift takes place, then the EWRL is calculated as the weighted expected value with respect to this joint distribution. The benefit of the generalized EWRL is that one can assess the performance of control charts more realistically when the process starts on-target and then the mean shift occurs at some later random time. Moreover, I also propose the effective EWRL, which measures the number of additional process runs that on average are needed to detect a jump in the mean after it happens. I evaluate several well-known univariate control charts based on their EWRL and effective EWRL performance. The numerical results show that the choice of control chart depends on the additional information on the transition point of the mean shift. The methods can readily be extended to other control charts, including multivariate charts. / Thesis / Doctor of Philosophy (PhD) / Since the introduction of the standard multivariate exponentially weighted moving average (MEWMA) procedure (Lowry et al. 
1992), equal smoothing on all quality variables has been conveniently adopted. In this thesis, a bivariate exponentially weighted moving average (BEWMA) control statistic with unequal smoothing parameters is introduced with the aim of improving performance over the standard BEWMA chart. Extensive numerical comparisons reveal that the proposed chart enhances the efficiency and flexibility of the control chart in many mean-shift directions. Recently, Xie et al. (2011) proposed a chart for bivariate Exponential data when the quality measures follow Gumbel's bivariate Exponential distribution (Gumbel 1960). However, when the process means experience a downward shift (D-D shift), the control charts are shown to break down. In other words, we encounter the strange situation where the out-of-control ARL becomes larger than the in-control ARL. To address this issue, we have proposed two methods, the MAX-MIN and CDF methods, and applied them to the univariate EWMA chart. Our numerical results show that not only do our proposed methods prevent the undesirable behaviour from happening, but they also offer substantial improvement in the ARL over the approach proposed by Xie et al. (2011) for many mean shifts. Finally, when designing a control chart, it is generally assumed that the size of the mean shift is fixed and known. However, Ryu et al. (2010) proposed a new general performance measure, EWRL, by modelling the size of the mean shift with a probability distribution function. We further generalize the measure by introducing a new random variable, T, which is the transition point of the mean shift. Based on that, we propose several ARL-based criteria to measure chart performance and try them on several univariate control charts. Differential smoothing ARL Survival Gumbel copula
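To make the univariate EWMA chart and its run length concrete, here is a minimal sketch. It assumes the textbook EWMA recursion with asymptotic control limits; the function name and default parameter values are illustrative choices, and the thesis itself computes ARLs with Markov chain methods rather than with this loop.

```python
import math

def ewma_first_signal(observations, lam=0.2, width=3.0, target=0.0, sigma=1.0):
    """Return the 1-based index of the first out-of-control signal, or None."""
    # Asymptotic half-width of the control limits for the EWMA statistic.
    limit = width * sigma * math.sqrt(lam / (2.0 - lam))
    z = target  # the EWMA statistic starts at the in-control mean
    for t, x in enumerate(observations, start=1):
        # Exponentially weighted update with smoothing parameter lam.
        z = lam * x + (1.0 - lam) * z
        if abs(z - target) > limit:
            return t
    return None
```

Averaging the returned index over many simulated sequences gives a Monte Carlo estimate of the ARL for a given mean shift.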
4	When do Systematic Gains Uniquely Determine the Number of Marriages between Different Types in the Choo-Siow matching model? Sufficient Conditions for a Unique Equilibrium Decker, Colin 22 February 2011 In a transferable utility context, Choo and Siow (2006) introduced a competitive model of the marriage market with a Gumbel-distributed stochastic part, and derived its equilibrium output, a marriage matching function. The marriage matching function defines the gains generated by a marriage between agents of prescribed types in terms of the observed frequency of such marriages within the population, relative to the number of unmarried individuals of the same types. Left open in their work is the issue of existence and uniqueness of equilibrium. We resolve this question in the affirmative, assuming the norm of the gains matrix (viewed as an operator) to be less than two. Our method adapts a strategy called the continuity method, more commonly used to solve elliptic partial differential equations, to the new setting of isolating positive roots of polynomial systems. Finally, the data estimated in [4] falls within the scope of our results. Matching Gumbel Choo-Siow Equilibrium Uniqueness Existence 0280
6	Comparison of Prediction Intervals for the Gumbel Distribution Fang, Lin 06 1900 The problem of obtaining a prediction interval at a specified confidence level to contain k future observations from the Gumbel distribution, based on an observed sample from the same distribution, is considered. An existing method due to Hahn, originally valid for the normal distribution, is adapted to the Gumbel case. Motivated by the equivalence between Hahn's prediction intervals and Bayesian predictive intervals for the normal, we develop Bayesian predictive intervals for the Gumbel both when the scale parameter b is known and when it is unknown. Furthermore, we compare Hahn's and the Bayesian intervals. We find that the Bayesian interval is better when b is known, while Hahn and Bayes perform about the same when b is unknown. We then consider the maximum of Hahn's and the Bayesian predicted lower limits, which is shown to be a better predictor when b is unknown. All the discussions are based on Monte Carlo simulations. Finally, the results are applied to Ontario Power Generation data on feeder thicknesses. / Thesis / Master of Science (MSc)
7	Contribuições em inferência e modelagem de valores extremos / Contributions to extreme value inference and modeling. Pinheiro, Eliane Cantinho 04 December 2013 Extreme value theory is applied in research fields such as hydrology, pollution studies, materials engineering, traffic management, economics and finance. The Gumbel distribution is widely used in statistical modeling of extreme values of natural processes such as rainfall and wind. The Gumbel distribution is also important in survival analysis for modeling lifetime on a logarithmic scale. The statistical modeling of extreme values of a natural process such as wind or humidity is important in environmental statistics; for example, understanding extreme wind speed is crucial in disaster protection. Lately this has been of particular interest as extreme natural phenomena have become more common and intense. The majority of papers on extreme value theory for modeling extreme data rely on moderate or large sample sizes. The Gumbel distribution is often considered, but the resulting fit may be poor in the presence of outliers since its skewness and kurtosis are constant. We deal with statistical modeling of extreme events data based on extreme value theory. We consider a general extreme-value regression model family introduced by Barreto-Souza & Vasconcellos (2011). 
The authors addressed the issue of correcting the bias of the maximum likelihood estimators in small samples. Here, our first goal is to derive hypothesis test adjustments in this class of models. We derive Skovgaard's adjusted likelihood ratio statistic (Skovgaard, 2001) and five adjusted signed likelihood ratio statistics, which have been proposed by Barndorff-Nielsen (1986, 1991), DiCiccio & Martin (1993), Skovgaard (1996), Severini (1999) and Fraser et al. (1999). The adjusted statistics are approximately distributed as $\chi^2$ and standard normal, respectively, with high accuracy. The adjustment terms have simple compact forms which may be easily implemented in readily available software. We compare the finite sample performance of the likelihood ratio test, the signed likelihood ratio test and the adjusted tests obtained in this work. We illustrate the application of the usual tests and their modified versions on real datasets. The distributions of the adjusted statistics are closer to the respective limiting distributions than those of the usual statistics when the sample size is relatively small. Simulation results indicate that the adjusted statistics can be recommended for inference in the extreme value regression model with small or moderate sample size. Parsimony is important when data are scarce, but flexibility is also crucial since a poor fit may lead to a completely wrong conclusion. A literature review was conducted to list distributions which nest the Gumbel distribution. Our second goal is to evaluate their parsimony and flexibility. For this purpose, we compare such distributions regarding moments, skewness, kurtosis and tail index. The larger families obtained by introducing additional parameters, which have the Gumbel distribution as a special case, present flexible skewness and kurtosis, while the skewness and kurtosis of the Gumbel distribution are constant. 
Among these distributions, the generalized extreme value distribution is the only one whose tail index can be any positive real number, while the tail indices of the other distributions investigated here are zero. We notice that some generalizations of the Gumbel distribution studied in the literature are not identifiable. Hence, for these models, meaningful interpretation and estimation of individual parameters are not feasible. We select the identifiable distributions and fit them to a simulated dataset and to real wind speed data. As expected, such distributions fit the Gumbel simulated data quite well. The generalized extreme value distribution and the two-component extreme value distribution fit the data better than the others in the non-negligible presence of outliers that cannot be accommodated by the Gumbel distribution, and we therefore suggest applying them in this context. Ajustes para pequenas amostras Extreme-value regression Generalized Gumbel distributions Hypothesis tests Modelos não lineares Nonlinear models Regressão valor extremo Small-sample adjustments Testes de hipóteses
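For reference, the constant skewness and kurtosis of the Gumbel distribution noted in the abstract above follow from its standard moments. Writing the distribution for maxima with location $\mu$ and scale $\sigma$ (symbols chosen here for illustration, not taken from the thesis):

```latex
F(x) = \exp\!\left\{-\exp\!\left(-\frac{x-\mu}{\sigma}\right)\right\}, \qquad
\text{skewness} = \frac{12\sqrt{6}\,\zeta(3)}{\pi^{3}} \approx 1.14, \qquad
\text{excess kurtosis} = \frac{12}{5}.
```

Neither quantity depends on $\mu$ or $\sigma$, so the two parameters cannot adapt the shape of the fit; for the minimum form of the distribution the skewness changes sign.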
8	Algumas novas distribuições: desenvolvimento e aplicações / The new distributions: development and applications Brito, Edleide de 30 July 2014 In recent years, several authors have concentrated their efforts on generalizing probability distributions, thereby obtaining greater flexibility and, consequently, gains in data analysis and the ability to incorporate a large number of sub-models in the generalized distributions. In this work, two new probability distributions are presented, the McDonald Gumbel (McGumbel) and the gamma Burr XII, along with a new family of probability distributions, the Marshall-Olkin negative binomial. Some properties of the new distributions are presented, and the method of maximum likelihood is used to estimate the parameters of the proposed models. Burr XII distribution Distribuição Burr XII Distribuição Gumbel Distribuição Marshall-Olkin Gumbel Distribution Marshall-Olkin distribution Matriz de informação observada Máxima verossimilhança Maximum likelihood Observed information matrix
9	Contribuições em inferência e modelagem de valores extremos / Contributions to extreme value inference and modeling. Eliane Cantinho Pinheiro 04 December 2013 (has links) A teoria do valor extremo é aplicada em áreas de pesquisa tais como hidrologia, estudos de poluição, engenharia de materiais, controle de tráfego e economia. A distribuição valor extremo ou Gumbel é amplamente utilizada na modelagem de valores extremos de fenômenos da natureza e no contexto de análise de sobrevivência para modelar o logaritmo do tempo de vida. A modelagem de valores extremos de fenômenos da natureza tais como velocidade de vento, nível da água de rio ou mar, altura de onda ou umidade é importante em estatística ambiental pois o conhecimento de valores extremos de tais eventos é crucial na prevenção de catátrofes. Ultimamente esta teoria é de particular interesse pois fenômenos extremos da natureza têm sido mais comuns e intensos. A maioria dos artigos sobre teoria do valor extremo para modelagem de dados considera amostras de tamanho moderado ou grande. A distribuição Gumbel é frequentemente incluída nas análises mas a qualidade do ajuste pode ser pobre em função de presença de ouliers. Investigamos modelagem estatística de eventos extremos com base na teoria de valores extremos. Consideramos um modelo de regressão valor extremo introduzido por Barreto-Souza & Vasconcellos (2011). Os autores trataram da questão de corrigir o viés do estimador de máxima verossimilhança para pequenas amostras. Nosso primeiro objetivo é deduzir ajustes para testes de hipótese nesta classe de modelos. Derivamos a estatística da razão de verossimilhanças ajustada de Skovgaard (2001) e cinco ajustes da estatística da razão de verossimilhanças sinalizada, que foram propostos por Barndorff-Nielsen (1986, 1991), DiCiccio & Martin (1993), Skovgaard (1996), Severini (1999) e Fraser et al. (1999). 
As estatísticas ajustadas são aproximadamente distribuídas como uma distribuição $\\chi^2$ e normal padrão com alto grau de acurácia. Os termos dos ajustes têm formas compactas simples que podem ser facilmente implementadas em softwares disponíveis. Comparamos a performance do teste da razão de verossimilhanças, do teste da razão de verossimilanças sinalizada e dos testes ajustados obtidos neste trabalho em amostras pequenas. Ilustramos uma aplicação dos testes usuais e suas versões modificadas em conjuntos de dados reais. As distribuições das estatísticas ajustadas são mais próximas das respectivas distribuições limites comparadas com as distribuições das estatísticas usuais quando o tamanho da amostra é relativamente pequeno. Os resultados de simulação indicaram que as estatísticas ajustadas são recomendadas para inferência em modelo de regressão valor extremo quando o tamanho da amostra é moderado ou pequeno. Parcimônia é importante quando os dados são escassos, mas flexibilidade também é crucial pois um ajuste pobre pode levar a uma conclusão completamente errada. Uma revisão da literatura foi feita para listar as distribuições que são generalizações da distribuição Gumbel. Nosso segundo objetivo é avaliar a parcimônia e flexibilidade destas distribuições. Com este propósito, comparamos tais distribuições através de momentos, coeficientes de assimetria e de curtose e índice da cauda. As famílias mais amplas obtidas pela inclusão de parâmetros adicionais, que têm a distribuição Gumbel como caso particular, apresentam assimetria e curtose flexíveis enquanto a distribuição Gumbel apresenta tais características constantes. Dentre estas distribuições, a distribuição valor extremo generalizada é a única com índice da cauda que pode ser qualquer número real positivo enquanto os índices da cauda das outras distribuições são zero. Observamos que algumas generalizações da distribuição Gumbel estudadas na literatura são não identificáveis. 
Portanto, para estes modelos a interpretação e estimação de parâmetros individuais não é factível. Selecionamos as distribuições identificáveis e as ajustamos a um conjunto de dados simulado e a um conjunto de dados reais de velocidade de vento. Como esperado, tais distribuições se ajustaram bastante bem ao conjunto de dados simulados de uma distribuição Gumbel. A distribuição valor extremo generalizada e a mistura de duas distribuições Gumbel produziram melhores ajustes aos dados do que as outras distribuições na presença não desprezível de observações discrepantes que não podem ser acomodadas pela distribuição Gumbel e, portanto, sugerimos que tais distribuições devem ser utilizadas neste contexto. / The extreme value theory is applied in research fields such as hydrology, pollution studies, materials engineering, traffic management, economics and finance. The Gumbel distribution is widely used in statistical modeling of extreme values of a natural process such as rainfall and wind. Also, the Gumbel distribution is important in the context of survival analysis for modeling lifetime in logarithmic scale. The statistical modeling of extreme values of a natural process such as wind or humidity is important in environmental statistics; for example, understanding extreme wind speed is crucial in catastrophe/disaster protection. Lately this is of particular interest as extreme natural phenomena/episodes are more common and intense. The majority of papers on extreme value theory for modeling extreme data is supported by moderate or large sample sizes. The Gumbel distribution is often considered but the resulting fit may be poor in the presence of ouliers since its skewness and kurtosis are constant. We deal with statistical modeling of extreme events data based on extreme value theory. We consider a general extreme-value regression model family introduced by Barreto-Souza & Vasconcellos (2011). 
The authors addressed the issue of correcting the bias of the maximum likelihood estimators in small samples. Here, our first goal is to derive hypothesis test adjustments in this class of models. We derive Skovgaard\'s adjusted likelihood ratio statistics Skovgaard (2001) and five adjusted signed likelihood ratio statistics, which have been proposed by Barndorff-Nielsen (1986, 1991), DiCiccio & Martin (1993), Skovgaard (1996), Severini (1999) and Fraser et al. (1999). The adjusted statistics are approximately distributed as $\\chi^2$ and standard normal with high accuracy. The adjustment terms have simple compact forms which may be easily implemented by readily available software. We compare the finite sample performance of the likelihood ratio test, the signed likelihood ratio test and the adjusted tests obtained in this work. We illustrate the application of the usual tests and their modified versions in real datasets. The adjusted statistics are closer to the respective limiting distribution compared to the usual ones when the sample size is relatively small. Simulation results indicate that the adjusted statistics can be recommended for inference in extreme value regression model with small or moderate sample size. Parsimony is important when data are scarce, but flexibility is also crucial since a poor fit may lead to a completely wrong conclusion. A literature review was conducted to list distributions which nest the Gumbel distribution. Our second goal is to evaluate their parsimony and flexibility. For this purpose, we compare such distributions regarding moments, skewness, kurtosis and tail index. The larger families obtained by introducing additional parameters, which have Gumbel embedded in, present flexible skewness and kurtosis while the Gumbel distribution skewness and kurtosis are constant. 
Among these distributions, the generalized extreme value is the only one whose tail index can be any positive real number, while the tail indices of the other distributions investigated here are zero. We notice that some generalizations of the Gumbel distribution studied in the literature are not identifiable. Hence, for these models meaningful interpretation and estimation of individual parameters are not feasible. We select the identifiable distributions and fit them to a simulated dataset and to real wind speed data. As expected, such distributions fit the Gumbel simulated data quite well. The generalized extreme value distribution and the two-component extreme value distribution fit the data better than the others in the non-negligible presence of outliers that cannot be accommodated by the Gumbel distribution, and we therefore suggest applying them in this context. Ajustes para pequenas amostras Modelos não lineares Regressão valor extremo Testes de hipóteses Extreme-value regression Generalized Gumbel distributions Hypothesis tests Nonlinear models Small-sample adjustments
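The abstract above contrasts the constant skewness of the Gumbel distribution with the flexible skewness of the generalized extreme value (GEV) family. That contrast can be checked numerically. The sketch below is an illustration only and is not taken from the thesis: it assumes the common parameterization in which the GEV shape parameter (called `xi` here, a name chosen for this example) is zero in the Gumbel case, and it uses the standard closed-form skewness, which is finite only for `xi < 1/3`.

```python
import math

# Apery's constant zeta(3); hard-coded because the standard library has no zeta function.
ZETA_3 = 1.2020569031595942

# Skewness of the Gumbel (maximum) distribution: 12*sqrt(6)*zeta(3)/pi^3.
# It does not depend on the location or scale parameters.
GUMBEL_SKEWNESS = 12.0 * math.sqrt(6.0) * ZETA_3 / math.pi ** 3


def gev_skewness(xi: float) -> float:
    """Skewness of a GEV distribution with shape xi (assumed convention: xi = 0 is Gumbel)."""
    if xi == 0.0:
        return GUMBEL_SKEWNESS
    if xi >= 1.0 / 3.0:
        raise ValueError("the third moment is not finite for xi >= 1/3")
    # g_k = Gamma(1 - k*xi) for k = 1, 2, 3
    g1, g2, g3 = (math.gamma(1.0 - k * xi) for k in (1, 2, 3))
    numerator = g3 - 3.0 * g2 * g1 + 2.0 * g1 ** 3
    denominator = (g2 - g1 ** 2) ** 1.5
    return math.copysign(1.0, xi) * numerator / denominator


if __name__ == "__main__":
    for xi in (-0.2, -0.1, 0.0, 0.1, 0.2):
        print(f"xi = {xi:+.1f}  skewness = {gev_skewness(xi):.3f}")
```

Under these assumptions the skewness moves with `xi`, whereas the Gumbel value stays fixed. This is the rigidity the abstract cites as a reason for poor Gumbel fits when outliers are present.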
10	Análise da probabilidade de ocorrência de extremos de precipitação e estudo da tendência de classes de precipitação na região metropolitana de São Paulo / Analysis of the probability of occurrence of extreme precipitation and trend study of classes of rainfall in the metropolitan region of São Paulo Raimundo, Clebson do Carmo 25 February 2011 (has links) Extreme rainfall events are responsible for social disorder and economic problems, especially in large urban centers. Densely populated areas suffer from flooding, landslides and building destruction that cause deaths and widespread diseases, such as malaria, dengue and leptospirosis. They are recurrent phenomena that wear down the life of the urban population, particularly the least privileged. The focal area of this work was the Metropolitan Region of São Paulo (MRSP), Brazil, one of the largest cities in the world. Daily rainfall totals from a network of 21 rain gages in the MRSP were analyzed to i) estimate the annual maximum daily rainfall (PMDA) by means of the Gumbel distribution; ii) group different rainfall rates into classes (from drizzle to extreme rates) and verify the similarity between stations (clustering) for annual and seasonal rain rates over the period 1947 to 1998, using the technique known as cluster analysis; and iii) identify possible trends in three rain rate classes (drizzle, moderate and above 30.0 mm/day) for the annual and seasonal periods, over the whole record length of each gage, using the Mann-Kendall trend test. The results showed that the observed maximum daily rainfall data fit the Gumbel distribution in the annual period, with an estimated annual maximum daily rainfall of 239.3 mm/day for a return period of 500 years at the Barrocada gage, located in the north-central region of the MRSP. 
Cluster analysis showed little similarity among gages with respect to some rain rate classes, both in the number of events and in the rain totals of the classes, in the annual and seasonal periods. The Mann-Kendall test showed a significant increasing trend in the cumulative totals at a larger number of gages for both annual and seasonal periods. The trend in the number of events in the drizzle class was significantly upward for most gages, again in both the annual and seasonal periods, but not all gages showed an increasing trend for the moderate events class. Also, a significant increasing trend in the rain rate classes above 30.0 mm/day was found at some gages in the annual period. In general, there was a significant upward trend in rain rate classes in the MRSP. / Fundação de Amparo a Pesquisa do Estado de Alagoas / Eventos extremos de chuva são responsáveis por distúrbios sociais e problemas econômicos, principalmente nos grandes centros urbanos. Áreas densamente povoadas sofrem deslizamentos, inundações e destruição de construções, que causam mortes e doenças em larga escala, tais como malária, dengue e leptospirose. Eles são fenômenos recorrentes que desgastam a vida da população urbana, principalmente aos menos privilegiados. A área de foco deste trabalho foi a Região Metropolitana de São Paulo (RMSP), Brasil, uma das maiores cidades do mundo. 
Foi analisada uma rede de 21 estações, na RMSP, com totais diários de precipitação para: i) estimar a precipitação máxima diária anual (PMDA), por meio da distribuição de Gumbel, ii) grupos com diferentes taxas de precipitação dentro das classes (de chuvisco a precipitação extrema), e, verificar a similaridade entre as estações (clustering), para taxas de precipitação anual e sazonal, para o período de 1947 a 1998, fazendo uso da técnica conhecida como análise de cluster, e III) identificar possíveis tendências nas três classes de taxa de precipitação (chuvisco, moderado e acima de 30mm/dia) para os períodos anuais e sazonais, para o comprimento total de cada estação, utilizando o teste de tendência de Mann-Kendall. Os resultados mostraram que os dados observados de precipitação máxima diária se ajustam à distribuição de Gumbel no período anual, com taxa anual estimada de precipitação máxima diária igual a 239,3 mm/dia com período de retorno de 500 anos na estação Barrocada, localizada na região centro-norte da RMSP. A análise de agrupamento mostrou pouca similaridade entre as estações, com relação a algumas taxas de classes de precipitação, tanto em número de eventos das classes de precipitação total, nos períodos anuais como sazonais. O teste de Mann-Kendall apresentou tendência de aumento significativo dos totais acumulados em um maior número de estações para ambos os períodos, anuais e sazonais. A tendência do número de eventos de classe chuvisco, foi significativamente alta para a maioria das estações, novamente tanto em períodos anuais como sazonais, mas nem todas as estações apresentaram tendência de aumento para a classe de eventos moderados. Além disso, a tendência de aumento significativo das classes de taxa de precipitação acima de 30 mm/dia foi encontrada em algumas estações no período anual. Em geral, houve tendência de aumento significativo das taxas de classes de precipitação na RMSP. 
Maximum daily rainfall Cluster analysis Gumbel distribution Mann Kendall test Precipitação máxima diária Cluster análise Distribuição de Gumbel Teste de Mann-Kendall
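The abstract above reports a Gumbel-based estimate of maximum daily rainfall for a 500-year return period. A minimal sketch of how such a return level can be computed follows. It is not the thesis's procedure: the method-of-moments fit, the function names and the sample values are all assumptions made for illustration, and the rainfall figures are invented rather than taken from the MRSP gages.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates (location mu, scale beta) of a Gumbel distribution.

    Assumption: the abstract does not say which estimator was used; moments are
    chosen here only because they need no numerical optimization.
    """
    beta = statistics.stdev(annual_maxima) * math.sqrt(6.0) / math.pi
    mu = statistics.mean(annual_maxima) - EULER_GAMMA * beta
    return mu, beta


def gumbel_cdf(x, mu, beta):
    """Gumbel (maximum) cumulative distribution function."""
    return math.exp(-math.exp(-(x - mu) / beta))


def gumbel_return_level(mu, beta, return_period_years):
    """Value exceeded on average once every `return_period_years` years."""
    if return_period_years <= 1:
        raise ValueError("return period must be greater than 1 year")
    non_exceedance = 1.0 - 1.0 / return_period_years
    # Invert the CDF: x = mu - beta * ln(-ln(p))
    return mu - beta * math.log(-math.log(non_exceedance))


if __name__ == "__main__":
    # Hypothetical annual maximum daily rainfall totals in mm (invented values).
    sample = [62.0, 75.5, 58.3, 90.1, 70.4, 110.2, 66.7, 84.9, 79.0, 95.6]
    mu, beta = fit_gumbel_moments(sample)
    for years in (10, 100, 500):
        level = gumbel_return_level(mu, beta, years)
        print(f"{years:>3}-year return level: {level:.1f} mm/day")
```

The return level is the quantile of the fitted distribution at probability 1 - 1/T, so it grows with the logarithm of the return period T. Estimates far beyond the record length, such as a 500-year level from roughly 50 years of data, therefore depend heavily on the fitted scale parameter.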

Search results