Global ETD Search

11	Novel Statistical Methods in Quantitative Genetics : Modeling Genetic Variance for Quantitative Trait Loci Mapping and Genomic Evaluation Shen, Xia January 2012
This thesis develops and evaluates statistical methods for several types of genetic analysis, including quantitative trait loci (QTL) analysis, genome-wide association studies (GWAS), and genomic evaluation. Its main contribution is new insight into modeling genetic variance, especially via random effects models. For variance component QTL analysis, a full likelihood model that accounts for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to correct the bias in genetic variance component estimation and to gain power in QTL mapping in terms of precision. Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and fitted to data from an entire genome. These whole-genome models performed well in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance rather than the mean, which supports the idea of variance-controlling genes. The work is accompanied by R packages available online: a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).
statistical genetics; quantitative trait loci; genome-wide association study; genomic selection; genetic variance; hierarchical generalized linear model; linear mixed model; random effect; heteroscedastic effects model; variance-controlling genes
13	Contributions à la théorie des valeurs extrêmes : Détection de tendance pour les extrêmes hétéroscédastiques / Contributions to extreme value theory : Trend detection for heteroscedastic extremes Mefleh, Aline 26 June 2018
This thesis first presents the permutation bootstrap method applied to the block maxima (BM) method of univariate extreme value theory (EVT). The method relies on a particular resampling of the data based on the ranks of the block maxima, whose distribution is presented and used in the simulations. It reduces the variance of the estimated GEV parameters and extreme quantiles.
Secondly, we build on the heteroscedastic extremes framework of Einmahl et al. (2016), "Statistics of heteroscedastic extremes", in which the observations are independent but not identically distributed and the variation in their tail distributions is modeled by the so-called skedasis function c, which represents the frequency of extremes. While the original paper focuses on non-parametric estimation of the skedasis function, we consider parametric models for c (log-linear, linear, discrete log-linear) and prove the consistency and asymptotic normality of the estimator of the trend parameter θ. Testing θ = 0 against θ ≠ 0 is then interpreted as a test for a trend in the extremes; a parametric test is introduced for the case where the skedasis function is monotone. A short simulation study shows that the parametric test can be more powerful than the non-parametric Kolmogorov-Smirnov type test, even for misspecified models. We also discuss the choice of the threshold k using Lepski's method. The methodology is illustrated on daily minimum and maximum temperatures in Fort Collins, Colorado, during the 20th century.
Thirdly, we use a training sample of daily maximum precipitation recorded over 24 years at 40 stations to make spatio-temporal predictions of the quantiles corresponding to a 20-year return level for monthly precipitation at every station. We use GEV models with covariates in the parameters. The best model is selected using the Akaike information criterion (AIC) and k-fold cross-validation, and extreme quantiles are estimated for each of the two selected models.
Finally, we study wind speed and wave height risks in the Beddawi region of northern Lebanon during the winter season, in order to protect the oil rig that will be installed there. Using univariate EVT with the block maxima method, we estimate the GEV parameters and the return levels for return periods of 50, 100 and 500 years for each risk separately. We then use bivariate EVT to estimate the dependence between extreme wind speed and wave height, as well as the joint probabilities of exceeding the univariate return levels. These joint exceedance probabilities are associated with joint return periods and compared with the marginal return periods.
Heteroscedastic extremes; Trend detection; Permutation Bootstrap; Covariates; Environmental risks; Oil rig; 510
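The return levels mentioned in entry 13 follow from a fitted GEV distribution by inverting its distribution function. A minimal Python sketch, assuming the standard (location, scale, shape) parameterization; the function names and the parameter values are illustrative and are not taken from the thesis:

```python
import math


def gev_cdf(x: float, mu: float, sigma: float, xi: float) -> float:
    """GEV distribution function with location mu, scale sigma > 0, shape xi."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-z))  # Gumbel limit of the GEV family
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower endpoint (xi > 0)
        # or above the upper endpoint (xi < 0).
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_return_level(period: float, mu: float, sigma: float, xi: float) -> float:
    """Level exceeded on average once every `period` blocks (period > 1)."""
    y = -math.log(1.0 - 1.0 / period)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


# Hypothetical parameters for annual maxima, for illustration only.
for period in (50, 100, 500):
    level = gev_return_level(period, mu=20.0, sigma=3.0, xi=0.1)
    print(period, round(level, 2))
```

With annual blocks, `period` is a return period in years; by construction the GEV distribution function evaluated at the returned level equals 1 - 1/period.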
14	Modelos de regressão linear heteroscedásticos com erros t-Student: uma abordagem bayesiana objetiva / Heteroscedastics linear regression models with Student t erros: an objective bayesian analysis. Aline Campos Reis de Souza 18 February 2016
In this work, we present an extension of the objective Bayesian analysis of Fonseca et al. (2008), based on Jeffreys priors for linear regression models with Student-t errors, to which we add the assumption of heteroscedasticity. We show that the posterior distribution generated by the proposed Jeffreys prior is proper. Through a simulation study, we analyzed the frequentist properties of the resulting Bayesian estimators. We then tested the robustness of the model to perturbations of the response variable, comparing its performance with that obtained under other prior distributions proposed in the literature. A diagnostic analysis based on the Kullback-Leibler divergence is developed to study the robustness of the estimates in the presence of atypical observations. Finally, the proposed model is fitted to a real data set, where we detected possibly influential points through the Kullback-Leibler divergence measure and compared the models using the selection criteria EAIC, EBIC, DIC and LPML.
Jeffreys prior; Robust inference; Student t errors
15	Modelos de regressão linear heteroscedásticos com erros t-Student : uma abordagem bayesiana objetiva / Heteroscedastics linear regression models with Student-t errors: an objective bayesian analysis Souza, Aline Campos Reis de 18 February 2016
In this work, we present an extension of the objective Bayesian analysis of Fonseca et al. (2008), based on Jeffreys priors for linear regression models with Student-t errors, to which we add the assumption of heteroscedasticity. We show that the posterior distribution generated by the proposed Jeffreys prior is proper. Through a simulation study, we analyzed the frequentist properties of the resulting Bayesian estimators. We then tested the robustness of the model to perturbations of the response variable, comparing its performance with that obtained under other prior distributions proposed in the literature. A diagnostic analysis based on the Kullback-Leibler divergence is developed to study the robustness of the estimates in the presence of atypical observations. Finally, the proposed model is fitted to a real data set, where we detected possibly influential points through the Kullback-Leibler divergence measure and compared the models using the selection criteria EAIC, EBIC, DIC and LPML.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Jeffreys prior; Robust inference; Student t errors; Heteroscedastic linear regression models; CIENCIAS EXATAS E DA TERRA
16	Acurácia de previsões para vazão em redes: um comparativo entre ARIMA, GARCH e RNA / Forecast accuracy for network throughput: a comparison of ARIMA, GARCH and ANN Duarte, Felipe Machado 29 August 2014
The evolution of the Internet, driven by paradigm shifts such as the Internet of Things, creates new technological demands as the number of connected devices grows. One of the resulting challenges is managing such expanding networks so as to guarantee connectivity to every device within them. A major aspect of network management is bandwidth provisioning, which must avoid wasting bandwidth without compromising connectivity by restricting it too much. Striking this balance is not simple, because network data traffic is complex and exhibits features, such as volatility, that make it difficult to model. Time series analysis tools have been used for some time to predict data throughput in computer networks, and the most successful techniques include ARMA, GARCH and artificial neural network (ANN) models. Although these approaches have been discussed as alternatives for modeling network traffic, little literature compares their accuracy, which motivated this work to evaluate the accuracy of ARIMA, GARCH and ANN models. The evaluation was conducted in scenarios configured with different time granularities and multiple forecast horizons. For each scenario, ARIMA, GARCH and ANN models were fitted, and the forecast accuracy metrics were validated using a Rolling Forecast Horizon. The results show that the ANN was more accurate in most of the proposed scenarios, with an RMSE up to 32% lower than that of the forecasts generated by the ARIMA and GARCH models. Under high volatility, however, GARCH produced the best forecasts, with an RMSE up to 29% lower than the other models. These results support network management, especially bandwidth provisioning, by providing more information on how ARIMA, GARCH and ANN models perform when forecasting this type of traffic.
Traffic estimation; Time series analysis; Performance evaluation
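The rolling forecast evaluation described in entry 16 can be sketched as follows. This is a minimal Python illustration, not the thesis's code: `naive_last` and `drift` are simple stand-in forecasters (fitting ARIMA, GARCH or ANN models would require external libraries), and all names and the toy series are assumptions.

```python
import math
from typing import Callable, List, Sequence

# A forecaster maps (history, horizon) to a point forecast `horizon` steps ahead.
Forecaster = Callable[[Sequence[float], int], float]


def naive_last(history: Sequence[float], horizon: int) -> float:
    """Stand-in model: repeat the last observed value."""
    return history[-1]


def drift(history: Sequence[float], horizon: int) -> float:
    """Stand-in model: extrapolate the average slope of the history."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + horizon * slope


def rolling_rmse(series: Sequence[float], forecaster: Forecaster,
                 initial_window: int, horizon: int = 1) -> float:
    """RMSE of `horizon`-step forecasts made from an expanding, rolling origin."""
    squared_errors: List[float] = []
    for origin in range(initial_window, len(series) - horizon + 1):
        history = series[:origin]               # data available at the origin
        target = series[origin + horizon - 1]   # value being forecast
        squared_errors.append((forecaster(history, horizon) - target) ** 2)
    if not squared_errors:
        raise ValueError("series too short for this window and horizon")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy throughput series with a linear trend, for illustration only.
toy = [float(t) for t in range(10)]
print(rolling_rmse(toy, naive_last, initial_window=3))
print(rolling_rmse(toy, drift, initial_window=3))
```

Comparing models then amounts to calling `rolling_rmse` with each forecaster on the same series, window and horizon, so that every model is scored on the same forecast origins.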
17	Correção tipo-Bartlett em modelos não lineares simétricos heteroscedástico / Bartlett-type correction in heteroscedastic symmetric nonlinear models NASCIMENTO, Kátia Pires do 25 February 2010
This manuscript has two aims. First, we derive general matrix formulae for the Bartlett-type correction to the score statistic in a class of heteroscedastic symmetric nonlinear regression models, with arbitrary link functions for both the mean and the dispersion parameter. Second, Monte Carlo simulations are performed to assess the influence of the correction in the models studied.
Bartlett-type correction; Symmetric distribution; Heteroscedastic model; Nonlinear model; Score test
18	Uncertainty-aware deep learning for prediction of remaining useful life of mechanical systems Cornelius, Samuel J 10 December 2021
Remaining useful life (RUL) prediction is a problem that researchers in the prognostics and health management (PHM) community have been studying for decades. Both physics-based and data-driven methods have been investigated, and in recent years deep learning has gained significant attention. When sufficiently large and diverse datasets are available, deep neural networks can achieve state-of-the-art performance in RUL prediction for a variety of systems. However, for end users to trust the results of these models, especially as they are integrated into safety-critical systems, RUL prediction uncertainty must be captured. This work explores an approach for estimating both the epistemic and the heteroscedastic aleatoric uncertainties that emerge in deep neural networks for RUL prediction, and demonstrates that quantifying the overall impact of these uncertainties on predictions reveals valuable insight into model performance. Additionally, a study is carried out to observe the effects of RUL truth data augmentation on perceived uncertainties in the model.
deep learning; heteroscedastic aleatoric uncertainty; epistemic uncertainty; remaining useful life prediction; prognostics and health management; C-MAPSS; Data Science; Other Mechanical Engineering
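Entry 18 does not state how the two kinds of uncertainty are estimated. One common approach, shown here only as an assumption, trains a network to output a mean and a log-variance under a Gaussian negative log-likelihood (heteroscedastic aleatoric uncertainty) and uses the spread across several stochastic predictions, for example an ensemble, for the epistemic part. A minimal Python sketch with hypothetical function names:

```python
import math
from typing import Sequence, Tuple


def gaussian_nll(y: float, mean: float, log_var: float) -> float:
    """Negative log-likelihood of y under a normal with variance exp(log_var)."""
    return 0.5 * (math.log(2.0 * math.pi) + log_var
                  + (y - mean) ** 2 / math.exp(log_var))


def decompose_uncertainty(means: Sequence[float],
                          variances: Sequence[float]) -> Tuple[float, float, float]:
    """Split predictive variance for one input across several model samples.

    means[i], variances[i] are the mean and variance predicted by sample i.
    Returns (epistemic, aleatoric, total).
    """
    n = len(means)
    if n == 0 or n != len(variances):
        raise ValueError("need equally many means and variances")
    average = sum(means) / n
    epistemic = sum((m - average) ** 2 for m in means) / n  # spread of the means
    aleatoric = sum(variances) / n                          # average predicted noise
    return epistemic, aleatoric, epistemic + aleatoric


# Two hypothetical ensemble members predicting RUL for the same input.
print(decompose_uncertainty([1.0, 3.0], [0.5, 1.5]))
```

The total is the variance of an equally weighted mixture of the per-sample Gaussians, which is why the two parts simply add.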
19	Tests d'ajustement reposant sur les méthodes d'ondelettes dans les modèles ARMA avec un terme d'erreur qui est une différence de martingales conditionnellement hétéroscédastique / Goodness-of-fit tests based on wavelet methods in ARMA models whose error term is a conditionally heteroscedastic martingale difference Liou, Chu Pheuil 09 1900
No description available.
Lack of fit tests; ARMA model; Wavelet method; Residual autocorrelation; Spectral density
20	O teorema das seções de Lévy aplicado à séries temporais correlacionadas não estacionárias: uma análise da convergência gaussiana em sistemas dinâmicos / The theorem of the sections of Levy applied to the correlated time series no stationary: an analysis of Gaussian convergence in dynamic systems Passos, Frederico Salgueiro 01 December 2014
Weakly nonstationary processes appear in many challenging problems related to the physics of complex systems. An interesting question is how to quantify the rate of convergence to Gaussian behavior of rescaled heteroscedastic time series (series without a single variance throughout) from economics, whose first moments are stationary but whose second moments are nonstationary, multifractal and long-range correlated, and also of time series generated from fractional Brownian motion, where the correlation of the series depends on an adjustable parameter. Here we use a recently proposed extension of the Lévy sections theorem. We analyzed the statistical and multifractal properties of heteroscedastic time series and found that the Lévy sections approach converges to Gaussian behavior faster than the traditional partial sums of variables of the central limit theorem. To understand this transition, several statistical tests were used to provide sufficient data on the convergence behavior. We also observed that the rescaled signals retain their multifractal properties even after reaching what appears to be the stable Gaussian regime.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Lévy sections theorem; Central limit theorem; Time series; Statistical Physics; Heteroscedastic; Convergence tests; CNPQ::CIENCIAS EXATAS E DA TERRA::FISICA
