31 |
Perceived Breadth of Bias as a Determinant of Bias Correction
Gretton, Jeremy David, January 2017
No description available.
|
32 |
Joint Gaussian Graphical Model for multi-class and multi-level data
Shan, Liang, 01 July 2016
The Gaussian graphical model is a popular tool for investigating conditional dependencies between random variables by estimating sparse precision matrices. The estimated precision matrices can be mapped into networks for visualization. For related but different classes, jointly estimating networks by exploiting the structure common across classes can yield better estimates of the conditional dependencies among variables. Furthermore, variables may have a multilevel structure: some are higher-level variables, and others, called lower-level variables, are nested within them. In this dissertation, we make two contributions to the joint estimation of Gaussian graphical models across heterogeneous classes. The first is a joint estimation method for Gaussian graphical models across unbalanced multi-class data. The second incorporates multilevel variable information into the joint estimation procedure and simultaneously estimates the higher-level and lower-level networks.
For the first project, we consider the problem of jointly estimating Gaussian graphical models across unbalanced multi-class data. Most existing methods require equal or similar sample sizes among classes. However, many real applications do not have similar sample sizes. Hence, we propose the joint adaptive graphical lasso, a weighted L1-penalized approach, for unbalanced multi-class problems. Our approach combines information across classes so that their common characteristics can be shared during estimation. We also introduce regularization into the adaptive term so that the imbalance of the data is taken into account. Simulation studies show that our approach performs better than existing methods in terms of false positive rate, accuracy, Matthews correlation coefficient, and false discovery rate. We demonstrate the advantage of our approach using a liver cancer data set.
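As a rough orientation only, a weighted L1-penalized joint objective of the kind described above is often written in the following generic form. The abstract does not give the dissertation's actual objective or weights, so every symbol here is an assumption introduced for illustration: K classes, sample sizes n_k, sample covariance matrices S^{(k)}, precision matrices \Theta^{(k)} with entries \theta_{ij}^{(k)}, a tuning parameter \lambda, and adaptive weights w_{ij}^{(k)}.

```latex
\[
  \{\hat{\Theta}^{(k)}\}_{k=1}^{K}
  = \arg\min_{\Theta^{(1)},\dots,\Theta^{(K)} \succ 0}
    \sum_{k=1}^{K} n_k \left[ \operatorname{tr}\!\left(S^{(k)} \Theta^{(k)}\right)
      - \log\det \Theta^{(k)} \right]
    + \lambda \sum_{k=1}^{K} \sum_{i \neq j} w_{ij}^{(k)} \left|\theta_{ij}^{(k)}\right|
\]
```

In a formulation like this, the weights w_{ij}^{(k)} would be the place where information is shared across classes and where unequal sample sizes are accounted for; how the dissertation constructs and regularizes them is not stated in the abstract.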
For the second project, we propose a method to jointly estimate multilevel Gaussian graphical models across multiple classes. Existing methods are limited to investigating a single-level conditional dependency structure, even when the variables have a multilevel structure. Because higher-level variables may work together to accomplish certain tasks, our main interest is in simultaneously exploring the conditional dependency structures among higher-level variables and among lower-level variables. Given multilevel data from heterogeneous classes, our method ensures that common structures in the multilevel conditional dependency are shared during estimation, while structures unique to each class are retained. Our approach first introduces a higher-level variable factor within a class, and then common factors across classes. The performance of our approach is evaluated on several simulated networks. We also demonstrate its advantage using breast cancer patient data. / Ph. D.
|
33 |
Implementação no software estatístico R de modelos de regressão normal com parametrização geral / Normal regression models with general parametrization in software R
Perette, André Casagrandi, 23 August 2019
This work develops an R package implementing estimators for univariate normal regression models with general parameterization, a special case of the model defined in Patriota and Lemonte (2011). This class covers a wide range of well-known models, such as nonlinear and heteroscedastic regression models. Corrections are implemented for the maximum likelihood estimators (MLEs) and for the likelihood-ratio statistic; these corrections are effective when the sample size is small. For the MLEs, the package applies the second-order bias correction presented in Patriota and Lemonte (2009); for the likelihood-ratio statistic, it applies the correction developed in Skovgaard (2001). All features of the package are described in detail. To assess the quality of the algorithm, Monte Carlo simulations were run under different scenarios, evaluating convergence rates, estimation errors (relative squared error), and the efficiency of the bias correction and of Skovgaard's correction.
|
34 |
Climate change assessment for the southeastern United States
Zhang, Feng, 11 August 2011
Water resource planning and management practices in the southeastern United States may be vulnerable to climate change. This vulnerability has not been quantified, and decision makers, although generally concerned, are unable to appreciate the extent of the possible impact of climate change or to formulate and adopt mitigating management strategies. This dissertation aims to fill this gap by generating decision-worthy data and information using an integrated climate change assessment framework.
To begin this work, we develop a new joint variable spatial downscaling (JVSD) technique for statistically downscaling gridded climatic variables to generate high-resolution, gridded datasets for regional watershed modeling and assessment. The approach differs from previous statistical downscaling methods in that multiple climatic variables are downscaled simultaneously and consistently to produce realistic climate projections. In the bias correction step, JVSD uses a differencing process to create stationary joint cumulative frequency statistics of the variables being downscaled. The functional relationship between these statistics and those of the historical observation period is subsequently used to remove GCM bias. The original variables are recovered through summation of the bias-corrected differenced sequences. In the spatial disaggregation step, JVSD uses a historical analogue approach, with historical analogues identified simultaneously for all atmospheric fields and over all areas of the basin under study.
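The idea of removing GCM bias through a relationship between cumulative frequency statistics can be illustrated with plain empirical quantile mapping. The sketch below is a deliberately simplified, univariate stand-in and not the JVSD method itself: it omits the differencing step and the joint treatment of several variables described above, and the function name `quantile_map` and its arguments are hypothetical.

```python
from bisect import bisect_right


def quantile_map(model_hist, obs_hist, model_values):
    """Map model values onto the observed distribution by matching empirical ranks.

    Hypothetical helper for illustration; not taken from the dissertation.
    model_hist:   model output over the historical period
    obs_hist:     observations over the same historical period
    model_values: model values to be bias corrected
    """
    if len(model_hist) < 2 or len(obs_hist) < 2:
        raise ValueError("need at least two historical values per sample")
    m = sorted(model_hist)
    o = sorted(obs_hist)
    corrected = []
    for x in model_values:
        # Empirical probability of x within the model climatology, clipped to
        # [0, 1] so out-of-range values map to the observed extremes.
        rank = bisect_right(m, x) - 1
        p = min(max(rank / (len(m) - 1), 0.0), 1.0)
        # Observed quantile at the same probability (linear interpolation).
        pos = p * (len(o) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(o) - 1)
        frac = pos - lo
        corrected.append(o[lo] * (1 - frac) + o[hi] * frac)
    return corrected
```

With a model climatology that is uniformly 10 units too low, for example, each model value is moved to the observed value of the same rank. In a differenced scheme such as the one the abstract describes, a mapping of this kind would presumably be applied to the differenced sequences before summing them back, but that detail is an assumption.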
In the second component of the integrated assessment framework, we develop a data-driven, downward hydrological watershed model for transforming the climate variables obtained from the downscaling procedures to hydrological variables. The watershed model includes several water balance elements with nonlinear storage-release functions. The release functions and parameters are data driven and estimated using a recursive identification methodology suitable for multiple, inter-linked modeling components. The model evolves from larger spatial/temporal scales down to smaller spatial/temporal scales with increasing model structure complexity. For ungauged or poorly gauged watersheds, we develop and apply regionalization hydrologic models based on stepwise regressions to relate the parameters of the hydrological models to observed watershed responses at specific scales.
Finally, we present the climate change assessment results for six river basins in the southeastern United States. The historical (baseline) assessment is based on climatic data for the period 1901 through 2009. The future assessment consists of running the assessment models under all IPCC A1B and A2 climate scenarios for the period from 2000 through 2099. The climate assessment includes temperature, precipitation, and potential evapotranspiration; the hydrology assessment includes primary hydrologic variables (i.e., soil moisture, evapotranspiration, and runoff) for each watershed.
|
35 |
On Parametric and Nonparametric Methods for Dependent Data
Bandyopadhyay, Soutir, August 2010
In recent years, there has been a surge of research interest in the analysis of time series
and spatial data. While on one hand more and more sophisticated models are being
developed, on the other hand the resulting theory and estimation process have become
more and more involved. This dissertation addresses the development of statistical
inference procedures for data exhibiting dependencies of varied form and structure.
In the first work, we consider estimation of the mean squared prediction error
(MSPE) of the best linear predictor of (possibly) nonlinear functions of finitely many
future observations in a stationary time series. We develop a resampling methodology
for estimating the MSPE when the unknown parameters in the best linear predictor
are estimated. Further, we propose a bias corrected MSPE estimator based on the
bootstrap and establish its second order accuracy. Finite sample properties of the
method are investigated through a simulation study.
The next work considers nonparametric inference on spatial data. In this work,
the asymptotic distribution of the discrete Fourier transform (DFT) of spatial
data under pure and mixed increasing-domain spatial asymptotic structures is
studied under both deterministic and stochastic spatial sampling designs. The deterministic
design is specified by a scaled version of the integer lattice in R^d, while
the data-sites under the stochastic spatial design are generated by a sequence of independent
random vectors, with a possibly nonuniform density. A detailed account
of the asymptotic joint distribution of the DFTs of the spatial data is given which, among other things, highlights the effects of the geometry of the sampling region and
the spatial sampling density on the limit distribution. Further, it is shown that in
both deterministic and stochastic design cases, for "asymptotically distant" frequencies,
the DFTs are asymptotically independent, but this property may be destroyed if
the frequencies are "asymptotically close". Some important implications of the main
results are also given.
|
36 |
Essays on heteroskedasticity
da Glória Abage de Lima, Maria, 31 January 2008
This doctoral thesis addresses inference in the linear regression model under heteroskedasticity of unknown form. In the first chapter, we develop interval estimators that are robust to the presence of heteroskedasticity. These estimators are based on consistent covariance matrix estimators proposed in the literature, as well as on bootstrap schemes. The numerical evidence favors the HC4 interval estimator. Chapter 2 develops a bias-corrected sequence of covariance matrix estimators under heteroskedasticity of unknown form, starting from the estimator proposed by Qian and Wang (2001). We show that the Qian-Wang estimator can be generalized to a broader class of consistent covariance matrix estimators and that our results can be easily extended to this class of estimators. Finally, in Chapter 3 we use numerical integration methods to compute the exact null distributions of different quasi-t test statistics, under the assumption that the errors are normally distributed. The results favor the HC4 test.
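For readers unfamiliar with the HC family, the following states the sandwich form of a heteroskedasticity-consistent covariance estimator for OLS and the HC4 weighting as it is commonly defined in the literature. The notation is introduced here for illustration and does not come from the abstract: X is the n × p design matrix, \hat{e}_i are OLS residuals, and h_i are the leverages (diagonal elements of the hat matrix).

```latex
\[
  \widehat{\operatorname{Var}}(\hat{\beta})
  = (X^{\top}X)^{-1} X^{\top} \hat{\Omega} X (X^{\top}X)^{-1},
  \qquad
  \hat{\Omega} = \operatorname{diag}\!\left\{
    \frac{\hat{e}_i^{\,2}}{(1-h_i)^{\delta_i}} \right\},
  \qquad
  \delta_i = \min\!\left\{ 4, \frac{n h_i}{p} \right\}
\]
```

Setting \delta_i = 0 for all i recovers the basic HC0 estimator; HC4 inflates the squared residuals of high-leverage observations more strongly. The interval estimators and quasi-t tests discussed in the thesis are built on standard errors of this kind, though the exact constructions used there are not given in the abstract.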
|
37 |
Intercomparaison et développement de modèles statistiques pour la régionalisation du climat / Intercomparison and development of statistical models for climate downscaling
Vaittinada Ayar, Pradeebane, 22 January 2016
The study of climate variability is vital for understanding and anticipating the consequences of future climate change. Large data sets generated by general circulation models (GCMs) are now available for this purpose. However, these models only partially resolve the interactions between climate and human activities, in part because their spatial resolution is often too coarse. A wide variety of models now address this issue by generating local-scale climate variables from large-scale variables: these are known as regionalization, spatial scale-reduction, or downscaling models.
The aim of this thesis is to deepen knowledge of statistical downscaling models (SDMs), which encompass several approaches. The work pursues four main goals: (i) to compare statistical (and dynamical) downscaling models, (ii) to study the influence of GCM biases on SDMs by means of a bias correction procedure, (iii) to develop a downscaling model that accounts for the spatial and temporal non-stationarity of climate in a spatial modelling context, and (iv) to define seasons based on a model of atmospheric circulation regimes, or weather regimes.
The intercomparison of downscaling models led to a model selection method based on end-user needs. The study of GCM biases reveals their undeniable influence on SDM outputs and the benefits of bias correction. The successive steps in developing a spatial downscaling model give very encouraging results. Defining seasons by weather regimes proves to be an effective tool for seasonal analysis and modelling. All of this work in "statistical climatology" opens relevant perspectives, not only in terms of methodology and understanding of local-scale climate, but also in terms of use by stakeholders in society.
|
38 |
Essays on Bayesian and classical econometrics with small samples
Jarocinski, Marek, 15 June 2006
This thesis deals with the problems of econometric estimation with small samples, in the contexts of monetary VARs and growth empirics. First, it shows how to improve structural VAR analysis on short datasets. The first chapter adapts the exchangeable prior specification to the VAR context and obtains new findings about monetary transmission in the new member states of the European Union. The second chapter proposes a prior on the initial growth rates of the modeled variables, which corrects the classical small-sample bias in time series and reconciles Bayesian and classical points of view on time series estimation. The third chapter studies the effect of measurement error in national income data on growth empirics, and shows that econometric procedures that are robust to model uncertainty are very sensitive to measurement error of plausible size and properties.
|
39 |
WATER RESOURCES MANAGEMENT SOLUTIONS FOR EAST AFRICA: INCREASING AVAILABILITY AND UTILIZATION OF DATA FOR DECISION-MAKING
Victoria M Garibay (12890987), 27 June 2022
The management of water resources in East Africa is inherently challenged by rainfall variability and the uneven spatial distribution of freshwater resources. In addition to these issues, meteorological and water data collection has been inconsistent over the past decades, and unclearly defined purposes or end goals for collected data have left many datasets ineffectively curated. In light of the data intensiveness of current modelling and planning methods, data scarcity and inaccessibility have become substantial impediments to informed decision-making. Among the outputs of this research are 1) a revised technique for evaluating bias correction performance on reanalysis data for use in regions where precipitation data is temporally discontinuous which can potentially be applied to other types of climate data as well, 2) a new methodology for quantifying qualitative information contained in legislation and official documents and websites for the assessment of relationships between documented meteorological and water data policies and resulting outcomes in terms of data availability and accessibility, and 3) a fresh look at data needs and the value data holds with respect to water resources decision-making and management in the region.
|
40 |
[en] PROPOSALS FOR THE USE OF REANALYSIS BASES FOR WIND ENERGY MODELING IN BRAZIL / [pt] PROPOSTAS DO USO DE BASES DE REANÁLISE PARA MODELAGEM DE ENERGIA EÓLICA NO BRASIL
SAULO CUSTODIO DE AQUINO FERREIRA, 13 August 2024
Brazil's electricity mix has always relied predominantly on renewable sources, most notably hydropower. Over the years the mix has diversified, with wind power taking a growing share. Making better use of wind requires research that models its behavior. However, wind speed and wind generation data are not always available in sufficient quantity or at the locations of interest. Such data are essential for identifying potential sites for wind farms, improving the performance of existing ones, and supporting research on forecasting and simulating wind generation, which in turn feeds the planning and operation of the Brazilian power sector. When wind speed data are lacking, an alternative is to use data from reanalysis databases, which provide long histories of climatic and atmospheric variables for many points around the globe, free of charge. The first contribution of this work therefore focused on verifying how representative the wind speed data made available by MERRA-2 are for Brazilian territory. Following recommendations in the literature, interpolation, extrapolation, and bias correction techniques were used to bring the speeds provided by the reanalysis database closer to those occurring at the rotor height of wind farm turbines. In a second contribution, MERRA-2 data were combined with power measurements from Brazilian wind farms to model, stochastically and non-parametrically, the relationship between wind speed and power in wind turbines. For this purpose, clustering, density curve estimation, and simulation techniques were used. Finally, in a third contribution, an application was developed in the Shiny environment to make the methodologies of the first two contributions available, offering a user-friendly platform that makes them readily accessible to researchers and industry practitioners.
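One common way to extrapolate reanalysis wind speeds to rotor height is the power-law wind profile, with the shear exponent estimated from speeds at two reference heights. The sketch below illustrates that general technique only; the abstract does not say which extrapolation method the thesis uses, and the function names and the example heights and speeds are hypothetical.

```python
import math


def shear_exponent(v_low, h_low, v_high, h_high):
    """Estimate the power-law shear exponent alpha from speeds at two heights.

    Hypothetical helper for illustration; heights in metres, speeds in m/s.
    """
    if min(v_low, v_high, h_low, h_high) <= 0 or h_low == h_high:
        raise ValueError("speeds and heights must be positive and heights distinct")
    return math.log(v_high / v_low) / math.log(h_high / h_low)


def extrapolate_speed(v_ref, h_ref, h_target, alpha):
    """Wind speed at h_target from the power law v(h) = v_ref * (h / h_ref) ** alpha."""
    if h_ref <= 0 or h_target <= 0:
        raise ValueError("heights must be positive")
    return v_ref * (h_target / h_ref) ** alpha


if __name__ == "__main__":
    # Illustrative numbers only: speeds at 10 m and 50 m, target hub height 100 m.
    alpha = shear_exponent(5.0, 10.0, 6.9, 50.0)
    print(extrapolate_speed(6.9, 50.0, 100.0, alpha))
```

Estimating the exponent per time step, rather than fixing a single constant value, lets the profile follow changing atmospheric conditions. A bias correction against measured hub-height speeds, as mentioned in the abstract, would then be applied to the extrapolated series.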
|