91.
A Fully Bayesian Analysis of Multivariate Latent Class Models with an Application to Metric Conjoint Analysis. Frühwirth-Schnatter, Sylvia; Otter, Thomas; Tüchler, Regina. January 2002.
In this paper we pursue a fully Bayesian analysis of the latent class model with an a priori unknown number of classes. Estimation is carried out by means of Markov chain Monte Carlo (MCMC) methods. We deal explicitly with the consequences that the unidentifiability of this type of model has for MCMC estimation. Joint Bayesian estimation of all latent variables, model parameters, and parameters determining the probability law of the latent process is carried out by a new MCMC method called permutation sampling. In a first run we use the random permutation sampler to sample from the unconstrained posterior. We demonstrate that much important information, such as estimates of the subject-specific regression coefficients, is available from such an unidentified model. The MCMC output of the random permutation sampler is explored in order to find suitable identifiability constraints. In a second run we use the permutation sampler to sample from the constrained posterior by imposing identifiability constraints. The unknown number of classes is determined by formal Bayesian model comparison through exact model likelihoods. We apply a new method of computing model likelihoods for latent class models which is based on bridge sampling. The approach is applied to simulated data and to data from a metric conjoint analysis in the Austrian mineral water market. (author's abstract) / Series: Working Papers SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
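Below is a minimal sketch of the random permutation step at the heart of the sampler, for a one-dimensional Gaussian mixture with unit component variances; the conjugate Gibbs updates and the Dirichlet and normal priors are illustrative assumptions, and the paper's latent class regression setting is considerably richer.

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(-2, 1, 150), rng.normal(2, 1, 150)])
n, K, n_iter = len(y), 2, 2000

mu = rng.normal(0, 1, K)          # component means
eta = np.full(K, 1.0 / K)         # mixture weights
draws = np.empty((n_iter, K))

for it in range(n_iter):
    # 1. Sample allocations given the current parameters.
    logp = np.log(eta) - 0.5 * (y[:, None] - mu[None, :]) ** 2
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    S = (p.cumsum(axis=1) > rng.random((n, 1))).argmax(axis=1)

    # 2. Sample weights and means given allocations; conjugate updates
    #    under assumed Dirichlet(1,...,1) and N(0, 10^2) priors.
    counts = np.bincount(S, minlength=K)
    eta = rng.dirichlet(1.0 + counts)
    for k in range(K):
        var = 1.0 / (counts[k] + 1.0 / 100.0)
        mu[k] = rng.normal(var * y[S == k].sum(), np.sqrt(var))

    # 3. Random permutation step: relabel the components so the chain
    #    visits all K! symmetric modes of the unconstrained posterior.
    perm = rng.permutation(K)
    mu, eta = mu[perm], eta[perm]

    draws[it] = mu
```

Identifiability constraints (for example, ordering the component means) would then be imposed in a second, constrained run, as the abstract describes.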
92.
ARMA Model Selection from a Hypothesis Testing Point of View. Lin, Yun Sheng (林芸生). Unknown date.
This thesis focuses on model selection criteria for ARMA models. For information-based criteria such as AIC and BIC, if the candidate models all have the same total number of parameters, model selection reduces to comparing likelihood values at the maximum likelihood estimates; the key step is therefore determining the total number of parameters. Viewed this way, AIC has a high type I error probability but also high power, while BIC has a very low type I error probability and correspondingly low power.
The determination of the number of parameters can be cast as a hypothesis test, where the null hypothesis is that the total number of model parameters equals a given number k and the alternative is that it equals k+1. This thesis proposes an information-based model selection method as a compromise between AIC and BIC: the number of parameters is determined by a two-stage testing procedure constructed to control the average type I error probability at 5%. When BIC is used in this testing problem, simulation results indicate that its average type I error probability is below 0.05, so the proposed test is expected to be more powerful than BIC.
The first stage of the proposed test selects the most likely model under the null and the alternative hypotheses respectively, and two methods are considered for this first-stage selection. With the first method, the type I error probability can exceed 0.05, but the power is significantly larger than that of BIC; with the second, the type I error probability is well controlled, but the power gain is comparatively small. The computing time of the proposed test is rather long, but for those seeking a compromise between AIC and BIC it can serve as a reasonable choice.
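For reference, a minimal sketch of the plain AIC/BIC baseline for ARMA order selection, using statsmodels; this illustrates the information-criterion comparison the thesis starts from, not its two-stage test, and the simulated series is an illustrative assumption.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_generate_sample

rng = np.random.default_rng(0)
# Simulate an ARMA(1,1) series: y_t = 0.6 y_{t-1} + e_t + 0.3 e_{t-1}.
y = arma_generate_sample(ar=[1, -0.6], ma=[1, 0.3], nsample=300,
                         distrvs=rng.standard_normal)

# Score every (p, q) with p + q <= 3 by AIC and BIC.
results = {}
for p in range(4):
    for q in range(4 - p):
        fit = ARIMA(y, order=(p, 0, q)).fit()
        results[(p, q)] = (fit.aic, fit.bic)

best_aic = min(results, key=lambda k: results[k][0])
best_bic = min(results, key=lambda k: results[k][1])
print("AIC picks", best_aic, "BIC picks", best_bic)
```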
93.
New results in dimension reduction and model selection. Smith, Andrew Korb. 26 March 2008.
Dimension reduction is a vital tool in many areas of applied statistics in which the dimensionality of the predictors can be large. In such cases, many statistical methods will fail or yield unsatisfactory results. However, many data sets of high dimensionality actually contain a much simpler, low-dimensional structure. Classical methods such as principal components analysis are able to detect linear structures very effectively, but fail in the presence of nonlinear structures. In the first part of this thesis, we investigate the asymptotic behavior of two nonlinear dimensionality reduction algorithms, LTSA and HLLE. In particular, we show that both algorithms, under suitable conditions, asymptotically recover the true generating coordinates up to an isometry. We also discuss the relative merits of the two algorithms, and the effects of the underlying probability distributions of the coordinates on their performance.
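A minimal sketch of running the two algorithms on a standard synthetic manifold, using scikit-learn's implementations of LTSA and Hessian LLE; the S-curve data and the parameter values are illustrative assumptions, not taken from the thesis.

```python
from sklearn.datasets import make_s_curve
from sklearn.manifold import LocallyLinearEmbedding

X, t = make_s_curve(n_samples=1500, random_state=0)

# Both LTSA and Hessian LLE are variants of locally linear embedding.
# Hessian LLE requires n_neighbors > n_components * (n_components + 3) / 2.
for method in ("ltsa", "hessian"):
    emb = LocallyLinearEmbedding(n_neighbors=12, n_components=2,
                                 method=method, random_state=0)
    Y = emb.fit_transform(X)  # rows are the recovered 2-D coordinates
    print(method, "reconstruction error:", emb.reconstruction_error_)
```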
Model selection is a fundamental problem in nearly all areas of applied statistics. In particular, a balance must be achieved between good in-sample performance and out-of-sample prediction. It is typically very easy to achieve a good fit to the sample data, but empirically we often find that such models generalize poorly. In the second part of the thesis, we propose a new procedure for the model selection problem which generalizes traditional methods. Our algorithm allows the combination of existing model selection criteria via a ranking procedure, leading to new criteria that combine measures of in-sample fit and out-of-sample prediction performance into a single value. We then propose an algorithm which provably finds the optimal combination with a specified probability. We demonstrate through simulations that these new combined criteria can be substantially more powerful than any individual criterion.
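A toy sketch of the rank-based combination idea: several criteria are reduced to per-model ranks and combined by a weighted average. The weighting scheme, the candidate criteria, and the numbers are illustrative assumptions; the thesis's actual procedure and its probabilistic optimality guarantee are not reproduced here.

```python
import numpy as np

def rank_combine(scores, weights):
    """Combine model selection criteria by ranking.

    scores: (n_models, n_criteria) array, lower is better per criterion.
    weights: (n_criteria,) nonnegative weights for the rank average.
    Returns the index of the model with the best weighted average rank.
    """
    ranks = scores.argsort(axis=0).argsort(axis=0)  # per-criterion ranks, 0 = best
    combined = ranks @ np.asarray(weights)
    return int(np.argmin(combined))

# Example: three models scored by an AIC-like (in-sample) criterion and a
# CV-error-like (out-of-sample) criterion; numbers are made up.
scores = np.array([[210.3, 0.31],
                   [208.9, 0.35],
                   [215.0, 0.28]])
print("selected model:", rank_combine(scores, weights=[0.5, 0.5]))
```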
94.
eScience Approaches to Model Selection and Assessment: Applications in Bioinformatics. Eklund, Martin. January 2009.
High-throughput experimental methods, such as DNA and protein microarrays, have become ubiquitous and indispensable tools in biology and biomedicine, and the number of high-throughput technologies is constantly increasing. They provide the power to measure thousands of properties of a biological system in a single experiment and have the potential to revolutionize our understanding of biology and medicine. However, the high expectations on high-throughput methods are challenged by the problem of statistically modelling the wealth of data in order to translate it into concrete biological knowledge, new drugs, and clinical practices. In particular, the huge number of properties measured in high-throughput experiments makes statistical model selection and assessment exigent. To use high-throughput data in critical applications, it must be warranted that the models we construct reflect the underlying biology and are not just hypotheses suggested by the data. We must furthermore have a clear picture of the risk of making incorrect decisions based on the models. The rapid improvements of computers and information technology have opened up new ways of approaching the problem of model selection and assessment. Specifically, eScience, i.e. computationally intensive science carried out in distributed network environments, provides computational power and means to efficiently access previously acquired scientific knowledge. This thesis investigates how we can use eScience to improve our chances of constructing biologically relevant models from high-throughput data. Novel methods for model selection and assessment are proposed that leverage computational power and prior scientific information to "guide" the model selection towards models that a priori are likely to be relevant. In addition, a software system for deploying new methods and making them easily accessible to end users is presented.
95.
Model selection and testing for an automated constraint modelling toolchain. Hussain, Bilal Syed. January 2017.
Constraint Programming (CP) is a powerful technique for solving a variety of combinatorial problems. Automated modelling using a refinement-based approach abstracts over modelling decisions in CP by allowing users to specify their problem in a high-level specification language such as ESSENCE. This refinement process produces many models, resulting from the different choices that can be made, each with its own strengths. A parameterised specification represents a problem class, where the parameters of the class define the instance of the class we wish to solve. Since each model has different performance characteristics, the choice of model is crucial to solving an instance effectively. This thesis presents a method to generate instances automatically for the purpose of choosing a subset of the available models that have superior performance across the instance space. The second contribution of this thesis is a framework to automate the testing of a toolchain for automated modelling. It includes a generator of test cases that covers all aspects of the ESSENCE specification language, and it utilises our first contribution, instance generation, to generate parameterised specifications. The framework can detect errors such as inconsistencies in the models produced during refinement. Once a specification that causes an error has been identified, this thesis presents our third contribution: a method for reducing the specification to a much simpler form that still exhibits a similar error, as sketched below. Additionally, this process can generate a set of complementary specifications, including specifications that do not cause the error, to help pinpoint the root cause.
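The reduction step is in the spirit of greedy test-case minimisation (delta debugging); a generic sketch follows, in which the still_fails oracle and the line-based representation are illustrative assumptions rather than the thesis's ESSENCE-aware reducer.

```python
def reduce_failing_input(lines, still_fails):
    """Greedily remove chunks of the input while the error persists.

    lines: the failing specification as a list of lines.
    still_fails: callable returning True if the candidate still
        triggers the same error in the toolchain.
    """
    chunk = len(lines) // 2
    while chunk >= 1:
        i = 0
        while i < len(lines):
            candidate = lines[:i] + lines[i + chunk:]
            if candidate and still_fails(candidate):
                lines = candidate      # keep the smaller failing input
            else:
                i += chunk             # this chunk is needed; move on
        chunk //= 2
    return lines
```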
96.
Frequentist Model Averaging For Functional Logistic Regression Model. Jun, Shi. January 2018.
Frequentist model averaging is a newly emerging approach that provides a way to overcome the uncertainty caused by traditional model selection in estimation. It acknowledges the contributions of multiple models, instead of basing inference and prediction purely on one single model. Functional logistic regression is likewise a burgeoning method for studying the relationship between functional covariates and a binary response. In this paper, the frequentist model averaging approach is applied to the functional logistic regression model, and a simulation study is implemented to compare its performance with model selection. The analysis shows that when the conditional probability is taken as the focus parameter, model averaging is superior to model selection based on BIC; when the focus parameter is the intercept and slopes, model selection performs better.
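A minimal sketch of one standard frequentist averaging recipe, smoothed information-criterion weights of the form w_m proportional to exp(-BIC_m / 2), applied to ordinary (non-functional) logistic models; the candidate models and simulated data are illustrative assumptions, not the thesis's functional setup.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
y = (rng.random(n) < 1 / (1 + np.exp(-(0.5 + 1.0 * x1)))).astype(float)

# Candidate logistic models: intercept-only, +x1, +x1+x2.
designs = [np.ones((n, 1)),
           np.column_stack([np.ones(n), x1]),
           np.column_stack([np.ones(n), x1, x2])]
fits = [sm.Logit(y, X).fit(disp=0) for X in designs]

# Smoothed-BIC weights: w_m proportional to exp(-BIC_m / 2).
bic = np.array([f.bic for f in fits])
w = np.exp(-(bic - bic.min()) / 2)
w /= w.sum()

# Model-averaged estimate of P(y = 1) at the new point x1 = 0.2, x2 = -0.4.
xnew = [np.array([[1.0]]), np.array([[1.0, 0.2]]), np.array([[1.0, 0.2, -0.4]])]
p_avg = sum(wi * f.predict(xi)[0] for wi, f, xi in zip(w, fits, xnew))
print("averaged probability:", p_avg)
```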
97.
Assessing Nonlinear Relationships through Rich Stimulus Sampling in Repeated-Measures Designs. Cole, James Jacob. 1 August 2018.
Explaining a phenomenon often requires identifying an underlying relationship between two variables. However, it is common practice in psychological research to sample only a few values of an independent variable. Young, Cole, and Sutherland (2012) showed that this practice can impair model selection in between-subjects designs. The current study expands that line of research to within-subjects designs. In two Monte Carlo simulations, model discrimination under systematic sampling of 2, 3, or 4 levels of the independent variable was compared with that under random uniform sampling and sampling from a Halton sequence; see the sketch below. The number of subjects, the number of observations per subject, the effect size, and the between-subjects parameter variance in the simulated experiments were also manipulated. Random sampling outperformed the other methods in model discrimination, with only small, function-specific costs to parameter estimation. Halton sampling also produced good results but was less consistent. The systematic sampling methods were generally rank-ordered by the number of levels they sampled.
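A minimal sketch of drawing stimulus levels under the three schemes compared above, using scipy's quasi-Monte Carlo module for the Halton sequence; the number of trials and the level range are illustrative assumptions.

```python
import numpy as np
from scipy.stats import qmc

n_trials, lo, hi = 24, 0.0, 10.0
rng = np.random.default_rng(2)

# Systematic: 4 fixed, evenly spaced levels, repeated across trials.
systematic = np.tile(np.linspace(lo, hi, 4), n_trials // 4)

# Random uniform sampling over the whole range.
random_uniform = rng.uniform(lo, hi, n_trials)

# Halton sequence: low-discrepancy points that fill the range evenly.
halton = qmc.Halton(d=1, seed=2).random(n_trials).ravel() * (hi - lo) + lo
```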
98.
Mixed linear model selection using information criteria. Tatiana Kazue Yamanouchi. 18 August 2017.
The mixed model is commonly used for repeated-measures data because of its flexibility in incorporating both the correlation between observations measured on the same individual and the heterogeneity of variances of observations made over time. The model is composed of fixed effects, random effects, and a random error term, so selecting a mixed model often means selecting the best version of each of these components so that the model represents the data well. Information criteria are widely used tools in model selection, but few studies indicate how they perform in selecting the fixed effects, the random effects, and the covariance structure of the random error. This work therefore presents a simulation study evaluating the performance of the AIC, BIC, and KIC information criteria in selecting the components of the mixed model, measured by the true positive (TP) rate. In general, the information criteria performed well, that is, they achieved high TP rates when the sample size was larger. In the selection of fixed effects and of the covariance structure, BIC outperformed AIC and KIC in almost all situations. In the selection of random effects no criterion performed well, except under the compound symmetry structure, where BIC performed best.
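A minimal sketch of scoring a fitted mixed model by the three criteria, using statsmodels; the KIC formula shown is Cavanaugh's (1999) form, -2 log L + 3(k + 1), which is an assumption about the variant used in the thesis, and the simulated data are illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n_subj, n_obs = 30, 6
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subj), n_obs),
    "time": np.tile(np.arange(n_obs), n_subj).astype(float),
})
u = rng.normal(0, 1, n_subj)  # random intercepts
df["y"] = 2 + 0.5 * df["time"] + u[df["subject"].to_numpy()] + rng.normal(0, 1, len(df))

fit = smf.mixedlm("y ~ time", df, groups=df["subject"]).fit(reml=False)

k = fit.params.size            # parameters as reported; counting conventions vary
logl = fit.llf
aic = -2 * logl + 2 * k
bic = -2 * logl + np.log(len(df)) * k
kic = -2 * logl + 3 * (k + 1)  # Cavanaugh (1999) form; definitions vary
print(aic, bic, kic)
```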
99.
Copula-GARCH model selection: a Bayesian approach. João Luiz Rossi. 4 June 2012.
This dissertation studies models for bivariate time series whose dependence structure is determined by copula functions. The advantage of this approach is that copulas provide a complete description of the dependence structure. For inference, a Bayesian approach was adopted, using Markov chain Monte Carlo (MCMC) methods. First, a simulation study was performed to verify how the length of the series and variations in the copula function, the marginal distributions, the copula parameter value, and the estimation method influence the model selection rates given by the EAIC, EBIC, and DIC criteria. The models were then applied to real data, with both static and time-varying dependence structures.
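A generic sketch of the DIC computation from MCMC output, in the standard Spiegelhalter et al. form (posterior mean deviance plus effective number of parameters); the log-likelihood function and the posterior draws are placeholders, not tied to the dissertation's copula-GARCH likelihood.

```python
import numpy as np

def dic(log_lik, draws):
    """Deviance information criterion from posterior draws.

    log_lik: function mapping a parameter vector to the model log-likelihood.
    draws: (n_draws, n_params) array of MCMC samples.
    """
    deviances = np.array([-2.0 * log_lik(theta) for theta in draws])
    d_bar = deviances.mean()                     # posterior mean deviance
    d_hat = -2.0 * log_lik(draws.mean(axis=0))   # deviance at posterior mean
    p_d = d_bar - d_hat                          # effective number of parameters
    return d_bar + p_d                           # DIC = d_hat + 2 * p_d

# Toy usage: normal model with unknown mean and known unit variance.
data = np.random.default_rng(4).normal(0.3, 1.0, 100)
ll = lambda th: -0.5 * np.sum((data - th[0]) ** 2)  # log-likelihood up to a constant
fake_draws = np.random.default_rng(5).normal(data.mean(), 0.1, (1000, 1))
print(dic(ll, fake_draws))
```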
100.
Regression models for compositional data. André Pierro de Camargo. 9 December 2011.
Compositional data consist of vectors whose components represent proportions of some whole, that is, vectors with positive entries summing to 1. The problem of estimating the portions $y_1, y_2, \dots, y_D$ corresponding to the pieces $SE_1, SE_2, \dots, SE_D$ of some whole $Q$ arises frequently in many domains of knowledge. The percentages $y_1, y_2, \dots, y_D$ of votes corresponding to the candidates $Ca_1, Ca_2, \dots, Ca_D$ in governmental elections, or the market shares of competing industries, are typical examples. Naturally, it is of great interest to study how such proportions vary with contextual changes, for example, geographic location or time. In any competitive environment, information on this behavior can be very helpful for making strategic decisions. In this work we present and discuss several approaches proposed in the literature for regression on compositional data, as well as some model selection methods based on Bayesian inference.
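One classical approach of the kind surveyed is Aitchison's additive log-ratio (alr) regression: map the composition to Euclidean space, fit an ordinary regression there, and map predictions back to the simplex. A minimal sketch, with simulated data as an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(6)
n, D = 200, 3
x = rng.uniform(0, 1, n)

# Simulate compositions whose log-ratios depend linearly on x.
z = np.column_stack([1.0 + 2.0 * x, -0.5 + 1.0 * x]) + rng.normal(0, 0.3, (n, 2))
y = np.exp(np.column_stack([z, np.zeros(n)]))
y /= y.sum(axis=1, keepdims=True)            # rows live on the simplex

# alr transform: log of each part relative to the last part.
alr = np.log(y[:, :-1] / y[:, -1:])

# Least-squares fit of each log-ratio on x.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, alr, rcond=None)

# Predict the composition at x = 0.5 and invert the transform.
znew = np.array([1.0, 0.5]) @ beta           # fitted log-ratios at x = 0.5
comp = np.exp(np.append(znew, 0.0))
comp /= comp.sum()                           # predicted proportions sum to 1
print(comp)
```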