1.
Méthodes de modélisation bayésienne et applications en recherche clinique / Methods of Bayesian modelling and applications in clinical research. Denis, Marie, 01 July 2010 (has links)
In recent years, Bayesian modelling methods have attracted growing interest in fields as varied as environmental science and medicine. Studies in clinical research rely on mathematical modelling and statistical inference, and the purpose of this thesis is to study possible applications of such modelling within clinical research. The judgments and knowledge of experts (physicians, biologists, and others) are numerous and valuable, so it is natural to incorporate this prior knowledge into the statistical model. After a review of the fundamentals of Bayesian statistics, preliminary work within decision theory is presented, together with a state of the art of approximation methods. A reversible-jump Markov chain Monte Carlo method is developed for two models that are standard in clinical research: the Cox model and the logistic model. A model selection approach is proposed as an alternative to classical criteria in the setting of spline regression. Finally, several applications of Bayesian nonparametric methods are developed, with algorithms adapted and implemented so that such methods can be applied. Through several data sets, this thesis highlights the use of Bayesian methods in clinical research in a variety of ways.
2.
A Response Selection Model for Choice Reaction Time. Tindall, Albert Douglas, 10 1900 (has links)
The binary-choice Fast Guess Model of Ollman and Yellott was generalized to a multiple-choice model, and six subjects were run in a choice reaction time task to test the model. Stimulus set sizes of two, four and six were used, and motivation for response accuracy and speed was manipulated through specific instructions that changed from trial to trial. Three different motivational instructions were used. In all cases, subjects were to respond with maximum accuracy, but on each trial they were also told either to disregard the duration of their response, to respond within 440 milliseconds, or to respond within 300 milliseconds.

The generalized Fast Guess Model was rejected because the response time parameters of the stimulus-controlled response (SCR) state were found to change across the accuracy-speed motivation instructions and across stimulus set sizes. Implications of these results for other classes of models are also discussed. / Thesis / Doctor of Philosophy (PhD)
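For orientation, the binary fast-guess model that this thesis generalizes is usually written as a probability mixture of a stimulus-controlled state and a fast-guess state. The sketch below uses our own notation and is not quoted from the thesis.

```latex
% Hedged sketch of the binary fast-guess model (Ollman; Yellott); notation ours.
% With probability \phi the response comes from the stimulus-controlled (SCR) state,
% otherwise from a fast guess that is correct with chance probability 1/2:
\[
  P(\text{correct}) \;=\; \phi\, p_{\mathrm{SCR}} + (1-\phi)\,\tfrac{1}{2},
  \qquad
  \mathbb{E}[\mathrm{RT}] \;=\; \phi\, \mu_{\mathrm{SCR}} + (1-\phi)\, \mu_{\mathrm{guess}} .
\]
```

A multiple-choice generalization of the kind tested here would replace the guessing probability 1/2 by 1/n for a stimulus set of size n.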
3.
A puzzle about economic explanation: examining the Cournot and Bertrand models of duopoly competition. Nebel, Jonathan, January 1900 (has links)
Master of Arts / Department of Economics / Peri da Silva / Economists use various models to explain why firms are able to price above marginal cost. In this paper, we examine two of them: the Cournot and Bertrand duopoly models. Economists generally accept both models as good explanations of the phenomenon, yet the two models contradict each other in important ways. The puzzle is that two inconsistent explanations are both regarded as good explanations of the same phenomenon, which becomes especially worrisome when the two models offer divergent policy recommendations. This report presents the puzzle by laying out how the two models contradict each other and then offers five possible solutions to it, drawn from various economists, philosophers of science, and philosophers of economics.
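As a textbook illustration of how sharply the two models can diverge (ours, not an example taken from the report), consider homogeneous goods, linear inverse demand and a common constant marginal cost:

```latex
% Linear duopoly with inverse demand p = a - b(q_1 + q_2) and marginal cost c < a.
% Cournot (quantity) competition: each firm's best response yields, in equilibrium,
\[
  q_i^{C} = \frac{a - c}{3b}, \qquad p^{C} = \frac{a + 2c}{3} \;>\; c ,
\]
% so price remains above marginal cost. Bertrand (price) competition with identical
% homogeneous goods instead drives price down to marginal cost:
\[
  p^{B} = c ,
\]
% leaving zero economic profit: the same market structure, two very different predictions.
```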
4.
Model Likelihoods and Bayes Factors for Switching and Mixture Models. Frühwirth-Schnatter, Sylvia, January 2000 (has links) (PDF)
In the present paper we explore various approaches to computing model likelihoods from the MCMC output for mixture and switching models, among them the candidate's formula, importance sampling, reciprocal importance sampling and bridge sampling. We demonstrate that the candidate's formula is sensitive to label switching. It turns out that the best method for estimating the model likelihood is the bridge sampling technique, where the MCMC sample is combined with an iid sample from an importance density. The importance density is constructed in an unsupervised manner from the MCMC output using a mixture of complete data posteriors. Whereas the importance sampling estimator as well as the reciprocal importance sampling estimator are sensitive to the tail behaviour of the importance density, we demonstrate that the bridge sampling estimator is far more robust in this respect. Our case studies range from selecting the number of classes in a mixture of multivariate normal distributions and testing for the inhomogeneity of a discrete-time Poisson process, to testing for the presence of Markov switching and order selection in the MSAR model. (author's abstract) / Series: Forschungsberichte / Institut für Statistik
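For concreteness, the iterative bridge sampling estimator of Meng and Wong (1996) that underlies this approach can be sketched as follows. This is a rough, hypothetical implementation written from the general description above; the function names, arguments and stopping rule are ours, not the paper's.

```python
# Hedged sketch of iterative bridge sampling for the model likelihood p(y).
# theta_mcmc: posterior (MCMC) draws; theta_iid: iid draws from the importance density q.
import numpy as np

def bridge_sampling_log_ml(log_post_unnorm, log_q, theta_mcmc, theta_iid,
                           n_iter=1000, tol=1e-10):
    """Return an estimate of log p(y), the log model likelihood."""
    n1, n2 = len(theta_mcmc), len(theta_iid)
    s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)
    # log ratios  l = log p(y|theta)p(theta) - log q(theta)  for both samples
    l1 = np.array([log_post_unnorm(t) - log_q(t) for t in theta_mcmc])
    l2 = np.array([log_post_unnorm(t) - log_q(t) for t in theta_iid])
    log_p = 0.0                                   # initial guess for log p(y)
    for _ in range(n_iter):
        # numerator averages over the iid sample, denominator over the MCMC sample
        num = np.mean(np.exp(l2 - np.logaddexp(np.log(s1) + l2, np.log(s2) + log_p)))
        den = np.mean(np.exp(-np.logaddexp(np.log(s1) + l1, np.log(s2) + log_p)))
        log_p_new = np.log(num) - np.log(den)
        if abs(log_p_new - log_p) < tol:
            break
        log_p = log_p_new
    return log_p
```

In the setting of the paper, q would be the unsupervised mixture of complete-data posteriors built from the MCMC output.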
5.
Model Likelihoods and Bayes Factors for Switching and Mixture Models. Frühwirth-Schnatter, Sylvia, January 2002 (has links) (PDF)
In the present paper we discuss the problem of estimating model likelihoods from the MCMC output for a general mixture and switching model. Estimation is based on the method of bridge sampling (Meng and Wong, 1996), where the MCMC sample is combined with an iid sample from an importance density. The importance density is constructed in an unsupervised manner from the MCMC output using a mixture of complete data posteriors. Whereas the importance sampling estimator as well as the reciprocal importance sampling estimator are sensitive to the tail behaviour of the importance density, we demonstrate that the bridge sampling estimator is far more robust in this respect. Our case studies range from computing marginal likelihoods for a mixture of multivariate normal distributions and testing for the inhomogeneity of a discrete-time Poisson process, to testing for the presence of Markov switching and order selection in the MSAR model. (author's abstract) / Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
6.
Bayesian model estimation and comparison for longitudinal categorical data. Tran, Thu Trung, January 2008 (has links)
In this thesis, we address issues of model estimation for longitudinal categorical data and of model selection for these data with missing covariates. Longitudinal survey data capture the responses of each subject repeatedly through time, allowing for the separation of variation in the measured variable of interest across time for one subject from the variation in that variable among all subjects. Questions concerning persistence, patterns of structure, interaction of events and stability of multivariate relationships can be answered through longitudinal data analysis. Longitudinal data require special statistical methods because they must take into account the correlation between observations recorded on one subject. A further complication in analysing longitudinal data is accounting for the non-response or drop-out process. Potentially, the missing values are correlated with variables under study and hence cannot be totally excluded.

Firstly, we investigate a Bayesian hierarchical model for the analysis of categorical longitudinal data from the Longitudinal Survey of Immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response; subsequent terms are then introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and explanatory variables to be imputed at each iteration of the algorithm, given appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.

Secondly, we examine the Bayesian model selection techniques of the Bayes factor and the Deviance Information Criterion for our regression models with missing covariates. Computing Bayes factors involves computing the often complex marginal likelihood p(y|model), and various authors have presented methods to estimate this quantity. Here, we take the approach of path sampling via power posteriors (Friel and Pettitt, 2006). The appeal of this method is that for hierarchical regression models with missing covariates, a common occurrence in longitudinal data analysis, it is straightforward to calculate and interpret, since integration over all parameters, including the imputed missing covariates and the random effects, is carried out automatically with minimal added complexities of modelling or computation. We apply this technique to compare models for the employment status of immigrants to Australia.

Finally, we also develop a model choice criterion based on the Deviance Information Criterion (DIC), similar to Celeux et al. (2006), but suitable for use with generalized linear models (GLMs) when covariates are missing at random. We define three different DICs: the marginal, where the missing data are averaged out of the likelihood; the complete, where the joint likelihood for response and covariates is considered; and the naive, where the likelihood is found assuming the missing values are parameters. These three versions have different computational complexities. We investigate through simulation the performance of these three different DICs for GLMs consisting of normally, binomially and multinomially distributed data with missing covariates having a normal distribution. We find that the marginal DIC and the estimate of the effective number of parameters, pD, have desirable properties, appropriately indicating the true model for the response under differing amounts of missingness of the covariates. We find that the complete DIC is generally inappropriate in this context, as it is extremely sensitive to the degree of missingness of the covariate model. Our new methodology is illustrated by analysing the results of a community survey.
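Two standard quantities referred to above, written in our own notation rather than quoted from the thesis, are the power-posterior (path sampling) identity for the marginal likelihood and the generic form of the DIC:

```latex
% Path sampling via power posteriors (Friel and Pettitt, 2006):
% the power posterior is  p_t(\theta \mid y) \propto p(y \mid \theta)^{t}\, p(\theta),  t \in [0,1],
% and the marginal likelihood follows from
\[
  \log p(y) \;=\; \int_{0}^{1} \mathbb{E}_{\theta \mid y, t}\!\bigl[\log p(y \mid \theta)\bigr]\, \mathrm{d}t .
\]
% Generic DIC; the marginal, complete and naive variants above differ only in
% which likelihood plays the role of p(y \mid \theta):
\[
  \mathrm{DIC} \;=\; \overline{D(\theta)} + p_D, \qquad
  \overline{D(\theta)} \;=\; \mathbb{E}_{\theta \mid y}\!\bigl[-2 \log p(y \mid \theta)\bigr], \qquad
  p_D \;=\; \overline{D(\theta)} - D(\bar{\theta}) .
\]
```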
7.
Nouvelles méthodes d'inférence de l'histoire démographique à partir de données génétiques / New methods for inference on demographic history from genetic data. Merle, Coralie, 12 December 2016 (has links)
This thesis aims to improve statistical methods suited to stochastic models in population genetics and to develop statistical methods adapted to next-generation sequencing data. For a parametric model based on the coalescent, the likelihood at a point of the parameter space is the sum of the probabilities of all possible histories (genealogies equipped with mutations) of the observed sample; at present, the best inference methods for such models are approximate Bayesian computation and approximation of the likelihood function. Sequential importance sampling (SIS) algorithms estimate this likelihood by exploring the latent space of histories efficiently, with an importance distribution that proposes the most probable histories for the observed sample. The technique is computationally heavy but provides very accurate maximum likelihood estimates. However, the existing importance distributions were designed for a population of constant size through time, so SIS becomes inefficient for models with population size changes: the computation time rises sharply for the same accuracy of the likelihood estimate. The first contribution of this thesis explores SIS with resampling (SISR). The idea is to resample so as to learn, before their simulation is finished, which of the histories proposed by the importance distribution will be the most probable, and thereby reduce computation time. We also propose a new resampling distribution that exploits the information contained in the pairwise composite likelihood of the sample. We tested the SISR algorithm on simulated data sets under different demographic scenarios; in most cases the computational cost was divided by two for the same accuracy of inference, and in some cases by one hundred. This work provides the first assessment of the impact of such resampling techniques on parameter inference using sequential importance sampling and extends the range of situations where likelihood inference can easily be performed.

The recent development of high-throughput sequencing technologies has revolutionized the generation of polymorphism data for many organisms: genome-wide sequence data are now available. Classical inference methods (maximum likelihood via MCMC or importance sampling, and methods based on the site frequency spectrum), designed for polymorphism data at a few loci, assume that the genealogies of the loci are independent. To take advantage of much denser genome-wide data, one must consider the dependence between genealogies at neighbouring positions of the genome and model genetic recombination. The likelihood then takes the form of an integral over all possible ancestral recombination graphs for the sampled sequences, a space of far higher dimension than the space of genealogies, so likelihood-based inference can no longer be used without further approximation. Several methods infer historical changes in population size but do not account for the complexity of the fitted model; even when some of them include a control for potential over-fitting, to the best of our knowledge no model choice procedure between demographic models of different complexity has been proposed based on the lengths of identical-by-state (IBS) segments. We focus on a model of constant population size and a slightly more complex model with a single past change in population size. Since these models are nested, the second contribution of this thesis is a penalized model choice criterion based on the comparison of observed and predicted haplotype homozygosity. The penalty relies on Sobol sensitivity indices and is related to the complexity of the model. This penalized model choice criterion allowed us to choose between a population of constant size and a population with a past change in size, both on simulated data sets and on a cattle data set.
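The SISR idea sketched above, resampling partial histories according to a guiding score so that computational effort concentrates on the most promising ones, can be illustrated with a generic skeleton. This is a hypothetical sketch only: the functions init_particle, propose and resample_score are placeholders for problem-specific components (the thesis bases its resampling distribution on the pairwise composite likelihood), not code from the thesis.

```python
# Generic sequential importance sampling with resampling (SISR) skeleton; a sketch
# of the idea only, with placeholder components rather than the thesis's algorithm.
import numpy as np

def sisr_likelihood(n_particles, n_steps, init_particle, propose, resample_score,
                    seed=None):
    """Return a (noisy) likelihood estimate from SISR.

    init_particle():           creates a fresh partial history
    propose(h):                extends the history one event; returns (new_history, log_weight_increment)
    resample_score(h, log_w):  score guiding the resampling (e.g. a composite-likelihood heuristic)
    """
    rng = np.random.default_rng(seed)
    particles = [init_particle() for _ in range(n_particles)]
    log_w = np.zeros(n_particles)
    for _ in range(n_steps):
        for i in range(n_particles):                    # extend each partial history
            particles[i], incr = propose(particles[i])
            log_w[i] += incr
        scores = np.array([resample_score(h, w) for h, w in zip(particles, log_w)])
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        idx = rng.choice(n_particles, size=n_particles, p=probs)
        # importance correction keeps the weighted average unbiased after resampling
        log_w = log_w[idx] - np.log(n_particles * probs[idx])
        particles = [particles[j] for j in idx]
    return np.exp(log_w).mean()                         # estimate of the likelihood
```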
8.
Essays on Emerging Multinational Enterprises' Acquisitions in Developed Economies. Harahap, Faisal R, 25 August 2017 (has links)
This dissertation investigates emerging multinational enterprises' (EMNEs') acquisitions of firms in developed economies (DE) through three distinctive but interrelated essays. Despite the costs EMNEs must offset, given the obvious cultural distance (CD) they encounter and their limited exploitable advantages, EMNEs have continued to aggressively acquire firms in DE, suggesting there are ways for EMNEs to overcome CD effectively. In Essay 1, using insights from the symbolic interaction paradigm in sociology, I developed the Dynamic Socio-Cultural Model (DSCM) to uncover the general process of cultural creation and change. At the core of the DSCM is the process of collective learning and adaptive interaction in every social system. Viewing EMNEs' acquisitions in DE as cultural events that lead to new shared cultural resources, the DSCM shows that culture is not as rigid as typically conceptualized in the cross-cultural management literature. While the negative effect of CD may initially impede EMNEs, it may be positively moderated by certain conditions of the cultures involved. In Essay 2, I extended the DSCM and combined it with insights from the organizational learning literature to focus on EMNEs' choices of control mode and their performance implications. Performing an event study and an endogenous switching regression on 1,157 EMNE acquisitions in 21 advanced economies, I found that EMNEs have, on average, positive post-acquisition performance. I also found that being an EMNE from an emerging economy that underwent rapid industrialization and targeting a high-tech firm increases the probability of choosing a low-control mode. Moreover, EMNE acquirers choose the control mode by strategically considering their unique characteristics to optimize performance. In Essay 3, using the same theoretical approach, I examined the target firms' sources of value creation. Applying an event study to 167 acquisitions in North America made by EMNEs from 11 countries, I found that EMNEs' partial acquisitions in DE generate, on average, positive target cumulative abnormal returns (CAR). There is also empirical support for several determinants of the target's value creation and for several moderation effects. In particular, I found that the target's international experience attenuates the negative effect of CD on target CAR, while the acquirer's state-owned status exacerbates it. Overall, the three essays collectively contribute to research streams on EMNEs, the seller's view of M&A, and cultural change.
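Since the second and third essays rest on event-study results, the standard quantities involved may be worth recalling. The notation below is the usual market-model formulation, given here for orientation only and not taken from the dissertation itself (the dissertation's exact benchmark model is not stated in the abstract).

```latex
% Standard event-study abnormal return for firm i on day t, with market-model
% parameters \hat{\alpha}_i, \hat{\beta}_i estimated over a pre-event window:
\[
  \mathrm{AR}_{it} \;=\; R_{it} - \bigl(\hat{\alpha}_i + \hat{\beta}_i R_{mt}\bigr),
  \qquad
  \mathrm{CAR}_i(t_1, t_2) \;=\; \sum_{t = t_1}^{t_2} \mathrm{AR}_{it},
\]
% where R_{it} is the firm's return, R_{mt} the market return, and (t_1, t_2) the event window.
```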
9.
Lois a priori non-informatives et la modélisation par mélange / Non-informative priors and modelization by mixtures. Kamary, Kaniav, 15 March 2016 (has links)
One of the major applications of statistics is the validation and comparison of probabilistic models in the light of data, a branch of statistics developed since its formalization at the end of the 19th century by pioneers such as Gosset, Pearson and Fisher. In the particular case of the Bayesian approach, the solution to model comparison is the Bayes factor, a ratio of marginal likelihoods, whatever the models under consideration; this solution is derived by mathematical reasoning based on a loss function. Despite frequent use of the Bayes factor and its equivalent, the posterior probability of a model, by the Bayesian community, it is problematic for two reasons. First, the Bayes factor depends strongly on the prior modelling, even with large data sets (equivalently, it lacks an absolute calibration). Since the selection of a prior has a vital role in Bayesian statistics, one of the difficulties with the traditional handling of Bayesian tests is a discontinuity in the use of improper priors, which are not justified in most testing situations. The first part of this thesis gives a general review of non-informative priors and their characteristics, and demonstrates the overall stability of the resulting posterior distributions by reassessing the examples of [Seaman III 2012]. The second, independent, problem is that the Bayes factor is difficult to compute except in the simplest cases (conjugate distributions). A branch of computational statistics has therefore developed to solve this problem, with solutions borrowed from statistical physics, such as the path sampling method of [Gelman 1998], and from signal processing. The existing solutions are not universal, however, and a reassessment of these methods, followed by the development of alternative methods, forms part of the thesis. We therefore consider a novel paradigm for Bayesian hypothesis testing and Bayesian model comparison, defining an alternative to the traditional construction of posterior probabilities that a given hypothesis is true or that the data originate from a specific model. The idea is to consider the models under comparison as components of a mixture model. By replacing the original testing problem with an estimation problem that focuses on the probability weight of a given model within the mixture, we analyse the sensitivity of the resulting posterior distribution of the weights to various prior modellings of the weights, and stress that a major appeal of this perspective is that generic improper priors are acceptable while convergence is not put in jeopardy. MCMC methods such as the Metropolis-Hastings algorithm and the Gibbs sampler, together with empirical approximations of the probability, are used. From a computational viewpoint, another feature of this easily implemented alternative to the classical Bayesian solution is that the speeds of convergence of the posterior mean of the weight and of the corresponding posterior probability are quite similar.

In the last part of the thesis we construct a reference Bayesian analysis of mixtures of Gaussian distributions by introducing a new parameterization centred on the mean and variance of the mixture model itself. This enables us to develop a genuine non-informative prior for Gaussian mixtures with an arbitrary number of components. We demonstrate that the posterior distribution associated with this prior is almost surely proper and provide MCMC implementations that exhibit the expected exchangeability of the components. The analyses are based on MCMC methods such as the Metropolis-within-Gibbs algorithm, adaptive MCMC and the parallel tempering algorithm. This part of the thesis is accompanied by the R package Ultimixt, which implements a generic reference Bayesian analysis of unidimensional mixtures of Gaussian distributions obtained through a location-scale parameterization of the model. The package can be used to produce a Bayesian analysis of Gaussian mixtures with an arbitrary number of components, with no need to specify the prior distribution.
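The mixture-based testing idea described above can be written compactly; the formulation below uses our own notation as a sketch of the construction, not a quotation from the thesis.

```latex
% Competing models M_1: x \sim f_1(x \mid \theta_1) and M_2: x \sim f_2(x \mid \theta_2)
% are embedded in an encompassing mixture indexed by a weight \alpha:
\[
  M_{\alpha}: \quad x \;\sim\; \alpha\, f_1(x \mid \theta_1) \;+\; (1 - \alpha)\, f_2(x \mid \theta_2),
  \qquad \alpha \in [0, 1].
\]
% The testing problem is then replaced by estimation of the posterior distribution
% of \alpha (e.g. under a Beta prior); posterior mass of \alpha near 1 or 0 indicates
% support for M_1 or M_2, and generic improper priors on the component parameters remain usable.
```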
10.
Strategická analýza podniku / Strategic Analysis of an Enterprise. Voborská, Jana, January 2008 (has links)
The diploma thesis presents a strategic analysis of an enterprise producing interior and facade paints. The thesis applies analyses such as PEST, 4C, Porter's five forces (5F) model, an analysis of internal factors, and a SWOT analysis. On the basis of the SWOT analysis, a strategy for improving the enterprise's competitive position in the market is formulated.