Global ETD Search

61	Structural equation modelling Mohanlal, Pramod 06 1900 Over the past two decades there has been an upsurge in interest in structural equation modelling (SEM). Applications abound in the social sciences and econometrics, but the use of this multivariate technique is not so common in public health research. This dissertation discusses the methodology, the criticisms and practical problems of SEM. We examine actual applications of SEM in public health research. Comparisons are made between multiple regression and SEM and between factor analysis and SEM. A complex model investigating the utilization of antenatal care services (ANC) by migrant women in Belgium is analysed using SEM. The dissertation concludes with a discussion of the results and of the use of SEM in public health research. Structural equation modelling is recommended as a tool for public health researchers, with a warning against using the technique too casually. / Mathematical Sciences / M. Sc. (Statistics) Structural equation modelling Measurement model Structural model Latent variable Multivariate statistical techniques Theory based model Path diagram Identification Specification error Multiple regression Exploratory factor analysis Confirmatory factor analysis Utilization of antenatal care services 519.535 Multivariate analysis Public health -- Research
62	On specification and inference in the econometrics of public procurement Sundström, David January 2016 In Paper [I] we use data on Swedish public procurement auctions for internal regular cleaning service contracts to provide novel empirical evidence regarding green public procurement (GPP) and its effect on the potential suppliers’ decision to submit a bid and their probability of being qualified for supplier selection. We find only a weak effect on supplier behavior which suggests that GPP does not live up to its political expectations. However, several environmental criteria appear to be associated with increased complexity, as indicated by the reduced probability of a bid being qualified in the postqualification process. As such, GPP appears to have limited or no potential to function as an environmental policy instrument. In Paper [II] the observation is made that empirical evaluations of the effect of policies transmitted through public procurements on bid sizes are made using linear regressions or by more involved non-linear structural models. The aspiration is typically to determine a marginal effect. Here, I compare marginal effects generated under both types of specifications. I study how a political initiative to make firms less environmentally damaging implemented through public procurement influences Swedish firms’ behavior. The collected evidence brings about a statistically as well as economically significant effect on firms’ bids and costs. Paper [III] embarks by noting that auction theory suggests that as the number of bidders (competition) increases, the sizes of the participants’ bids decrease. An issue in the empirical literature on auctions is which measurement(s) of competition to use. Utilizing a dataset on public procurements containing measurements on both the actual and potential number of bidders I find that a workhorse model of public procurements is best fitted to data using only actual bidders as measurement for competition. 
Acknowledging that all measurements of competition may be erroneous, I propose an instrumental variable estimator that (given my data) brings about a competition effect bounded by those generated by specifications using the actual and potential number of bidders, respectively. Also, some asymptotic results are provided for non-linear least squares estimators obtained from a dependent variable transformation model. Paper [VI] introduces a novel method to measure bidders’ costs (valuations) in descending (ascending) auctions. Based on two bounded rationality constraints bidders’ costs (valuations) are given an imperfect measurements interpretation robust to behavioral deviations from traditional rationality assumptions. Theory provides no guidance as to the shape of the cost (valuation) distributions while empirical evidence suggests them to be positively skew. Consequently, a flexible distribution is employed in an imperfect measurements framework. An illustration of the proposed method on Swedish public procurement data is provided along with a comparison to a traditional Bayesian Nash Equilibrium approach. auctions dependent variable transformation model green public procurement indirect inference instrumental variable latent variable log-generalized gamma distribution maximum likelihood measurement error non-linear least squares objective effectiveness orthogonal polynomial regression prediction simulation estimation structural estimation
63	Mesures subjectives et épidémiologie : problèmes méthodologiques liés à l'utilisation des techniques psychométriques [Subjective measurements and epidemiology: methodological issues raised by the use of psychometric techniques] Rouquette, Alexandra 09 1900 The use of subjective measurements in epidemiology has recently intensified, driven in particular by a growing desire to take individuals’ own perception of their health into account in the study of diseases and the evaluation of interventions. Psychometrics comprises the statistical methods used to develop questionnaires and to analyze the data they produce. This doctoral dissertation aimed to explore methodological issues raised by the use of psychometric techniques in epidemiology. Three empirical studies are presented. They cover (1) the validation stage of the instrument: the objective was to develop, using simulated data, a tool to determine the sample size for scale validation studies in psychiatry; (2) the mathematical properties of the resulting measurement: the objective was to compare the performance of the minimal clinically important difference of a questionnaire, computed on data from a cohort study, within either the classical test theory (CTT) framework or the item response theory (IRT) framework; (3) its use in a longitudinal design: the objective was to compare, using simulated data, the performance of a statistical method for analyzing the longitudinal course of a subjective phenomenon measured using CTT or IRT, especially when some of the items available for its measurement differ at each time of data collection. Finally, directed acyclic graphs were used to discuss, in light of the results of these three studies, the concept of information bias when subjective measurements are used in epidemiology. Subjective measurement Psychometrics Questionnaire Latent variable Epidemiology Longitudinal design Information bias Directed acyclic graphs
64	Predicting Linguistic Structure with Incomplete and Cross-Lingual Supervision Täckström, Oscar January 2013 Contemporary approaches to natural language processing are predominantly based on statistical machine learning from large amounts of text, which has been manually annotated with the linguistic structure of interest. However, such complete supervision is currently only available for the world's major languages, in a limited number of domains and for a limited range of tasks. As an alternative, this dissertation considers methods for linguistic structure prediction that can make use of incomplete and cross-lingual supervision, with the prospect of making linguistic processing tools more widely available at a lower cost. An overarching theme of this work is the use of structured discriminative latent variable models for learning with indirect and ambiguous supervision; as instantiated, these models admit rich model features while retaining efficient learning and inference properties. The first contribution to this end is a latent-variable model for fine-grained sentiment analysis with coarse-grained indirect supervision. The second is a model for cross-lingual word-cluster induction and the application thereof to cross-lingual model transfer. The third is a method for adapting multi-source discriminative cross-lingual transfer models to target languages, by means of typologically informed selective parameter sharing. The fourth is an ambiguity-aware self- and ensemble-training algorithm, which is applied to target language adaptation and relexicalization of delexicalized cross-lingual transfer parsers. The fifth is a set of sequence-labeling models that combine constraints at the level of tokens and types, and an instantiation of these models for part-of-speech tagging with incomplete cross-lingual and crowdsourced supervision. 
In addition to these contributions, comprehensive overviews are provided of structured prediction with no or incomplete supervision, as well as of learning in the multilingual and cross-lingual settings. Through careful empirical evaluation, it is established that the proposed methods can be used to create substantially more accurate tools for linguistic processing, compared both to unsupervised methods and to recently proposed cross-lingual methods. The empirical support for this claim is particularly strong in the latter case; our models for syntactic dependency parsing and part-of-speech tagging achieve the hitherto best published results for a wide range of target languages, in the setting where no annotated training data is available in the target language. linguistic structure prediction structured prediction latent-variable model semi-supervised learning multilingual learning cross-lingual learning indirect supervision partial supervision ambiguous supervision part-of-speech tagging dependency parsing named-entity recognition sentiment analysis
65	Essays on travel mode choice modeling : a discrete choice approach of the interactions between economic and behavioral theories / Essais sur la modélisation du choix modal : une approche par les choix discrets des interactions entre théories économiques et comportementales Bouscasse, Hélène 09 November 2017 The objective of this thesis is to incorporate aspects of psychology and behavioral economics theories in discrete choice models to promote a better understanding of mode choice at the regional level. The estimations are based on a choice experiment survey presented in Part I. Part II examines the inclusion of latent variables to explain mode choice. A literature review of integrated choice and latent variable models (that is, models combining a structural equation model and a discrete choice model) is followed by the estimation of such a model to show how the heterogeneity of economic outputs (here, value of time) can be explained with latent variables (here, perceived comfort in public transport) and observable variables (here, the guarantee of a seat). The simulation of scenarios shows, however, that the economic gain (decrease in value of time) is higher when policies address tangible factors than when they address latent factors. On the basis of a mediation model, the estimation of a structural equation model furthermore shows that the influence of environmental concern on mode choice habits is partially mediated by the indirect utility derived from public transport use. Part III examines two utility formulations taken from behavioral economics: 1) rank-dependent utility to model risky choices, and 2) reference-dependent utility. Firstly, a rank-dependent utility model is included in discrete choice models and, in particular, a latent-class model, in order to analyze intra- and inter-individual heterogeneity when the travel time is subject to variability. The results show that the probability of a delay is over-estimated for train travel and under-estimated for car travel, especially for car users, as train users are more likely to take into account the expected travel time. In the models that account for risk aversion, the utility functions are convex, which implies a decrease in value of time. Secondly, a new family of discrete choice models generalizing the multinomial logit model, the reference models, is estimated. On my data, these models allow for a better selection of explanatory variables than the multinomial logit model and a more robust estimation of economic outputs, particularly in cases of high unobserved heterogeneity. The economic formulation of reference models shows that the best empirical models are also the most compatible with Tversky and Kahneman’s reference-dependent model. Mode choice Latent variables Discrete choice models Integrated choice latent variable models Value of time Risky choices Reference models 330
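The abstract of entry 65 refers to the multinomial logit model and to the value of time without showing either. As a rough illustration only, the sketch below computes logit choice probabilities and a value of time as the ratio of a time coefficient to a cost coefficient under a linear utility. The coefficients, mode names, and attribute values are hypothetical and are not taken from the thesis.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities for one decision maker."""
    # Subtracting the largest utility avoids overflow and leaves the result unchanged.
    v_max = max(utilities.values())
    exp_v = {mode: math.exp(v - v_max) for mode, v in utilities.items()}
    total = sum(exp_v.values())
    return {mode: e / total for mode, e in exp_v.items()}


def value_of_time(beta_time, beta_cost):
    """Marginal rate of substitution between time and cost (money per unit of time)."""
    return beta_time / beta_cost


# Hypothetical coefficients: utility per minute and utility per euro.
BETA_TIME = -0.04
BETA_COST = -0.20

# Hypothetical mode attributes: (travel time in minutes, cost in euros).
modes = {"train": (50.0, 12.0), "car": (40.0, 18.0)}
utilities = {m: BETA_TIME * t + BETA_COST * c for m, (t, c) in modes.items()}

probs = mnl_probabilities(utilities)
vot_per_hour = 60.0 * value_of_time(BETA_TIME, BETA_COST)  # euros per hour
```

With a utility that is linear in time and cost, the value of time is constant across travellers; the models described in the abstract relax exactly this kind of assumption.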
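66	拔靴法在線性結構關係模式適合度指標之應用 / Bootstrap procedures for evaluating goodness-of-fit indices of linear structural equation models 羅靖霖, Lo, Chin Lin Unknown Date The linear structural relations model is a statistical method that uses several linear equations to analyze causal relationships among variables; it combines the strengths of path analysis and factor analysis and integrates them into a single overall model. Once the parameters of such a model have been estimated, the quality of the model as a whole must be assessed, and many researchers have therefore proposed goodness-of-fit indices for this purpose, such as the commonly used chi-square test, the root mean square residual, the goodness-of-fit index, the adjusted goodness-of-fit index, and the normed fit index. Some of these indices are affected by the sample size or the sample distribution; some are affected by the number of latent variables or the number of indicators per factor; some apply only under strict conditions and assumptions (for example, the sample must follow a normal distribution); and the distributions of some are unknown, so that interval estimation, hypothesis testing, or tests of significant differences are not possible for them. In view of these shortcomings, this thesis uses bootstrap resampling to obtain bootstrap distributions that address these problems. The conventional bootstrap, however, is not applicable to linear structural relations models, so a modified bootstrap procedure is proposed, and the resulting bootstrap distribution serves as the basis for assessing model quality. The modified bootstrap is also used to test for significant differences between nested models, and the concepts of sampling error and non-sampling error are used to evaluate model fit.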
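Entry 66 concerns bootstrap distributions of goodness-of-fit indices. The sketch below shows only the conventional (naive) nonparametric bootstrap with a percentile interval, applied to a toy root-mean-square-residual statistic; it is not the modified procedure the thesis proposes, and the function names and data are hypothetical.

```python
import math
import random


def rmsr(cases):
    """Toy fit statistic: root mean square residual over (observed, predicted) pairs."""
    return math.sqrt(sum((obs - pred) ** 2 for obs, pred in cases) / len(cases))


def bootstrap_replicates(sample, statistic, n_boot=1000, seed=0):
    """Resample cases with replacement and recompute the statistic each time."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_boot):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return replicates


def percentile_interval(replicates, alpha=0.05):
    """Percentile interval taken from the sorted bootstrap replicates."""
    ordered = sorted(replicates)
    k = len(ordered)
    lower = ordered[math.floor((alpha / 2) * (k - 1))]
    upper = ordered[math.ceil((1 - alpha / 2) * (k - 1))]
    return lower, upper


# Hypothetical (observed, model-predicted) pairs.
cases = [(1.2, 1.0), (0.8, 1.1), (2.5, 2.0), (1.9, 2.2), (3.1, 3.0), (2.7, 3.3)]
replicates = bootstrap_replicates(cases, rmsr, n_boot=500, seed=1)
interval = percentile_interval(replicates)
```

The same two functions accept any statistic that maps a list of cases to a number, so a different fit index could be substituted for `rmsr`.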
67	Probabilistic models in noisy environments : and their application to a visual prosthesis for the blind Archambeau, Cédric 26 September 2005 In recent years, probabilistic models have become fundamental techniques in machine learning. They are successfully applied in various engineering problems, such as robotics, biometrics, brain-computer interfaces or artificial vision, and will gain in importance in the near future. This work deals with the difficult but common situation where the data are either very noisy or scarce compared to the complexity of the process to model. We focus on latent variable models, which can be formalized as probabilistic graphical models and learned by the expectation-maximization algorithm or its variants (e.g., variational Bayes). After having carefully studied a non-exhaustive list of multivariate kernel density estimators, we established that in most applications locally adaptive estimators should be preferred. Unfortunately, these methods are usually sensitive to outliers and often have too many parameters to set. Therefore, we focus on finite mixture models, which do not suffer from these drawbacks provided some structural modifications are made. Two questions are central in this dissertation: (i) how to make mixture models robust to noise, i.e. deal efficiently with outliers, and (ii) how to exploit side-channel information, i.e. additional information intrinsic to the data. In order to tackle the first question, we extend the training algorithms of the popular Gaussian mixture models to the Student-t mixture models. The Student-t distribution can be viewed as a heavy-tailed alternative to the Gaussian distribution, the robustness being tuned by an extra parameter, the degrees of freedom. Furthermore, we introduce a new variational Bayesian algorithm for learning Bayesian Student-t mixture models. This algorithm leads to very robust density estimators and clustering. 
To address the second question, we introduce manifold constrained mixture models. This new technique exploits the information that the data lie on a manifold of lower dimension than the dimension of the feature space. Taking the implicit geometrical data arrangement into account results in better generalization on unseen data. Finally, we show that the latent variable framework used for learning mixture models can be extended to construct probabilistic regularization networks, such as the Relevance Vector Machines. Subsequently, we make use of these methods in the context of an optic nerve visual prosthesis to restore partial vision to blind people whose optic nerve is still functional. Although visual sensations can be induced electrically in the blind's visual field, the coding scheme of the visual information along the visual pathways is poorly known. Therefore, we use probabilistic models to link the stimulation parameters to the features of the visual perceptions. Both black-box and grey-box models are considered. The grey-box models take advantage of the known neurophysiological information and are more instructive to medical doctors and psychologists. Visual prosthesis Nonparametric density estimation Optic nerve Variational Bayes Expectation-maximization Bayesian learning Finite mixture models Rehabilitation Geodesics Manifold constrained models Regularization networks Probabilistic graphical models Latent variable models Robustness to noise
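Entry 67 describes the Student-t distribution as a heavy-tailed alternative to the Gaussian whose robustness is tuned by the degrees of freedom. A minimal sketch of why this helps with outliers, assuming a single univariate Student-t component with fixed scale and fixed degrees of freedom (a simplification, not the mixture or variational algorithm of the thesis): the EM update of the location is a weighted mean in which distant points receive small weights. The data and parameter values are hypothetical.

```python
import statistics


def student_t_location(data, nu=3.0, scale=1.0, n_iter=50):
    """EM-style location estimate for a univariate Student-t with fixed nu and scale."""
    mu = statistics.median(data)  # robust starting point
    for _ in range(n_iter):
        # E-step: expected precision weights; far-away points get weights near zero.
        weights = [(nu + 1.0) / (nu + ((x - mu) / scale) ** 2) for x in data]
        # M-step: the location update is the weighted mean of the data.
        mu = sum(w * x for w, x in zip(weights, data)) / sum(weights)
    return mu


# Hypothetical sample: five points near zero and one gross outlier.
data = [-0.2, -0.1, 0.0, 0.1, 0.2, 100.0]
robust_location = student_t_location(data)
gaussian_location = statistics.mean(data)  # maximum-likelihood location under a Gaussian
```

Larger values of `nu` make the weights more uniform, so the estimate moves toward the ordinary mean; smaller values discount outliers more strongly.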
68	Composite Likelihood Estimation for Latent Variable Models with Ordinal and Continuous, or Ranking Variables Katsikatsou, Myrsini January 2013 The estimation of latent variable models with ordinal and continuous, or ranking variables is the research focus of this thesis. The existing estimation methods are discussed and a composite likelihood approach is developed. The main advantages of the new method are its low computational complexity, which remains unchanged regardless of the model size, and that it yields an asymptotically unbiased, consistent, and normally distributed estimator. The thesis consists of four papers. The first one investigates the two main formulations of the unrestricted Thurstonian model for ranking data along with the corresponding identification constraints. It is found that the extra identification constraints required in one of them lead to unreliable estimates unless the constraints coincide with the true values of the fixed parameters. In the second paper, pairwise likelihood (PL) estimation is developed for factor analysis models with ordinal variables. The performance of PL is studied in terms of bias and mean squared error (MSE) and compared with that of the conventional estimation methods via a simulation study and through some real data examples. It is found that the PL estimates and standard errors have very small bias and MSE, both decreasing with the sample size, and that the method is competitive with the conventional ones. The results of the first two papers lead to the next one, where PL estimation is adjusted to the unrestricted Thurstonian ranking model. As before, the performance of the proposed approach is studied through a simulation study with respect to relative bias and relative MSE and in comparison with the conventional estimation methods. The conclusions are similar to those of the second paper. 
The last paper extends the PL estimation to the whole structural equation modeling framework where data may include both ordinal and continuous variables as well as covariates. The approach is demonstrated through an example run in R software. The code used has been incorporated in the R package lavaan (version 0.5-11). latent variable models factor analysis structural equation models Thurstonian model item response theory composite likelihood estimation pairwise likelihood estimation maximum likelihood weighted least squares ordinal variables ranking variables lavaan
70	Computer experiments: design, modeling and integration Qian, Zhiguang 19 May 2006 The use of computer modeling is fast increasing in almost every scientific, engineering and business arena. This dissertation investigates some challenging issues in the design, modeling and analysis of computer experiments, and consists of four major parts. In the first part, a new approach is developed to combine data from approximate and detailed simulations to build a surrogate model based on some stochastic models. In the second part, we propose some Bayesian hierarchical Gaussian process models to integrate data from different types of experiments. The third part concerns the development of latent variable models for computer experiments with multivariate responses, with application to data center temperature modeling. The last part is devoted to the development of nested space-filling designs for multiple experiments with different levels of accuracy. Computer experiment Gaussian process models Metamodels Surrogate models Design of experiments Orthogonal arrays Structural equation model Latent variable model Data center Mathematical models Approximation theory Decision support systems Digital computer simulation Experimental design Gaussian processes
