61 |
Multiple Calibrations in Integrative Data Analysis: A Simulation Study and Application to Multidimensional Family Therapy
Hall, Kristin Wynn 01 January 2013
A recent advancement in statistical methodology, Integrative Data Analysis (IDA; Curran & Hussong, 2009), has led researchers to employ a calibration technique so as not to violate an independence assumption. This technique uses a randomly selected subset of the whole data set with a simplified correlational structure, called a calibration, in a preliminary stage of analysis. However, a single-calibration estimator suffers from instability, low precision, and loss of power. To overcome this limitation, a multiple calibration approach (MC; Greenbaum et al., 2013; Wang et al., 2013) has been developed to produce better estimators while still removing a level of dependency in the data so as not to violate the independence assumption. The MC method is conceptually similar to multiple imputation (MI; Rubin, 1987; Schafer, 1997), so MI estimators were borrowed for comparison.
A simulation study was conducted to compare the MC and MI estimators and to evaluate the operating characteristics of the methods in a cross-classified design of data characteristics. The estimators were tested in the context of assessing change over time in a longitudinal data set. Multiple calibrations, each consisting of a single measurement occasion per subject, were drawn from a repeated measures data set, analyzed separately, and then combined by the rules set forth by each method to produce the final results. The data characteristics investigated were effect size, sample size, and the number of repeated measures per subject. Additionally, a real data application of an MC approach in an IDA framework was conducted on data from three completed randomized controlled trials studying the treatment effects of Multidimensional Family Therapy (MDFT; Liddle et al., 2002) on substance use trajectories for adolescents at a one-year follow-up.
The simulation study provided empirical evidence of how the MC method performs, and of how it compares to the MI method, in a total of 27 hypothetical scenarios. The bias, standard error, mean square error, and relative efficiency of an MC estimator showed strong asymptotic tendencies to approach those of the whole-set estimators as the number of calibrations approached 100. The MI combination rules proved inappropriate to borrow for the MC case because the standard error formulas were too conservative and performance with respect to power was not robust. As a general suggestion, 5 calibrations are sufficient to produce an estimator with about half the bias of a single-calibration estimator and at least some indication of significance, while 20 calibrations are ideal. After 20 calibrations, the contribution of an additional calibration to the combined estimator greatly diminished.
The MDFT application demonstrated a successful implementation of a 5-calibration approach in an IDA on real data, as well as the risk of missing treatment effects when analysis is limited to a single calibration's results. Additionally, results from the application provided evidence that MDFT interventions reduced the trajectories of substance use involvement at a 1-year follow-up to a greater extent than any of the active control treatment groups, overall and across all gender and ethnicity subgroups. This paper will aid researchers interested in employing an MC approach in an IDA framework, or whenever a level of dependency in a data set needs to be removed for an independence assumption to hold.
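The draw-analyze-combine procedure described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout, the function names, the use of an ordinary least squares slope as the per-calibration analysis, and the simple averaging of point estimates are all hypothetical choices, not the combination rules derived in the thesis. Rubin's (1987) total variance is computed only as the borrowed MI comparison, which the abstract reports to be too conservative for the MC case.

```python
import random
import statistics
from collections import defaultdict


def draw_calibration(records, rng):
    """Pick one (time, y) measurement at random for each subject.

    records: iterable of (subject_id, time, y) tuples (hypothetical layout).
    """
    by_subject = defaultdict(list)
    for subject_id, time, y in records:
        by_subject[subject_id].append((time, y))
    return [rng.choice(rows) for rows in by_subject.values()]


def slope_and_variance(pairs):
    """OLS slope of y on time and the squared standard error of that slope."""
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pairs)
    slope = sum((t - mean_t) * (y - mean_y) for t, y in pairs) / sxx
    rss = sum((y - mean_y - slope * (t - mean_t)) ** 2 for t, y in pairs)
    return slope, rss / (n - 2) / sxx


def multiple_calibration(records, m=20, seed=0):
    """Draw m calibrations, analyze each separately, then combine.

    Assumption: point estimates are combined by simple averaging.
    """
    rng = random.Random(seed)
    fits = [slope_and_variance(draw_calibration(records, rng)) for _ in range(m)]
    slopes = [slope for slope, _ in fits]
    pooled = statistics.fmean(slopes)
    within = statistics.fmean([variance for _, variance in fits])
    between = statistics.variance(slopes)
    # Rubin's rule for the MI total variance, shown for comparison only.
    rubin_total = within + (1 + 1 / m) * between
    return pooled, within, between, rubin_total
```

Each calibration contains exactly one row per subject, so observations within a calibration are independent across subjects, which is the property the technique is meant to secure.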
|
62 |
Fatores de risco para nascimento pré-termo - uma análise com modelagem de equações estruturais / Risk factors of preterm birth - an analysis with structural equation modeling
Adelaide Alves de Oliveira 20 October 2014
Introduction: Preterm birth is one of the main factors associated with mortality and morbidity in the perinatal period. Its prevalence is increasing in many locations, and the reasons involve a complex interplay of biological, health care, psychological, social, and economic factors, among many others. Objective: This study analyzes the effects of variables associated with preterm birth via structural equation modeling (SEM), constructing latent variables for socioeconomic vulnerability, family vulnerability, psychosocial vulnerability, maternal conditions, and complications. Further objectives are to analyze the various sources from which gestational age (GA) is obtained, in order to generate a latent variable for GA, and to compare models with different types of outcome variable (continuous, ordinal, binary, and latent). Methods: The study was based on a population-based case-control study of hospital live births to mothers residing in the city of Londrina, state of Paraná, conducted between June 2006 and March 2007. Starting from a conceptual model, SEM was used to build models with latent and composite variables, assessed with fit measures. Results: The final model showed the following direct effects on GA: complications, inadequate prenatal care, multiple gestation, reproductive condition, BMI, alcohol consumption, walking, and physical exertion. Besides its direct effect, prenatal care acted as a mediating variable between the latent variables Socioeconomic Vulnerability, Non-Acceptance of Pregnancy, and Family Vulnerability and the outcome GA. Conclusions: Using SEM required revisiting the role of the variables proposed in the original study, which suggests that, to apply the methodology, latent variables should be formulated a priori in the planning phase of a study. Applying SEM made it possible to build latent and composite variables from the original study and to identify direct, indirect, and mediating effects on GA. Furthermore, it was possible to build and use the latent variable GA from the recording sources.
|
63 |
Aplicações de cópulas em modelos de riscos múltiplos dependentes e em modelos de misturas de distribuições / Applications of copula to polyhazard models with dependence and mixture models
Tsai, Rodrigo, 1974- 30 November 2012
Advisor: Luiz Koodi Hotta / Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Matematica, Estatistica e Computação Cientifica
Previous issue date: 2012 / Abstract: In this work, we discuss the application of copulas to polyhazard models with dependence and to mixture models. First, we analyse the inclusion of dependence among failure causes in polyhazard models. Polyhazard models constitute a family of flexible models for lifetime data. Their main advantages over single-hazard models include the ability to represent hazard rate functions with unusual shapes and the ease of including covariates. The main purpose of this first part is to model the dependence among the latent causes of failure in the polyhazard model by means of copula functions. The choice of the copula function, together with the latent failure time distributions, produces a flexible class of survival distributions able to represent hazard functions with unusual shapes, such as bathtub or multimodal curves, while also capturing local effects arising from the competing risks. Model identification and estimation are also discussed. By dropping the restriction of positive support for the latent variables, the method can be used to generate a rich family of univariate distributions with asymmetries and multiple modes. In the second part, a generalized mixture model using copula functions is proposed. The copula parameter is used to define asymmetry shapes and to give more or less weight to chosen regions of the support of the component distributions when assembling the mixture. The weights of the component distributions vary over the support of the distribution and are not restricted to sum to one.
The resulting model increases the flexibility of mixture models in representing data whose densities have various multimodal and asymmetric shapes. Special cases of the model are the traditional mixture model, the polyhazard model, and the cure fraction model. The models are applied to simulated data and to real data from the literature. Estimation was done by maximum likelihood, and model selection used the Akaike and Bayesian criteria. The proposed models fit the analysed data sets well in comparison with other methodologies proposed in the literature / Doctorate / Statistics / Doctor in Statistics
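As a rough illustration of the first part of this abstract, the sketch below builds a two-risk polyhazard model in which the observed lifetime is the minimum of two latent Weibull failure times. With independent latent times the hazard is the sum of the two Weibull hazards, which already yields a bathtub shape. Dependence is then introduced by passing the marginal survival functions through a copula. The Clayton copula, the Weibull margins, and all parameter values and function names are assumptions chosen for illustration; the abstract does not say which copulas or margins the thesis uses.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def clayton(u, v, theta):
    """Clayton copula C(u, v) for theta > 0 (positive dependence)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def system_survival(t, risk1, risk2, theta=None):
    """P(min(T1, T2) > t); theta=None means independent latent times."""
    s1 = weibull_survival(t, *risk1)
    s2 = weibull_survival(t, *risk2)
    return s1 * s2 if theta is None else clayton(s1, s2, theta)


def hazard(t, risk1, risk2, theta=None, eps=1e-5):
    """Numerical hazard -d/dt log S(t), via a central difference."""
    upper = math.log(system_survival(t + eps, risk1, risk2, theta))
    lower = math.log(system_survival(t - eps, risk1, risk2, theta))
    return -(upper - lower) / (2 * eps)


# Illustrative (shape, scale) pairs: one decreasing and one increasing hazard.
EARLY_FAILURES = (0.5, 1.0)
WEAR_OUT = (3.0, 2.0)
```

Calling `hazard(t, EARLY_FAILURES, WEAR_OUT)` over a grid of `t` traces the bathtub curve, and passing a positive `theta` shows how dependence between the latent causes changes its shape.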
|
64 |
Chemometric Approaches for Systems Biology
Folch Fortuny, Abel 23 January 2017
The present Ph.D. thesis is devoted to studying, developing and applying approaches commonly used in chemometrics to the emerging field of systems biology. Existing procedures and new methods are applied to solve research and industrial questions in different multidisciplinary teams. The methodologies developed in this document will enrich the plethora of procedures employed within omic sciences to understand biological organisms and will improve processes in biotechnological industries by integrating biological knowledge at different levels and exploiting the software packages derived from the thesis.
This dissertation is structured in four parts. The first block describes the framework on which the contributions presented here are based. The objectives of the two research projects related to this thesis are highlighted, and the specific topics addressed in this document via conference presentations and research articles are introduced. A comprehensive description of omic sciences and their relationships within the systems biology paradigm is given in this part, together with a review of the most widely applied multivariate methods in chemometrics, on which the novel approaches proposed here are founded.
The second part addresses many problems of data understanding within metabolomics, fluxomics, proteomics and genomics. Different alternatives are proposed in this block to understand flux data in steady state conditions. Some are based on applications of multivariate methods previously applied in other areas of chemometrics. Others are novel approaches based on a bilinear decomposition using elemental metabolic pathways, from which a GNU-licensed toolbox is made freely available to the scientific community. In addition, a framework for metabolic data understanding is proposed for non-steady state data, using the same bilinear decomposition proposed for steady state data but modelling the dynamics of the experiments with novel two- and three-way data analysis procedures. The relationships between different omic levels are also assessed in this part by integrating different sources of information on plant viruses in data fusion models. Finally, an example of interaction between organisms, oranges and fungi, is studied via multivariate image analysis techniques, with future application in food industries.
The third block of this thesis is a thorough study of different missing data problems related to chemometrics, systems biology and industrial bioprocesses. In the theoretical chapters of this part, new algorithms are proposed to obtain multivariate exploratory and regression models in the presence of missing data; these also serve as preprocessing steps for any other methodology used by practitioners. Regarding applications, this block explores the reconstruction of networks in omic sciences when missing and faulty measurements appear in databases, and how calibration models can be transferred between near infrared instruments, avoiding costly and time-consuming full recalibrations in bioindustries and research laboratories. Finally, another software package, including a graphical user interface, is made freely available for missing data imputation purposes.
The last part discusses the relevance of this dissertation for research and biotechnology, including proposals deserving future research. / Folch Fortuny, A. (2016). Chemometric Approaches for Systems Biology [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/77148 / Extraordinary Doctoral Thesis Awards (Premios Extraordinarios de tesis doctorales)
|
65 |
Mixture model analysis with rank-based samples
Hatefi, Armin January 2013
Simple random sampling (SRS) is the most commonly used sampling design in data collection. In many applications (e.g., in fisheries and medical research), quantification of the variable of interest is either time-consuming or expensive, but ranking a number of sampling units, without actually measuring them, can be done relatively easily and at low cost. In these situations, one may use rank-based sampling (RBS) designs to obtain more representative samples from the underlying population and improve the efficiency of statistical inference. In this thesis, we study the theory and application of finite mixture models (FMMs) under RBS designs. In Chapter 2, we study the problems of maximum likelihood (ML) estimation and classification in a general class of FMMs under different ranked set sampling (RSS) designs. In Chapter 3, deriving the Fisher information (FI) content of different RSS data structures, including complete and incomplete RSS data, we show that the FI contained in each variation of the RSS data about different features of FMMs is larger than the FI contained in their SRS counterparts. There are situations where it is difficult to rank all the sampling units in a set with high confidence. Forcing rankers to assign unique ranks to the units (as in RSS) can lead to substantial ranking error and consequently to poor statistical inference. We hence focus on the partially rank-ordered set (PROS) sampling design, which aims to reduce the ranking error and the burden on rankers by allowing them to declare ties (partially ordered subsets) among the sampling units. Studying the information and uncertainty structures of PROS data in a general class of distributions, in Chapter 4, we show the superiority of the PROS design in data analysis over RSS and SRS schemes. In Chapter 5, we also investigate the ML estimation and classification problems of FMMs under the PROS design.
Finally, we apply our results to estimate the age structure of a short-lived fish species based on the length frequency data, using SRS, RSS and PROS designs.
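To make the sampling designs concrete, here is a minimal sketch of balanced ranked set sampling next to simple random sampling, drawing from a two-component normal mixture. Everything in it is an illustrative assumption: the mixture parameters and function names are hypothetical, and units are ranked by their true values (perfect ranking), whereas in practice ranking is done by judgment without measurement and may contain errors. The PROS design with ties is not covered.

```python
import random
import statistics


def draw_mixture_unit(rng, weight=0.6, means=(0.0, 3.0), sd=1.0):
    """One draw from a two-component normal mixture (illustrative parameters)."""
    component = 0 if rng.random() < weight else 1
    return rng.gauss(means[component], sd)


def ranked_set_sample(draw_unit, set_size, cycles, rng):
    """Balanced RSS under perfect ranking.

    For each rank r in each cycle, a fresh set of `set_size` units is drawn
    and ranked, and only the unit with rank r is kept as a measurement.
    """
    measured = []
    for _ in range(cycles):
        for rank in range(set_size):
            candidates = sorted(draw_unit(rng) for _ in range(set_size))
            measured.append(candidates[rank])
    return measured


def simple_random_sample(draw_unit, size, rng):
    """SRS with the same number of measured units, for comparison."""
    return [draw_unit(rng) for _ in range(size)]
```

Repeating both designs many times with the same number of measured units and comparing the variance of the sample means gives a quick empirical view of the efficiency gain the abstract refers to.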
|
66 |
The application of latent variable models to the assessment of determinants of HIV risk behavior
Smolenski, Derek Joseph. Risser, Jan Mary Hale, Stigler, Melissa H., Diamond, Pamela M. January 2009
Thesis (Ph. D.)--University of Texas Health Science Center at Houston, School of Public Health, 2009. / Advisor: Michael W. Ross. Includes bibliographical references.
|
67 |
[en] MONITORING THE SPREAD OF MULTIVARIATE PROCESSES USING PROJECTIONS OF THE OBSERVABLE VARIABLES VECTOR / [pt] MONITORAMENTO DA DISPERSÃO DE PROCESSOS MULTIVARIADOS POR PROJEÇÕES DO VETOR DE VARIÁVEIS OBSERVADAS
SERGIO FERREIRA BASTOS 01 December 2016
[en] In multivariate processes, there are several observable variables to be controlled. This work assumes that loss of control is due to special causes acting on independent sources of variation, each of which can be represented by an unobservable, or latent, random variable. A change in the mean of one of these variables, or an increase in its dispersion, results, respectively, in a shift of the mean of the vector x of observable variables along a specific assignable direction, or in an increase in the variability of x in that direction. We therefore propose to monitor the dispersion of such multivariate processes with control charts for the standard deviation of the projections of the vector of observed variables onto specific directions, associated with changes in the unobservable latent variables of the process. We call these directions assignable directions. Charts for the mean squared norm of a residual vector were also developed, to signal the occurrence of new sources of variation, still unknown to the process, that increase the variability of x in directions not contained in the subspace of assignable directions. The proposed scheme proved effective for the statistical control of special causes acting on the sources of variation of the process, with the added benefit of automatically identifying the affected latent variable.
|
68 |
Approche EM pour modèles multi-blocs à facteurs à une équation structurelle / EM estimation of a structural equation model
Tami, Myriam 12 July 2016
Latent variable structural equation models make it possible to model relationships between observable and unobservable variables. The two current paradigms for estimating these models are component-based partial least squares (PLS) and covariance structure analysis (LISREL). In this work, after describing these two main estimation methods, we propose an estimation approach that uses the EM algorithm to maximize the full likelihood of a model with latent factors and a single structural equation. We study its performance on simulated data and show, through an application to real environmental data, how to build such a model in practice and assess its quality. Finally, we apply the approach in the context of an oncology clinical trial to study longitudinal health-related quality-of-life data. We show that, by efficiently reducing the dimension of the data, the EM approach simplifies the longitudinal analysis of quality of life by avoiding multiple testing, and thus helps in evaluating the clinical benefit of a treatment.
|
69 |
Comparing Three Effect Sizes for Latent Class Analysis
Granado, Elvalicia A. 12 1900
Traditional latent class analysis (LCA) considers entropy R2 as the only measure of effect size. However, entropy may not always be reliable, a lower bound is not agreed upon, and good separation is limited to values greater than .80. As applications of LCA grow in popularity, it is imperative to use additional sources to quantify LCA classification accuracy. Greater classification accuracy helps to ensure that the profile of the latent classes reflects the profile of the true underlying subgroups. This Monte Carlo study compared the quantification of classification accuracy and the confidence intervals of three effect sizes: entropy R2, I-index, and Cohen’s d. Study conditions included total sample size, number of dichotomous indicators, latent class membership probabilities (γ), conditional item-response probabilities (ρ), variance ratio, sample size ratio, and distribution types for a 2-class model. Overall, entropy R2 and I-index showed the best accuracy and standard error, along with the smallest confidence interval widths. Results showed that I-index only performed well for a few cases.
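For readers unfamiliar with the entropy statistic, the sketch below computes the normalized (relative) entropy commonly reported by LCA software from a matrix of posterior class-membership probabilities: one minus the total classification entropy divided by its maximum, n ln K. It is assumed here that this is what the abstract calls entropy R2; the thesis may define or adjust it differently, and the function name is hypothetical. The I-index and Cohen's d are not shown.

```python
import math


def relative_entropy(posteriors):
    """Normalized entropy for a latent class solution.

    posteriors: list of rows, one per subject, each holding the posterior
    probabilities of membership in the K classes (each row sums to 1).
    Returns 1 for perfectly separated classes and 0 when every posterior
    is uniform across classes.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    # Zero probabilities contribute nothing, since p * log(p) -> 0 as p -> 0.
    total = -sum(p * math.log(p) for row in posteriors for p in row if p > 0)
    return 1.0 - total / (n * math.log(k))
```

Values near 1 indicate that subjects are assigned to classes with little ambiguity, which is why thresholds such as the .80 mentioned above are used as rules of thumb.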
|
70 |
The Relationship of Self-Efficacy, Self-Advocacy, and Multicultural Counseling Competency of School Counselors: A Structural Equation Model
Aydogan, Mustafa 06 August 2021
No description available.
|