  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Variable selection and other extensions of the mixture model clustering framework /

Dean, Nema, January 2006
Thesis (Ph. D.)--University of Washington, 2006. / Vita. Includes bibliographical references (p. 115-121).
42

Uso de aprendizado de maquina para estimar esforço de execução de testes funcionais / Using machine learning to estimate execution effort of functional tests

Silva, Daniel Guerreiro e, 1983- 15 August 2018
Advisor: Mario Jino / Dissertation (master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Eletrica e de Computação / Made available in DSpace on 2018-08-15T04:58:41Z (GMT). No. of bitstreams: 1 Silva_DanielGuerreiroe_M.pdf: 2351174 bytes, checksum: 7f8ba90b6462fe7be00711143e365482 (MD5) Previous issue date: 2009 / Abstract: Planning and scheduling of testing activities play a key role for any independent test team that tests different software systems produced by different development teams. Since the effort applied in the test process can amount to up to half of the total effort of software development, adequate estimation of test effort can prevent unnecessary costs and improve the quality of delivered products. To overcome this challenge, machine learning tools have been used in research to estimate effort and to solve other software engineering problems, mainly because these problems are complex and classical mathematical approaches have many limitations in solving them. This work studies the application of machine learning tools (artificial neural networks and support vector machines) and of variable selection tools to the problem of estimating the execution effort of functional tests. The test execution process is analyzed, and experiments on two real databases are used to propose a suitable methodology for tackling the problem systematically, considering both the quality of results and ease of application. The main contributions of this work are: the proposal of applying variable selection for database synthesis; the adoption of an artificial neural network trained with an asymmetric cost function; and a comparative performance study of the predictive models / Master's / Computer Engineering / Master in Electrical Engineering
43

Estimação robusta em modelos de variáveis latentes para dados censurados / Robust estimation in latent variable models for censored data

Costa, Denise Reis, 1985- 12 February 2013
Advisor: Víctor Hugo Lachos Dávila / Thesis (doctorate) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Made available in DSpace on 2018-08-24T00:22:53Z (GMT). No. of bitstreams: 1 Costa_DeniseReis_D.pdf: 5095534 bytes, checksum: 5bb05e15edf36db32e78eff601b757db (MD5) Previous issue date: 2013 / Abstract: Latent variable models are broadly used by psychometricians, econometricians and social science researchers to model variables that cannot be directly measured, known as constructs or random effects (Skrondal and Rabe-Hesketh, 2004). In the literature, such variables are commonly modeled with a normal distribution, but this assumption may be inadequate, especially when there are outliers. Concerned with the sensitivity of inferences in the presence of potential outliers or of data from heavy-tailed distributions, this thesis proposes robust inference methods, using the multivariate Student's t distribution, for two types of latent variable models: the Generalized Linear Mixed Model for correlated binary data (GLMM) and the Tobit Confirmatory Factor Analysis (TCFA) for continuous and censored data. To estimate the parameters of the models studied, an EM-type algorithm is proposed. This algorithm has closed-form expressions in the E-step, which use the first two moments of a multivariate truncated t distribution. Moreover, we present a Bayesian approach and propose influence diagnostic measures for censored data under the TCFA model when normality is assumed. To evaluate the proposed methods, simulation studies were carried out, along with applications to real data sets. / Doctorate / Statistics / Doctor of Statistics
44

Contribuições ao estudo do modelo de resposta nominal / Contributions to the study of nominal response model

Pereira, Sheila Regina dos Santos, 1981- 02 October 2012
Advisors: Caio Lucidius Neberezny Azevedo, Hildete Prisco Pinheiro / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Made available in DSpace on 2018-08-20T00:05:22Z (GMT). No. of bitstreams: 1 Pereira_SheilaReginadosSantos_M.pdf: 25409754 bytes, checksum: 462fcbdec414321c70cbff248acc4ed1 (MD5) Previous issue date: 2012 / Abstract: In the educational field there is growing interest in applying techniques derived from Item Response Theory (IRT), since this methodology has been used in qualitative processes of psychological and educational assessment. However, in most assessments that use multiple-choice items, responses are commonly reduced to right-or-wrong patterns so that these models can be applied. Dichotomizing an individual's responses ignores any partial knowledge he or she may have of the correct answer, because it implicitly assumes that the individual either has the knowledge to choose the correct alternative or lacks it and selects one of the alternatives at random. As a result, the information carried by partial knowledge is not used in estimating the latent traits. The objective of this work is therefore to show the efficiency of the Nominal Response Model in estimating the latent traits of individuals tested with multiple-choice items, and to analyze and interpret the item parameters estimated by this model / Master's / Statistics / Master in Statistics
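For readers unfamiliar with the Nominal Response Model named in this abstract, a minimal sketch follows. It assumes Bock's standard parameterization, in which each response option k of an item has a slope a_k and an intercept c_k, and the option probabilities are a softmax of a_k * theta + c_k. The function name and all parameter values are hypothetical illustrations and are not taken from the dissertation.

```python
import math


def nominal_response_probs(theta, slopes, intercepts):
    """Option probabilities for one item under Bock's nominal response model.

    theta      -- latent trait of one examinee
    slopes     -- one slope a_k per response option (hypothetical values)
    intercepts -- one intercept c_k per response option (hypothetical values)
    """
    logits = [a * theta + c for a, c in zip(slopes, intercepts)]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical four-option item: option 0 is the key and option 1 is a
# distractor that attracts examinees with partial knowledge.
slopes = [1.2, 0.4, -0.6, -1.0]
intercepts = [0.0, 0.3, 0.2, -0.5]
for theta in (-2.0, 0.0, 2.0):
    probs = nominal_response_probs(theta, slopes, intercepts)
    print(theta, [round(p, 3) for p in probs])
```

Because every option keeps its own parameters, the choice of distractor still informs the trait estimate, which is the information a right-or-wrong recoding would discard.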
45

FAULT DIAGNOSIS TOOLS IN MULTIVARIATE STATISTICAL PROCESS AND QUALITY CONTROL

Vidal Puig, Santiago 01 March 2016
Accurate diagnosis of both sensor faults and real process faults has become increasingly important for process monitoring (to minimize downtime, increase the safety of plant operation and reduce manufacturing cost). Quick and correct fault diagnosis is required to put processes or products back on track before safety or quality is compromised. In studying and comparing fault diagnosis methodologies, this thesis distinguishes between two scenarios: methods for multivariate statistical quality control (MSQC) and methods for latent-based multivariate statistical process control (Lb-MSPC). The first part of the thesis introduces the state of the art in fault diagnosis and identification (FDI). The second part is devoted to fault diagnosis in MSQC. It discusses the rationale of the most widely used methods for fault diagnosis in supervised scenarios, the requirements for their implementation, their strengths and drawbacks, and the relationships among them. The performance of the methods is compared using different performance indices on two process data sets and on simulations. New variants and methods to improve diagnosis performance in MSQC are also proposed. The third part is devoted to fault diagnosis in Lb-MSPC. The rationale of the most widely used methods for fault diagnosis in supervised Lb-MSPC is described, and one of our proposals, the Fingerprints contribution plots (FCP), is introduced. Finally, the thesis presents and compares the performance of these diagnosis methods in Lb-MSPC. The diagnosis results on two process data sets are compared using a new strategy based on overall sensitivity and specificity. / Vidal Puig, S. (2016). FAULT DIAGNOSIS TOOLS IN MULTIVARIATE STATISTICAL PROCESS AND QUALITY CONTROL [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/61292 / THESIS
46

Latent Variable Models in Measurement: Theory and Application

Fang, Guanhua January 2020
Latent variable models play an important role in educational and psychological measurement, where items are presented to individuals, resulting in item response data. Such data carry important information about individual latent traits, population structure and item design, which are key components to be understood in educational and psychological assessments. This thesis focuses on the development of statistical learning methods based on latent variable models with identifiability theories. The thesis consists of three parts, with three kinds of applications in mind. The first part is on the identifiability of diagnostic classification models (DCMs), a special subfamily of latent class models. DCMs aim to assess a test taker's ability based on his or her mastery of a set of required skills. A key issue common to DCMs, and more generally to latent class models, is identifiability: whether the unknown model or related parameters can be estimated consistently under a suitably defined asymptotic regime. Most existing works focus on the identifiability of DCMs with binary responses and attributes. In this thesis, we provide general identifiability results for DCMs with polytomous responses and attributes and fewer parameter restrictions. The second part considers the identifiability of testlet factor models, a subfamily of latent variable models in which the underlying continuous latent traits are assumed to follow a normal distribution. Like DCMs, factor models suffer from identifiability issues: in general, the parameters can only be identified up to a rotation. In educational assessment, however, testlet models or bifactor models are popular in most applications. They are constrained factor models which assume that the responses to test items can be accounted for by one primary factor and multiple secondary group-specific factors. With the aid of this special structure, we show that the model can be strictly identifiable, and we provide checkable necessary and sufficient conditions accordingly. The third part focuses on statistical learning for complex problem-solving (CPS) items. With advanced computer technology, there is a new trend of developing CPS test items on online platforms, where examinees are asked to solve challenging tasks in a simulated environment. During the test, all actions performed by examinees are recorded in a log file. Therefore, we can not only observe their final responses but also access their entire solving process. This type of data is known as process data in the measurement literature. Traditional item response models are not applicable, at least not directly. The analysis of process data is still in its infancy. In the thesis, we propose a new model-based approach and show its usefulness through an interesting real data application.
47

Latent Variable Models for Events on Social Networks

Ward, Owen Gerard January 2022
Network data, particularly social network data, is widely collected in the context of interactions between users of online platforms, but it can also be observed directly, such as in the context of behaviours of animals in a group living environment. Such network data can reveal important insights into the latent structure present among the nodes of a network, such as the presence of a social hierarchy or of communities. This is generally done through the use of a latent variable model. Existing network models which are commonly used for such data often aggregate the dynamic events which occur, reducing complex dynamic events (such as the times of messages on a social network website) to a binary variable. Methods which can incorporate the continuous time component of these interactions therefore offer the potential to better describe the latent structure present. Using observed interactions between mice, we take advantage of the observed interactions’ timestamps, proposing a series of network point process models with latent ranks. We carefully design these models to incorporate important theories on animal behaviour that account for dynamic patterns observed in the interaction data, including the winner effect, bursting and pair-flip phenomena. Through iteratively constructing and evaluating these models we arrive at the final cohort Markov-Modulated Hawkes process (C-MMHP), which best characterizes all aforementioned patterns observed in interaction data. The generative nature of our model provides evidence for hypothesised phenomena and allows for additional insights compared to existing aggregate methods, while the probabilistic nature allows us to estimate the uncertainty in our ranking. In particular, our model is able to provide insights into the distribution of power within the hierarchy which forms and the strength of the established hierarchy. We compare all models using simulated and real data. 
Using statistically developed diagnostic perspectives, we demonstrate that the C-MMHP model outperforms other methods, capturing relevant latent ranking structures that lead to meaningful predictions for real data. While such network models can lead to important insights, there are inherent computational challenges for fitting network models, particularly as the number of nodes in the network grows. This is exacerbated when considering events between each pair of nodes. As such, new computational tools are required to fit network point process models to the large social networks commonly observed. We consider online variational inference for one such model. We derive a natural online variational inference procedure for this event data on networks. Using simulations, we show that this online learning procedure can accurately recover the true network structure. We demonstrate using real data that we can accurately predict future interactions by learning the network structure in this online fashion, obtaining comparable performance to more expensive batch methods.
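As background for the point process models named in this abstract, the sketch below evaluates the conditional intensity of a plain univariate Hawkes process. It is not the C-MMHP model of the thesis: the exponential kernel, the function name and all parameter values are assumptions chosen only to illustrate how past events raise the rate of future events, which is the mechanism behind bursts of interactions.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Intensity at time t: baseline mu plus decaying boosts from earlier events.

    Assumes an exponential kernel alpha * exp(-beta * (t - s)) for every
    event time s strictly before t (an illustrative choice).
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - s)) for s in event_times if s < t
    )
    return mu + excitation


# Hypothetical timestamps of interactions for one ordered pair of nodes.
events = [1.0, 1.2, 1.3, 4.0]
for t in (0.5, 1.35, 3.0, 4.05):
    print(t, round(hawkes_intensity(t, events, mu=0.2, alpha=0.8, beta=2.0), 3))
```

Aggregating the same events into a single binary edge would discard the timing information that this intensity depends on.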
48

Factors Influencing Mode Choice For Intercity Travel From Northern New England To Major Northeastern Cities

Neely, Sean Patrick 01 January 2016
Long-distance and intercity travel generally make up a small portion of the total number of trips taken by an individual, while representing a large portion of aggregate distance traveled on the transportation system. While some research exists on intercity travel behavior between large metropolitan centers, this thesis addresses a need for more research on travel behavior between non-metropolitan areas and large metropolitan centers. This research specifically considers travel from home locations in northern New England, going to Boston, New York City, Philadelphia, and Washington, DC. These trips are important for quality of life, multimodal planning, and rural economies. This research identifies and quantifies factors that influence people's mode choice (automobile, intercity bus, passenger rail, or commercial air travel) for these trips. The research uses survey questionnaire data, latent factor analysis, and discrete choice modeling methods. Factors include sociodemographic, built environment, latent attitudes, and trip characteristics. The survey, designed by the University of Vermont Transportation Research Center and the New England Transportation Institute, was conducted by Resource Systems Group, Inc. in 2014, with an initial sample size of 2560. Factor analysis was used to prepare 6 latent attitudinal factors, based on 70 attitudinal responses from the survey statements. The survey data were augmented with built environment variables using geographic information systems (GIS) analysis. A set of multinomial logit models, and a set of nested logit models, were estimated for business and non-business trip mode choice. Results indicate that for this type of travel, factors influencing mode choice for both business and non-business trips include trip distance; land use; personal use of technology; and latent attitudes about auto dependence, preference for automobile, and comfort with personal space and safety on public transportation. 
Gender is a less significant factor. Age is only significant for non-business trips. The results reinforce the importance and viability of modeling long-distance travel from less populated regions to large metropolitan areas, and the significant roles of trip distance, built environment, personal attitudes, and sociodemographic factors in how people choose to make these trips for different purposes. Future research should continue to improve these types of long-distance mode choice models by incorporating mode specific travel time and cost, developing more specific attitudinal statements to expand latent factor analysis, and further exploring built environment variables. Improving these models will promote better planning, engineering, operations, and infrastructure investment decisions in many regions and communities across the United States which have not yet been well studied, possibly impacting levels of service.
49

Fatores de risco para nascimento pré-termo - uma análise com modelagem de equações estruturais / Risk factors of preterm birth - an analysis with structural equation modeling

Oliveira, Adelaide Alves de 20 October 2014
Introduction: Preterm birth is one of the main factors associated with mortality and morbidity in the perinatal period. Its prevalence is increasing in many locations, and the reasons involve a complex interrelation of factors, including biological, care-related, psychological, social and economic aspects, among many others associated with preterm birth. Objective: This study aims to analyze the effects of variables associated with preterm birth via structural equation modeling (SEM), constructing latent variables for socioeconomic, family and psychosocial vulnerability, maternal conditions and complications. Further objectives are to analyze the various sources from which gestational age (GA) is obtained, in order to generate a latent variable for gestational age, and to compare models with different types of outcome variable (continuous, ordinal, binary and latent). Methods: This study was based on a population-based case-control study of in-hospital live births to mothers residing in the city of Londrina, state of Paraná, conducted between June 2006 and March 2007. Starting from a conceptual model, SEM was used to build models with latent and composite variables, using fit measures. Results: The final model indicated the following direct effects on GA: complications, inadequate prenatal care, multiple gestation, reproductive condition, BMI, alcohol consumption, walking and physical effort. Besides its direct effect, prenatal care was found to act as a mediating variable between the latent variables Socioeconomic Vulnerability, Non-Acceptance of Pregnancy and Family Vulnerability and the outcome GA. Conclusions: Using SEM required revisiting the role of the variables proposed in the original study, which suggests that, to apply the methodology, latent variables should be formulated a priori in the planning phase of a study. Applying SEM made it possible to build latent and composite variables from the original study and to identify direct, indirect and mediating effects on GA. Furthermore, it was possible to build and use the latent variable GA from the recording sources.
50

Multiple Calibrations in Integrative Data Analysis: A Simulation Study and Application to Multidimensional Family Therapy

Hall, Kristin Wynn 01 January 2013
A recent advancement in statistical methodology, Integrative Data Analysis (IDA; Curran & Hussong, 2009), has led researchers to employ a calibration technique so as not to violate an independence assumption. This technique uses a randomly selected subset of the whole data set with a simplified correlational structure, called a calibration, in a preliminary stage of analysis. However, a single calibration estimator suffers from instability, low precision and loss of power. To overcome this limitation, a multiple calibration (MC; Greenbaum et al., 2013; Wang et al., 2013) approach has been developed to produce better estimators while still removing a level of dependency in the data so as not to violate the independence assumption. The MC method is conceptually similar to multiple imputation (MI; Rubin, 1987; Schafer, 1997), so MI estimators were borrowed for comparison. A simulation study was conducted to compare the MC and MI estimators, as well as to evaluate the operating characteristics of the methods in a cross-classified design of data characteristics. The estimators were tested in the context of assessing change over time in a longitudinal data set. Multiple calibrations, each consisting of a single measurement occasion per subject, were drawn from a repeated measures data set, analyzed separately, and then combined by the rules set forth by each method to produce the final results. The data characteristics investigated were effect size, sample size, and the number of repeated measures per subject. Additionally, a real data application of an MC approach in an IDA framework was conducted on data from three completed, randomized controlled trials studying the treatment effects of Multidimensional Family Therapy (MDFT; Liddle et al., 2002) on substance use trajectories for adolescents at a one-year follow-up. The simulation study provided empirical evidence of how the MC method performs, as well as how it compares to the MI method, in a total of 27 hypothetical scenarios. There were strong asymptotic tendencies for the bias, standard error, mean square error and relative efficiency of an MC estimator to approach those of the whole-set estimators as the number of calibrations approached 100. The MI combination rules proved inappropriate to borrow for the MC case because the standard error formulas were too conservative and performance with respect to power was not robust. As a general suggestion, 5 calibrations are sufficient to produce an estimator with about half the bias of a single calibration estimator and at least some indication of significance, while 20 calibrations are ideal. After 20 calibrations, the contribution of an additional calibration to the combined estimator greatly diminished. The MDFT application demonstrated a successful implementation of a 5-calibration approach in an IDA on real data, as well as the risk of missing treatment effects when analysis is limited to a single calibration's results. Additionally, results from the application provided evidence that MDFT interventions reduced the trajectories of substance use involvement at a one-year follow-up to a greater extent than any of the active control treatment groups, overall and across all gender and ethnicity subgroups. This paper will aid researchers interested in employing an MC approach in an IDA framework or whenever a level of dependency in a data set needs to be removed for an independence assumption to hold.
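The draw, analyze and combine procedure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's estimator: the analysis model is a simple least-squares slope of the outcome on time, the calibrations are combined by averaging the point estimates only (no combined standard error is attempted, since the abstract reports that the borrowed MI rules were unsuitable), and the data, function names and parameter values are all hypothetical.

```python
import random
from statistics import mean


def ols_slope(points):
    """Least-squares slope of y on t for a list of (t, y) pairs."""
    t_bar = mean(t for t, _ in points)
    y_bar = mean(y for _, y in points)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in points)
    sxx = sum((t - t_bar) ** 2 for t, _ in points)
    return sxy / sxx


def multiple_calibration_slope(data, n_calibrations, rng):
    """Average the slope over calibrations of one random occasion per subject.

    data maps each subject to a list of (time, outcome) occasions. Each
    calibration keeps a single occasion per subject, so the observations
    within a calibration are independent across subjects.
    """
    slopes = []
    for _ in range(n_calibrations):
        calibration = [rng.choice(occasions) for occasions in data.values()]
        slopes.append(ols_slope(calibration))
    return mean(slopes), slopes


# Synthetic repeated measures: 200 subjects, 4 occasions each, with a
# subject-specific intercept and an assumed true slope of -0.5.
rng = random.Random(0)
data = {}
for subject in range(200):
    intercept = rng.gauss(10.0, 2.0)
    data[subject] = [
        (t, intercept - 0.5 * t + rng.gauss(0.0, 1.0)) for t in range(4)
    ]

combined, slopes = multiple_calibration_slope(data, n_calibrations=20, rng=rng)
print("single calibration:", round(slopes[0], 3))
print("20 calibrations combined:", round(combined, 3))
```

Comparing the first slope with the combined value gives a rough feel for the instability of a single calibration that the abstract describes.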
