  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

An Investigation of Methods for Missing Data in Hierarchical Models for Discrete Data

Ahmed, Muhamad Rashid January 2011
Hierarchical models are applicable to modeling data from complex surveys or longitudinal data when a clustered or multistage sample design is employed. The focus of this thesis is to investigate inference for discrete hierarchical models in the presence of missing data. The thesis is divided into two parts. In the first part, methods are developed to analyze discrete and ordinal response data from hierarchical longitudinal studies. Several approximation methods have been developed to estimate the parameters for the fixed and random effects in the context of generalized linear models. The thesis focuses on two likelihood-based estimation procedures, the pseudo likelihood (PL) method and the adaptive Gaussian quadrature (AGQ) method. The simulation results suggest that AGQ is preferable to PL when the goal is to estimate the variance of the random intercept in a complex hierarchical model: AGQ yields smaller bias in that estimate and permits greater flexibility in accommodating user-defined likelihood functions. In the second part, simulated data are used to develop a method for modeling longitudinal binary data when non-response depends on unobserved responses. This simulation study modeled three-level discrete hierarchical data with 30% and 40% missing data using a missing not at random (MNAR) missing-data mechanism. It focused on a monotone missing-data pattern. The imputation methods used in this thesis are: complete case analysis (CCA), last observation carried forward (LOCF), available case missing value (ACMVPM) restriction, complete case missing value (CCMVPM) restriction, neighboring case missing value (NCMVPM) restriction, selection model with predictive mean matching method (SMPM), and Bayesian pattern mixture model. All three restriction methods and the selection model used the predictive mean matching method to impute missing data. Multiple imputation is used to impute the missing values.
The m imputed values for each missing observation produce m complete datasets. Each dataset is analyzed and the parameters are estimated. The results from the m analyses are then combined using the method of Rubin (1987), and inferences are made from these results. Our results suggest that restriction methods provide results that are superior to those of other methods. The selection model provides smaller biases than the LOCF method, but as the proportion of missing data increases, the selection model is no better than LOCF. Among the three restriction methods, the ACMVPM method performs best. The proposed method provides an alternative to standard selection and pattern-mixture modeling frameworks when data are not missing at random. This method is applied to data from the third Waterloo Smoking Project, a seven-year smoking prevention study having substantial non-response due to loss to follow-up.
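The pooling step mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard combining rules usually attributed to Rubin (1987), not the thesis's own code; the function name `pool_rubin` and the input numbers are hypothetical.

```python
import math


def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances from m imputed datasets.

    Hypothetical helper: returns (pooled estimate, total variance, pooled SE).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    # Pooled point estimate: the average of the m estimates.
    q_bar = sum(estimates) / m
    # Within-imputation variance: the average of the m variance estimates.
    w_bar = sum(variances) / m
    # Between-imputation variance: sample variance of the m estimates.
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Total variance adds a finite-m correction to the between part.
    t = w_bar + (1 + 1 / m) * b
    return q_bar, t, math.sqrt(t)


# Made-up estimates from m = 3 imputed datasets, for illustration only.
pooled, total_var, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var, pooled_se)
```

The total variance is larger than the average within-imputation variance whenever the m estimates disagree, which is how the extra uncertainty due to missing data enters the inference.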
3

The impact of ignoring a level of nesting structure in multilevel growth mixture model: a Monte Carlo study

Chen, Qi 2008 August 1900
The number of longitudinal studies has increased steadily in various social science disciplines over the last decade. Growth Mixture Modeling (GMM) has emerged among the new approaches for analyzing longitudinal data. It can be viewed as a combination of Hierarchical Linear Modeling, Latent Growth Curve Modeling and Finite Mixture Modeling. The combination of both continuous and categorical latent variables makes GMM a flexible analysis procedure. However, when researchers analyze their data using GMM, some may assume that the units are independent of each other even though this may not always be the case. The purpose of this dissertation was to examine the impact of ignoring a higher nesting structure in Multilevel Growth Mixture Modeling on the accuracy of classification of individuals and the accuracy of tests of significance (i.e., Type I error rate and statistical power) of the parameter estimates for the model in each subpopulation. Two simulation studies were conducted. In the first study, the impact of misspecifying the multilevel mixture model is investigated by ignoring a level of nesting structure in cross-sectional data. In the second study, longitudinal clustered data (e.g., repeated measures nested within units and units nested within clusters) are analyzed correctly and with a misspecification ignoring the highest level of the nesting structure. Results indicate that ignoring a higher-level nesting structure results in lower classification accuracy, less accurate fixed effect estimates, inflation of lower-level variance estimates, and less accurate standard error estimates, the last of which in turn affects the accuracy of tests of significance for the fixed effects. The magnitude of the intra-class correlation (ICC) coefficient has a substantial impact when a higher-level nesting structure is ignored; the higher the ICC, the more variance at the highest level is ignored, and the worse the performance of the model.
The implication for applied researchers is that it is important to model the multilevel data structure in (growth) mixture modeling. In addition, researchers should be cautious in interpreting their results if ignoring a higher level nesting structure is inevitable. Limitations concerning appropriate use of latent class analysis in growth modeling include unknown effects of incorrect estimation of the number of latent classes, non-normal distribution effects, and different growth patterns within-group and between-group.
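To make the role of the ICC concrete, here is a minimal sketch of the usual two-component definition, the share of total variance that sits at the cluster level. The function name and the variance values are hypothetical, and the dissertation's multilevel models have more components than this simplified version.

```python
def intraclass_correlation(var_between, var_within):
    """Share of total variance attributable to the higher (cluster) level."""
    if var_between < 0 or var_within < 0:
        raise ValueError("variance components must be non-negative")
    total = var_between + var_within
    if total == 0:
        raise ValueError("total variance must be positive")
    return var_between / total


# Illustrative values only: as the between-cluster variance grows,
# a model that drops the cluster level ignores a larger share of variance.
for var_between in (0.1, 0.5, 1.0):
    print(var_between, intraclass_correlation(var_between, 1.0))
```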
4

Extension of the cross-classified multiple membership growth curve model for longitudinal data

Li, Jie, active 2013 05 December 2013
Student mobility is a common phenomenon in longitudinal data in educational research. The characteristics of longitudinal education data create a problem for the conventional multilevel model. Grady and Beretvas (2010) introduced a cross-classified multiple membership growth curve (CCMM-GCM) model to handle student mobility over time by capturing the complex higher-level clustering structure in the data. The CCMM-GCM model has some limitations. By creating dummy-coded indicators for each measurement occasion, the new model can improve accuracy and provides an easier and more flexible structure at the higher level. This study provides some support that the new model fits a dataset better than the CCMM-GCM model.
5

Longitudinal multilevel models analyzing the trends of land use effects on non-driving travel choice

Bai, Xiao, active 2013 22 April 2014
Land use and transportation researchers have conducted numerous studies of land use effects on travel mode choice and have probed for effective policies to reduce driving, since less driving and more non-driving are widely recognized as more sustainable travel behaviors that help resolve many environmental, energy and social equity issues. However, most previous studies rely on methodologies developed for cross-sectional data; only limited attention has been given explicitly to statistical techniques for longitudinal design and analysis. Using neighborhood-level land use data and person-level travel mode choice data for 1997 and 2006 in the city of Austin, this paper attempts to establish and compare three distinct modeling approaches to analyze the trends of land use effects on people’s choice of non-driving travel modes. The three modeling approaches are: a comparison approach with two cross-sectional multilevel logit models using single-year data, a pooling approach that builds one multilevel model with two-year data, and a longitudinal multilevel model. Empirical modeling results indicate that the longitudinal multilevel model is the most reasonable model for analyzing longitudinal and multilevel datasets, since it is capable of estimating both time-invariant and time-variant land use effects, and internalizes time-variant random effects. The other two approaches may have several shortcomings. For example, the comparison approach fails to distinguish the time-variant and time-invariant effects, while the pooling model may lead to underestimated standard errors and inflated t-statistics, and thus overstate the significance of variables.
6

A Multilevel Analysis of the Contribution of Individual, Socioeconomic and Geographical Factors on Kindergarten Children’s Developmental Health: A Saskatchewan Province-Wide Study

2014 March 1900
In the current child public health literature, a growing number of studies have been dedicated to early childhood development, with a focus on child developmental health measured via the teacher-completed Early Development Instrument (EDI). Using multilevel modeling as the optimal statistical method to analyze hierarchical EDI data, this study determines the strength and significance of predictors of children’s five EDI outcomes, vulnerability, and multiple vulnerability by taking into account the hierarchy present in its design. In addition, this study conducts an extensive epidemiological review of the risk factors associated with a child’s developmental health at each level of the hierarchy, at cross-levels of the hierarchy, and their variations across different levels of the hierarchy. This cross-sectional study considered 9045 Saskatchewan children who were ages 4-8 years in the 2008-2009 school year. Individual child characteristics, EDI domains, and vulnerability data were collected by the Ministry of Education teachers in the provincial 2008 EDI project; neighborhood contextual Census data were compiled by SPHERU staff at the University of Saskatchewan. Multilevel linear and logistic models were used to analyze the data. According to the results, individual characteristics, such as being Aboriginal, an ESL learner, male, and being absent from school; neighborhood characteristics such as income inequality; and geographical characteristics such as living in a large city have negative effects on EDI scores and increase the odds of vulnerability. Compounding effects of Aboriginal-special skills, large city-Aboriginal, and large city-neighborhood median income were positive on the above outcomes, with either considerable significance or considerable strength, while those of neighborhood income inequality-Aboriginal and large city-neighborhood income inequality were negative, with notable significance and strength.
Furthermore, neighborhood contextual variables contribute to a considerable proportion of health outcome variations, and the results associated with neighborhood income inequality give further evidence for the income inequality hypothesis. The findings of this study suggest that provincial child public health policy makers give greater attention to Aboriginal children, children with ESL status, children living in neighborhoods with high income inequality, and children from Regina.
7

Essays on well-being during crisis in Europe

Pierewan, Adi Cilik January 2014
The claim that economic crisis matters for well-being seems intuitive; supporting evidence, however, remains elusive. The present study aims to examine the individual and contextual determinants of well-being across regions in Europe during the 2007-2008 economic crisis. This study contributes to the existing research on the determinants of well-being in three ways. First, while most studies explain the determinants of well-being in the context of non-crisis, this study examines the determinants during a period of crisis. Second, while most research on well-being focuses on cross-national comparisons of well-being, this study investigates variations at both the regional and national levels. Third, while most studies use either individual or aggregate analyses to examine the determinants of well-being, this study uses multilevel models. This study uses datasets that combine individual, regional and country level data. Individual data is taken from the 2008 European Values Study (EVS) and the 2004-2010 European Social Survey (ESS). Regional level data comes from Eurostat and Euroboundarymaps, while country level data comes from the Inglehart Index, UNU-WIDER and Esping-Andersen categorisation on welfare states. To analyse the data, this study uses various multilevel models including multivariate multilevel model, multilevel simultaneous equations model and spatial dependence multilevel model. The main findings show that during the crisis under consideration, well-being is associated not only with individual determinants, but also with regional and national determinants. Results suggest that happiness and health are positively correlated at individual, regional and national levels. In terms of social capital, this study shows the reciprocal relationship between association membership and trust. Frequent Internet use at the time of crisis is positively associated with well-being. 
Finally, the findings suggest that, by means of unobserved factors, well-being in a region is spatially correlated with well-being in neighbouring regions.
8

Novel regression models for discrete response

Peluso, Alina January 2017
In a regression context, the aim is to analyse a response variable of interest conditional on a set of covariates. In many applications the response variable is discrete. Examples include the event of surviving a heart attack, the number of hospitalisation days, the number of times that individuals benefit from a health service, and so on. This thesis advances the methodology and the application of regression models with discrete response. First, we present a difference-in-differences approach to model a binary response in a health policy evaluation framework. In particular, generalized linear mixed methods are employed to model multiple dependent outcomes in order to quantify the effect of an adopted pay-for-performance program while accounting for the heterogeneity of the data at the multiple nested levels. The results show that the policy had a positive effect on the hospitals' quality in terms of those outcomes that can be more influenced by managerial activity. Next, we focus on regression models for count response variables. In a parametric framework, Poisson regression is the simplest model for count data, though it is often found inadequate in real applications, particularly in the presence of excessive zeros and in the case of dispersion, i.e. when the conditional mean differs from the conditional variance. Negative Binomial regression is the standard model for over-dispersed data, but it fails in the presence of under-dispersion. Poisson-Inverse Gaussian regression can be used in the case of over-dispersed data, Generalised-Poisson regression can be employed in the case of under-dispersed data, and Conway-Maxwell Poisson regression can be employed in both cases of over- or under-dispersed data, though the interpretation of these models is not straightforward and they are often found computationally demanding.
While Jittering is the default non-parametric approach for count data, inference has to be made for each individual quantile, separate quantiles may cross, and the underlying uniform random sampling can generate instability in the estimation. These features motivate the development of a novel parametric regression model for counts via a Discrete Weibull distribution. This distribution is able to adapt to different types of dispersion relative to Poisson, and it also has the advantage of having a closed-form expression for the quantiles. As well as the standard regression model, generalized linear mixed models and generalized additive models are presented via this distribution. Simulated and real data applications with different types of dispersion show a good performance of Discrete Weibull-based regression models compared with existing regression approaches for count data.
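The closed-form quantile mentioned in the abstract can be illustrated with a short sketch. It assumes the type I discrete Weibull parameterization, with cumulative distribution F(x) = 1 - q^((x+1)^beta) for x = 0, 1, 2, ...; the abstract does not state which parameterization the thesis uses, so the parameter names `q` and `beta` and the function names below are assumptions.

```python
import math


def dw_cdf(x, q, beta):
    """P(X <= x) under the assumed type I discrete Weibull, x = 0, 1, 2, ..."""
    return 1.0 - q ** ((x + 1) ** beta)


def dw_quantile(tau, q, beta):
    """Smallest non-negative integer x with P(X <= x) >= tau, in closed form."""
    if not (0.0 < tau < 1.0 and 0.0 < q < 1.0 and beta > 0.0):
        raise ValueError("require 0 < tau < 1, 0 < q < 1 and beta > 0")
    # Solve 1 - q**((x + 1)**beta) >= tau for x and round up to an integer.
    x = math.ceil((math.log(1.0 - tau) / math.log(q)) ** (1.0 / beta) - 1.0)
    return max(x, 0)


# Illustrative parameters; with beta = 1 the distribution is geometric.
x90 = dw_quantile(0.9, q=0.5, beta=1.0)
print(x90, dw_cdf(x90, 0.5, 1.0))
```

Because the quantile is a direct function of the parameters, no separate model has to be fitted for each quantile, which is the contrast with the jittering approach drawn in the abstract.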
9

Analysis of epidemiological data incorporating complex sampling designs (Análise de dados epidemiológicos incorporando planos amostrais complexos)

Battisti, Iara Denise Endruweit January 2008
Introduction: Many epidemiological studies use complex samples for data collection. Complex sampling may have one or more of the following characteristics: stratification, clustering and unequal selection probabilities. If these characteristics are not incorporated into data analysis, point estimates and standard errors are incorrect. Greater understanding of the effect of each characteristic on results should encourage researchers to use adequate methods for data analysis and, therefore, to reach conclusions that are valid for the population from which the sample was drawn. Two major methods are used to deal with complex sampling designs: the complex sample approach and the multilevel model approach. Objective: To describe and compare methods to deal with data from complex sampling designs using the complex sample and multilevel model approaches in two epidemiological studies.
Method: Data retrieved from a house-to-house survey of participants in the 2001 Brazilian Diabetes Detection Campaign (Campanha Nacional de Detecção de Diabetes Mellitus - CNDDM) and collected by stratified cluster sampling in three stages were used to evaluate the impact of the complex sampling design, as well as of each of its characteristics, on the estimates of means, proportions, Poisson regression coefficients and their corresponding standard errors. To compare the complex sample and the multilevel model approaches, linear regression models were fitted with and without sample weights using data from a randomized study that used stratified cluster sampling and investigated the performance of children in an evaluation of knowledge, perceptions and beliefs about breastfeeding, conducted with fifth grade students in Ijuí, Brazil. Results: Mean and proportion point estimates were similar when complex sampling and simple random sampling were compared, but there was a large difference in standard errors. The same was found for estimates of Poisson regression coefficients, although these were less affected by the sampling design. The standard errors of the regression coefficients differed between the two approaches and were greater in the complex sample approach than in the multilevel model approach. Also, unweighted analysis showed that the significance of coefficients in the final models was similar in the two approaches, but there was a difference in one of the coefficients in weighted analysis. Conclusions: Results of the two studies showed that sampling design complexity should be incorporated into data analysis. The research question may be an important factor in the choice between the complex sample approach and the multilevel model approach.
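One common way to see why standard errors change under cluster sampling while point estimates stay similar is the design effect approximation for equal-sized clusters, deff = 1 + (m - 1) * rho, where m is the cluster size and rho the intra-cluster correlation. The sketch below is a textbook simplification with made-up numbers, not the method or the data used in the thesis, and the function names are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Approximate variance inflation from sampling equal-sized clusters."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return 1.0 + (cluster_size - 1) * icc


def design_adjusted_se(se_srs, cluster_size, icc):
    """Scale a simple-random-sampling standard error by sqrt(design effect)."""
    return se_srs * math.sqrt(design_effect(cluster_size, icc))


# Made-up example: clusters of 10 with a modest intra-cluster correlation.
print(design_effect(10, 0.05), design_adjusted_se(1.0, 10, 0.05))
```

With clusters of size 1 the design effect is 1 and the adjusted standard error equals the simple-random-sampling one; larger clusters or stronger within-cluster similarity widen the gap.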
