1 |
An Investigation of Methods for Missing Data in Hierarchical Models for Discrete Data
Ahmed, Muhamad Rashid, January 2011
Hierarchical models are applicable to modeling data from complex
surveys or longitudinal data when a clustered or multistage sample
design is employed. The focus of this thesis is to investigate
inference for discrete hierarchical models in the presence of
missing data. This thesis is divided into two parts: in the first
part, methods are developed to analyze the discrete and ordinal
response data from hierarchical longitudinal studies. Several
approximation methods have been developed to estimate the parameters
for the fixed and random effects in the context of generalized
linear models. The thesis focuses on two likelihood-based
estimation procedures, the pseudo likelihood (PL) method and the adaptive
Gaussian quadrature (AGQ) method.
The simulation results suggest that AGQ
is preferable to PL when the
goal is to estimate the variance of the random intercept in a
complex hierarchical model. AGQ provides smaller biases
for the estimate of the variance of the random intercept.
Furthermore, it permits greater
flexibility in accommodating user-defined likelihood functions.
In the second part, simulated data are used to develop a method for
modeling longitudinal binary data when non-response depends on
unobserved responses. This simulation study modeled three-level
discrete hierarchical data with 30% and 40% missing data
under a missing not at random (MNAR) mechanism. It
focused on a monotone missing-data pattern. The missing-data methods
used in this thesis are: complete case analysis (CCA), last
observation carried forward (LOCF), available case missing value
(ACMVPM) restriction, complete case missing value (CCMVPM)
restriction, neighboring case missing value (NCMVPM) restriction,
selection model with predictive mean matching method (SMPM), and
Bayesian pattern mixture model. All three restriction methods and
the selection model used the predictive mean matching method to
impute missing data. Multiple imputation is used to fill in the
missing values: the m imputed values for each missing observation
produce m complete datasets. Each dataset is analyzed and the
parameters are estimated. The results from the m analyses are then
combined using the method of Rubin (1987), and inferences are
made from these results. Our results suggest that restriction
methods provide results that are superior to those of other methods.
The selection model provides smaller biases than LOCF, but as the
proportion of missing data increases it no longer outperforms
LOCF. Among the three restriction methods, the
ACMVPM method performs best. The proposed method provides an
alternative to standard selection and pattern-mixture modeling
frameworks when data are not missing at random. This method is
applied to data from the third Waterloo Smoking Project, a
seven-year smoking prevention study having substantial non-response
due to loss-to-follow-up.
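The combining step attributed to Rubin (1987) in the abstract can be illustrated with a minimal sketch. It assumes each of the m completed datasets yields one scalar point estimate and its squared standard error; the function name `pool_rubin` and the numbers in the usage note are hypothetical and are not taken from the thesis.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two analyses with matching variances")
    pooled = mean(estimates)       # average of the m point estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    # Total variance: within + (1 + 1/m) * between
    total = within + (1 + 1 / m) * between
    return pooled, total
```

For example, `pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])` pools three hypothetical analyses; the square root of the returned total variance would serve as the pooled standard error.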
|
3 |
Longitudinal multilevel models analyzing the trends of land use effects on non-driving travel choice
Bai, Xiao, active 2013, 22 April 2014
Land use and transportation researchers have conducted numerous studies of land use effects on travel mode choice and have probed for effective policies to reduce driving, since less driving and more non-driving travel are widely recognized as more sustainable behaviors that help resolve many environmental, energy and social equity issues. However, most previous studies rely on methodologies developed for cross-sectional data; only limited attention has been given explicitly to statistical techniques for longitudinal design and analysis. Using neighborhood-level land use data and person-level travel mode choice data from 1997 and 2006 in the city of Austin, this paper attempts to establish and compare three distinct modeling approaches for analyzing the trends of land use effects on people’s choice of non-driving travel modes. The three modeling approaches are: a comparison approach with two cross-sectional multilevel logit models, each using single-year data; a pooling approach that builds one multilevel model with the two years of data; and a longitudinal multilevel model. Empirical modeling results indicate that the longitudinal multilevel model is the most reasonable model for analyzing longitudinal and multilevel datasets, since it is capable of estimating both time-invariant and time-variant land use effects and internalizes time-variant random effects. The other two approaches may have several shortcomings. For example, the comparison approach fails to distinguish time-variant from time-invariant effects, while the pooling model may lead to underestimated standard errors and inflated t-statistics, and thus overestimate the significance of variables.
|
4 |
A Multilevel Analysis of the Contribution of Individual, Socioeconomic and Geographical Factors on Kindergarten Children’s Developmental Health: A Saskatchewan Province-Wide Study
2014 March 1900
In the current literature on child public health, a growing number of studies have been dedicated to early childhood development, with a focus on child developmental health measured via the teacher-completed Early Development Instrument (EDI). Using multilevel modeling as the optimal statistical method for analyzing hierarchical EDI data, this study determines the strength and significance of predictors of children’s five EDI outcomes, vulnerability, and multiple vulnerability by taking into account the hierarchy present in its design. In addition, this study conducts an extensive epidemiological review of the risk factors associated with a child’s developmental health at each level of the hierarchy, across levels of the hierarchy, and their variations across different levels of the hierarchy. This cross-sectional study considered 9045 Saskatchewan children who were aged 4-8 years in the 2008-2009 school year. Individual child characteristics, EDI domains, and vulnerability data were collected by Ministry of Education teachers in the provincial 2008 EDI project; neighborhood contextual Census data were compiled by SPHERU staff at the University of Saskatchewan. Multilevel linear and logistic models were used to analyze the data. According to the results, individual characteristics such as being Aboriginal, an ESL learner, male, and absent from school; neighborhood characteristics such as income inequality; and geographical characteristics such as living in a large city have negative effects on EDI scores and increase the odds of vulnerability. Compounding effects of Aboriginal-special skills, large city-Aboriginal, and large city-neighborhood median income were positive on the above outcomes, with either considerable significance or considerable strength, while those of neighborhood income inequality-Aboriginal and large city-neighborhood income inequality were negative, with notable significance and strength.
Furthermore, neighborhood contextual variables account for a considerable proportion of the variation in health outcomes, and the results associated with neighborhood income inequality give further evidence for the income inequality hypothesis. The findings of this study suggest that provincial child public health policy makers should give extended attention to Aboriginal children, children with ESL status, children living in neighborhoods with high income inequality, and children from Regina.
|
5 |
Essays on well-being during crisis in Europe
Pierewan, Adi Cilik, January 2014
The claim that economic crisis matters for well-being seems intuitive; supporting evidence, however, remains elusive. The present study aims to examine the individual and contextual determinants of well-being across regions in Europe during the 2007-2008 economic crisis. This study contributes to the existing research on the determinants of well-being in three ways. First, while most studies explain the determinants of well-being in the context of non-crisis, this study examines the determinants during a period of crisis. Second, while most research on well-being focuses on cross-national comparisons of well-being, this study investigates variations at both the regional and national levels. Third, while most studies use either individual or aggregate analyses to examine the determinants of well-being, this study uses multilevel models. This study uses datasets that combine individual, regional and country level data. Individual data is taken from the 2008 European Values Study (EVS) and the 2004-2010 European Social Survey (ESS). Regional level data comes from Eurostat and Euroboundarymaps, while country level data comes from the Inglehart Index, UNU-WIDER and Esping-Andersen categorisation on welfare states. To analyse the data, this study uses various multilevel models including multivariate multilevel model, multilevel simultaneous equations model and spatial dependence multilevel model. The main findings show that during the crisis under consideration, well-being is associated not only with individual determinants, but also with regional and national determinants. Results suggest that happiness and health are positively correlated at individual, regional and national levels. In terms of social capital, this study shows the reciprocal relationship between association membership and trust. Frequent Internet use at the time of crisis is positively associated with well-being. 
Finally, the findings suggest that, through unobserved factors, well-being in a region is spatially correlated with well-being in neighbouring regions.
|
6 |
Novel regression models for discrete response
Peluso, Alina, January 2017
In a regression context, the aim is to analyse a response variable of interest conditional on a set of covariates. In many applications the response variable is discrete. Examples include the event of surviving a heart attack, the number of hospitalisation days, the number of times that individuals benefit from a health service, and so on. This thesis advances the methodology and the application of regression models with discrete response. First, we present a difference-in-differences approach to model a binary response in a health policy evaluation framework. In particular, generalized linear mixed methods are employed to model multiple dependent outcomes in order to quantify the effect of an adopted pay-for-performance program while accounting for the heterogeneity of the data at the multiple nested levels. The results show that the policy had a positive effect on the hospitals' quality in terms of those outcomes that can be more influenced by managerial activity. Next, we focus on regression models for count response variables. In a parametric framework, Poisson regression is the simplest model for count data, though it is often found inadequate in real applications, particularly in the presence of excessive zeros and in the case of dispersion, i.e. when the conditional mean differs from the conditional variance. Negative Binomial regression is the standard model for over-dispersed data, but it fails in the presence of under-dispersion. Poisson-Inverse Gaussian regression can be used in the case of over-dispersed data, Generalised-Poisson regression can be employed in the case of under-dispersed data, and Conway-Maxwell Poisson regression can be employed in both cases of over- or under-dispersed data, though the interpretation of these models is not straightforward and they are often found to be computationally demanding.
While Jittering is the default non-parametric approach for count data, inference has to be made for each individual quantile, separate quantiles may cross, and the underlying uniform random sampling can generate instability in the estimation. These features motivate the development of a novel parametric regression model for counts via a Discrete Weibull distribution. This distribution is able to adapt to different types of dispersion relative to Poisson, and it also has the advantage of having a closed-form expression for the quantiles. As well as the standard regression model, generalized linear mixed models and generalized additive models are presented via this distribution. Simulated and real data applications with different types of dispersion show a good performance of Discrete Weibull-based regression models compared with existing regression approaches for count data.
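The closed-form quantile mentioned in the abstract can be sketched as follows. The abstract does not state a parameterization, so this sketch assumes the type I discrete Weibull distribution with P(X = x) = q^(x^beta) - q^((x+1)^beta) for x = 0, 1, 2, ..., with 0 < q < 1 and beta > 0; the function names are hypothetical.

```python
import math


def dw_pmf(x, q, beta):
    """P(X = x) under the assumed type I discrete Weibull parameterization."""
    return q ** (x ** beta) - q ** ((x + 1) ** beta)


def dw_cdf(x, q, beta):
    """P(X <= x) = 1 - q^((x + 1)^beta) for integer x >= 0."""
    if x < 0:
        return 0.0
    return 1.0 - q ** ((x + 1) ** beta)


def dw_quantile(tau, q, beta):
    """Smallest non-negative integer x with P(X <= x) >= tau, in closed form."""
    if not (0 < tau < 1 and 0 < q < 1 and beta > 0):
        raise ValueError("require 0 < tau < 1, 0 < q < 1 and beta > 0")
    # Solve 1 - q^((x + 1)^beta) >= tau for x, then round up to an integer.
    raw = (math.log(1 - tau) / math.log(q)) ** (1 / beta) - 1
    return max(0, math.ceil(raw))
```

With beta = 1 the distribution reduces to a geometric distribution, which gives a simple check: `dw_quantile(0.9, 0.5, 1.0)` should agree with the smallest x for which `dw_cdf(x, 0.5, 1.0)` reaches 0.9. Values of tau that fall exactly on a step of the distribution function may be sensitive to floating-point rounding.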
|
7 |
Análise de dados epidemiológicos incorporando planos amostrais complexos [Analysis of epidemiological data incorporating complex sampling designs]
Battisti, Iara Denise Endruweit, January 2008
Introduction: Many epidemiological studies use complex sampling for data collection. Complex sampling may have one or more of the following characteristics: stratification, clustering and unequal selection probabilities. If these characteristics are not incorporated into the data analysis, point estimates and standard errors are incorrect. A greater understanding of the effect of each characteristic on the results is therefore needed to encourage researchers to use adequate methods for data analysis and so reach conclusions that are valid for the population from which the sample was drawn. Two major methods are used to deal with complex sampling designs: the complex sample approach and the multilevel model approach. Objective: To describe and compare methods for handling data from complex sampling designs using the complex sample and multilevel model approaches, with data from two epidemiological studies.
Method: Data from the house-to-house survey of participants in the 2001 Brazilian Diabetes Detection Campaign (Campanha Nacional de Detecção de Diabetes Mellitus - CNDDM), collected by stratified cluster sampling in three stages, were used to evaluate the impact of the complex sampling design, as well as of each of its characteristics, on the estimates of means, proportions, Poisson regression coefficients and their corresponding standard errors. To compare the complex sample and the multilevel model approaches, linear regression models were fitted with and without sample weights using data from a randomized study with stratified cluster sampling that investigated the performance of children in an assessment of knowledge, perceptions and beliefs about breastfeeding, conducted with fifth grade students in Ijuí, Brazil. Results: Point estimates of means and proportions were similar when complex sampling and simple random sampling were compared, but there was a large difference in standard errors. The same was found for estimates of Poisson regression coefficients, which were less affected by the sampling design. The standard errors of the regression coefficients differed between the two approaches and were larger under the complex sample approach than under the multilevel model approach. Also, in the unweighted analysis the significance of coefficients in the final model was similar in the two approaches, but in the weighted analysis there was a difference for one of the coefficients. Conclusions: Results of the two studies showed that sampling design complexity should be incorporated into data analysis. The research question may be an important factor in the choice between the complex sample approach and the multilevel model approach.
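To illustrate why standard errors can differ so much between a complex sample and simple random sampling, the sketch below uses Kish's approximate design effect for cluster sampling, deff = 1 + (m - 1) * rho. This formula is a textbook approximation and is not taken from the thesis; it assumes equal cluster sizes m and a known intraclass correlation rho, and the function names and example numbers are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    if cluster_size < 1 or not (0 <= icc <= 1):
        raise ValueError("require cluster_size >= 1 and 0 <= icc <= 1")
    return 1 + (cluster_size - 1) * icc


def cluster_adjusted_se(srs_se, cluster_size, icc):
    """Inflate a simple-random-sampling standard error by sqrt(deff)."""
    return srs_se * math.sqrt(design_effect(cluster_size, icc))
```

For instance, with hypothetical clusters of 20 respondents and an intraclass correlation of 0.05, `design_effect(20, 0.05)` is about 1.95, so a standard error computed as if the data were a simple random sample would be too small by a factor of roughly 1.4, while the point estimate itself could be nearly unchanged.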
|
9 |
Mechanisms Linking Daily Pain and Depressive Symptoms: The Application of Diary Assessment and Bio-Psycho-Social Profiling
January 2018
Despite the strong link between pain and depressive symptoms, the mechanisms by which they are connected in the everyday lives of individuals with chronic pain are not well understood. In addition, previous investigations have tended to ignore biopsychosocial individual difference factors, assuming that all individuals respond to pain-related experiences and affect in the same manner. The present study sought to address these gaps in the existing literature. Two hundred twenty individuals with fibromyalgia completed daily diaries during the morning, afternoon, and evening for 21 days. Findings were generally consistent with the hypotheses. Multilevel structural equation modeling revealed that morning pain and positive and negative affect are uniquely associated with morning negative pain appraisal, which in turn is positively related to pain’s activity interference in the afternoon. Pain’s activity interference was the strongest predictor of evening depressive symptoms. Latent profile analysis using biopsychosocial measures identified three theoretically and clinically important subgroups (i.e., Low Functioning, Normative, and High Functioning groups). Although the daily pain-depressive symptoms link was not significantly moderated by these subgroups, individuals in the High Functioning group reported the lowest levels of average morning pain, negative affect, negative pain appraisal, afternoon pain’s activity interference, and evening depressive symptoms, and the highest levels of average morning positive affect across 21 days relative to the other two groups. The Normative group fared better on all measures than did the Low Functioning group. The findings of the present study suggest the importance of promoting morning positive affect and decreasing negative affect in disconnecting the within-day pain-depressive symptoms link, as well as the potential value of tailoring chronic pain interventions to those individuals who are in the greatest need.
Doctoral Dissertation, Psychology, 2018
|