About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations (NDLTD). Our metadata is collected from universities around the world. If you manage a university, consortium, or country archive and want to be added, details can be found on the NDLTD website.
1

The Associations Between Bisphenol A and Phthalates, and Measures of Adiposity Among Canadians

McCormack, Daniel, January 2016
Bisphenol A (BPA) and phthalates are chemicals found in many consumer products including water bottles, food packaging and cosmetics. Previous research has shown that there is potential for these compounds to contribute to obesity. In this analysis, the Canadian Health Measures Survey was used to investigate possible associations between urinary concentrations of these compounds and measures of adiposity. BPA urine concentrations were found to decrease with age, and significant associations with BMI and waist circumference were found in linear regression in adults. No associations with measures of adiposity were found in logistic regression for adults and significant negative associations were found in children. A similar discrepancy was found for mono-(2-ethyl-5-hydroxyhexyl) phthalate and mono-(2-ethyl-5-oxohexyl) phthalate, which were significantly associated with obesity in adults, but showed several significant negative associations in children. Overall, this analysis showed that it is unlikely that BPA and phthalates are contributing to adiposity in the Canadian population.
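The analysis above rests on survey-weighted regression of adiposity measures on urinary concentrations. As a minimal illustration of the idea only (not the authors' actual CHMS model; the variables, coefficients, and weights below are invented), a weighted least-squares fit can be computed directly from the normal equations:

```python
import numpy as np

def weighted_ols(X, y, w):
    """Survey-weighted least squares: solves (X'WX) beta = X'Wy."""
    Xw = X * w[:, None]                      # multiply each row by its weight
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

# Hypothetical example: BMI regressed on log urinary BPA and age,
# with an intercept column; w stands in for survey weights.
rng = np.random.default_rng(0)
n = 500
log_bpa = rng.normal(0.5, 1.0, n)
age = rng.uniform(20, 70, n)
X = np.column_stack([np.ones(n), log_bpa, age])
y = 22.0 + 0.4 * log_bpa + 0.05 * age        # noiseless toy relationship
w = rng.uniform(0.5, 2.0, n)                 # made-up survey weights
beta = weighted_ols(X, y, w)
```

Because the toy outcome is noiseless, the fit recovers the invented coefficients exactly; with real CHMS data the weights would come from the survey design, not a uniform draw.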
2

Statistical modeling of longitudinal survey data with binary outcomes

Ghosh, Sunita 20 December 2007
Data obtained from longitudinal surveys using complex multi-stage sampling designs contain cross-sectional dependencies among units, caused by inherent hierarchies in the data, and within-subject correlation arising from repeated measurements. Statistical methods for analyzing such data should account for stratification, clustering, and unequal probability of selection, as well as for the within-subject correlation due to repeated measurements.

The complex multi-stage design approach has been used in the longitudinal National Population Health Survey (NPHS). This ongoing survey collects information on health determinants and outcomes in a sample of the general Canadian population.

This dissertation compares the model-based and design-based approaches used to determine the risk factors of asthma prevalence in the Canadian female population of the NPHS (marginal model). Weighted, unweighted and robust statistical methods were used to examine the risk factors of the incidence of asthma (event history analysis) and of recurrent asthma episodes (recurrent survival analysis). Missing data analysis was used to study the bias associated with incomplete data. To determine the risk factors of asthma prevalence, the Generalized Estimating Equations (GEE) approach was used for marginal modeling (model-based approach), followed by Taylor linearization and bootstrap estimation of standard errors (design-based approach). The incidence of asthma (event history analysis) was estimated using weighted, unweighted and robust methods. Recurrent event history analysis was conducted using the Andersen and Gill, Wei, Lin and Weissfeld (WLW), and Prentice, Williams and Peterson (PWP) approaches. To assess the presence of bias associated with missing data, the weighted GEE and pattern-mixture models were used.

The prevalence of asthma in the Canadian female population was 6.9% (6.1-7.7) at the end of Cycle 5. When comparing model-based and design-based approaches for asthma prevalence, the design-based method provided unbiased estimates of the standard errors. The overall incidence of asthma in this population, excluding those with asthma at baseline, was 10.5/1000/year (9.2-12.1). For the event history analysis, the robust method provided the most stable estimates and standard errors. For recurrent event history, the WLW method provided stable standard error estimates. Finally, for the missing data approach, the pattern-mixture model produced the most stable standard errors.

To conclude, design-based approaches should be preferred over model-based approaches for analyzing complex survey data, as the former provide the most unbiased parameter estimates and standard errors.
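The abstract names bootstrap estimation of standard errors as the design-based approach. A minimal sketch of the idea is to resample whole primary sampling units (clusters) with replacement rather than individual subjects, so the resampling respects the design. All data below are simulated, not NPHS data:

```python
import numpy as np

def weighted_prevalence(y, w):
    """Weighted proportion: sum(w*y) / sum(w)."""
    return np.sum(w * y) / np.sum(w)

def cluster_bootstrap_se(y, w, cluster, n_boot=500, seed=1):
    """Design-based bootstrap: resample whole clusters (PSUs) with
    replacement, recompute the weighted prevalence each replicate,
    and take the standard deviation of the replicates."""
    rng = np.random.default_rng(seed)
    ids = np.unique(cluster)
    stats = []
    for _ in range(n_boot):
        pick = rng.choice(ids, size=len(ids), replace=True)
        idx = np.concatenate([np.where(cluster == c)[0] for c in pick])
        stats.append(weighted_prevalence(y[idx], w[idx]))
    return float(np.std(stats, ddof=1))

# Hypothetical clustered binary outcome (e.g. asthma yes/no) with weights.
rng = np.random.default_rng(2)
cluster = np.repeat(np.arange(30), 20)       # 30 PSUs of 20 subjects each
y = rng.binomial(1, 0.07, size=600)          # roughly 7% prevalence
w = rng.uniform(0.5, 2.0, size=600)
p_hat = weighted_prevalence(y, w)
se = cluster_bootstrap_se(y, w, cluster)
```

Resampling clusters rather than subjects is what makes the variance estimate reflect the multi-stage design; a subject-level bootstrap would understate the standard error when outcomes are correlated within PSUs.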
4

Propensity Score Methods for Estimating Causal Effects from Complex Survey Data

Ashmead, Robert D., January 2014
No description available.
5

Epidemiological Study of Coccidioidomycosis in Greater Tucson, Arizona

Tabor, Joseph Anthony, January 2009
The goal of this dissertation is to characterize the distribution and determinants of coccidioidomycosis in greater Tucson, Arizona, using landscape ecology and complex survey methods to control for environmental factors that affect Coccidioides exposure. Notifiable coccidioidomycosis cases reported to the health department in Arizona have increased dramatically since 1997 and indicate a potential epidemic of unknown causes. Epidemic determination is confounded by concurrent changes in notifiable-disease reporting compliance, misdiagnosis, and changing demographics of susceptible populations. A stratified, two-stage, address-based telephone survey of greater Tucson, Arizona, was conducted in 2002 and 2003. Subjects were recruited from direct-marketing data by census block groups and landscape strata, as determined using a geographic information system (GIS), and were interviewed about potential risk factors. Address-level state health department notifiable-disease surveillance data were compared with self-reported survey data to estimate the true disease frequency.

Comparing state surveillance data with the survey data, no coccidioidomycosis epidemic was detectable from 1992 to 2006 after adjusting surveillance data for reporting compliance. State health department surveillance reported only 20% of the probable reportable cases in 2001. Utilizing survey data and geographic coding, spatial and temporal disease frequency was found to be highly variable at the census block-group scale, indicating that localized soil-disturbance events are a major group-level risk factor. Poststratification by 2000 census demographic data adjusted for selection bias into the survey and for response rate. Hispanics showed an odds ratio of self-reporting a coccidioidomycosis diagnosis similar to that of non-Hispanic Whites when other risk factors were controlled for. Cigarette smoking in the home, and having a home located in the low-Hispanic foothills or low-Hispanic riparian strata, were associated with elevated odds ratios for coccidioidomycosis. Sample stratification by landscape and demographics controlled for differential classification of susceptibility and exposures between strata.

Clustered, address-based telephone surveys provide a feasible and valid method to recruit populations from address-based lists, using a GIS to design the survey and population-survey statistical methods for the analysis. Notifiable coccidioidomycosis case surveillance can be improved by including reporting compliance in the analysis. Pathogen exposure and host susceptibility are important, predictable group-level determinants of coccidioidomycosis that were controlled for by stratified sampling using a landscape ecology approach.
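Poststratification, as used in the study above, rescales design weights so that the weighted count in each stratum matches a known census total. A toy sketch of the adjustment (the stratum labels and totals below are invented, not the Tucson survey's):

```python
import numpy as np

def poststratify(base_w, stratum, census_totals):
    """Rescale base weights so that each stratum's weighted count
    matches its known census total."""
    w = base_w.astype(float).copy()
    for s, total in census_totals.items():
        mask = (stratum == s)
        w[mask] *= total / base_w[mask].sum()   # one multiplicative factor per stratum
    return w

# Hypothetical: two demographic strata with census totals 8,000 and 12,000.
stratum = np.array(["A"] * 60 + ["B"] * 40)
base_w = np.full(100, 100.0)                    # naive design weights
census = {"A": 8000.0, "B": 12000.0}
w_ps = poststratify(base_w, stratum, census)
```

After adjustment the weighted totals reproduce the census margins exactly, which is what corrects for differential selection and response across strata.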
6

On estimating variances for Gini coefficients with complex surveys: theory and application

Hoque, Ahmed, 29 September 2016
Obtaining variances for the plug-in estimator of the Gini coefficient for inequality has preoccupied researchers for decades with the proposed analytic formulae often being regarded as being too cumbersome to apply, as well as usually based on the assumption of an iid structure. We examine several variance estimation techniques for a Gini coefficient estimator obtained from a complex survey, a sampling design often used to obtain sample data in inequality studies. In the first part of the dissertation, we prove that Bhattacharya’s (2007) asymptotic variance estimator when data arise from a complex survey is equivalent to an asymptotic variance estimator derived by Binder and Kovačević (1995) nearly twenty years earlier. In addition, to aid applied researchers, we also show how auxiliary regressions can be used to generate the plug-in Gini estimator and its asymptotic variance, irrespective of the sampling design. In the second part of the dissertation, using Monte Carlo (MC) simulations with 36 data generating processes under the beta, lognormal, chi-square, and the Pareto distributional assumptions with sample data obtained under various complex survey designs, we explore two finite sample properties of the Gini coefficient estimator: bias of the estimator and empirical coverage probabilities of interval estimators for the Gini coefficient. We find high sensitivity to the number of strata and the underlying distribution of the population data. We compare the performance of two standard normal (SN) approximation interval estimators using the asymptotic variance estimators of Binder and Kovačević (1995) and Bhattacharya (2007), another SN approximation interval estimator using a traditional bootstrap variance estimator, and a standard MC bootstrap percentile interval estimator under a complex survey design. 
With few exceptions, namely with small samples and/or highly skewed distributions of the underlying population data, where the bootstrap methods work relatively better, the SN approximation interval estimators using asymptotic variances perform quite well. Finally, health data on the body mass index and hemoglobin levels of Bangladeshi women and children, respectively, are used as illustrations. Inequality analysis of these two important indicators provides a better understanding of the health status of women and children. Our empirical results show that statistical inferences regarding inequality in these well-being variables, measured by the Gini coefficients, based on Binder and Kovačević's and Bhattacharya's asymptotic variance estimators, give equivalent outcomes. Although the bootstrap approach often generates slightly smaller variance estimates in small samples, the hypothesis test results and widths of interval estimates using this method are practically similar to those using the asymptotic variance estimators. Our results are useful, both theoretically and practically, as the asymptotic variance estimators are simpler and require less time to calculate than the bootstrap methods often previously advocated by researchers. These findings suggest that applied researchers can often be comfortable undertaking inferences about the inequality of a well-being variable using the Gini coefficient with asymptotic variance estimators that are not difficult to calculate, irrespective of whether the sample data are obtained under a complex survey or a simple random sample design.
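The plug-in Gini estimator discussed above has a simple closed form (mean absolute difference divided by twice the mean), and a bootstrap standard error can be attached to it. This sketch uses an iid bootstrap for brevity, deliberately ignoring the complex-survey resampling that the dissertation is actually concerned with; the income data are simulated:

```python
import numpy as np

def gini_plugin(x):
    """Plug-in Gini coefficient: mean absolute pairwise difference
    divided by twice the mean."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mad = np.abs(x[:, None] - x[None, :]).sum() / (n * n)
    return mad / (2.0 * x.mean())

def bootstrap_se(x, stat, n_boot=300, seed=3):
    """Simple iid bootstrap standard error of a statistic."""
    rng = np.random.default_rng(seed)
    reps = [stat(rng.choice(x, size=len(x), replace=True))
            for _ in range(n_boot)]
    return float(np.std(reps, ddof=1))

# Simulated lognormal incomes, a common shape in inequality studies.
incomes = np.random.default_rng(4).lognormal(mean=0.0, sigma=0.8, size=400)
g = gini_plugin(incomes)
se = bootstrap_se(incomes, gini_plugin)
```

Two sanity checks follow directly from the formula: a perfectly equal sample has Gini 0, and the two-point sample {0, 1} has Gini exactly 0.5.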
7

Evaluation of Cross-Survey Research Methods for the Estimation of Low-Incidence Populations

Magidin de Kramer, Raquel, January 2016
Thesis advisor: Henry Braun

This study evaluates the accuracy, precision, and stability of three different methods of cross-survey analysis in order to determine their suitability for estimating the proportions of low-incidence populations. Population parameters of size and demographic distribution are necessary for planning and policy development, and their estimation for low-incidence populations poses a number of methodological challenges. Cross-survey analysis methodologies offer an alternative way to generate useful low-incidence population estimates, not readily available in today's census, without conducting targeted, costly surveys to estimate group size directly. The cross-survey methods evaluated in the study are meta-analysis of complex surveys (MACS), pooled design-based cross-survey analysis (PDCS), and Bayesian multilevel regression with post-stratification (BMRP). The accuracy and precision of these methods were assessed by comparing the estimates of the proportion of the adult Jewish population in Canada generated by each method with benchmark estimates. The stability of the estimates, in turn, was determined by cross-validating estimates obtained from two random stratified subsamples drawn from a large pool of US surveys. The findings indicate that, under the right conditions, cross-survey methods have the potential to produce very accurate and precise estimates of low-incidence populations. The level of accuracy and precision varied depending on the cross-survey method used and on the conditions under which the estimates were produced. The estimates obtained with the PDCS and BMRP methodologies were more accurate than those generated by the MACS approach, with BMRP generating the most accurate estimates overall; the PDCS method generated relatively accurate estimates across all the scenarios included in the study.
The precision of the estimates was found to be related to the number of surveys considered in the analyses. Overall, the findings clearly show that cross-survey analysis methods provide a useful alternative for the estimation of low-incidence populations. More research is needed to fully understand the factors that affect the accuracy and precision of estimates generated by these cross-survey methods.

Thesis (PhD), Boston College, 2016. Lynch School of Education. Discipline: Educational Research, Measurement and Evaluation.
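The simplest member of the cross-survey family can be illustrated by fixed-effect (inverse-variance) pooling of per-survey estimates; this is only a schematic stand-in for the MACS/PDCS/BMRP machinery evaluated in the study, and the subgroup proportions and standard errors below are invented:

```python
import numpy as np

def pool_estimates(ests, ses):
    """Fixed-effect (inverse-variance) pooling of per-survey estimates:
    weight each survey's estimate by the inverse of its variance."""
    ests = np.asarray(ests, dtype=float)
    var = np.asarray(ses, dtype=float) ** 2
    w = 1.0 / var
    pooled = np.sum(w * ests) / np.sum(w)
    pooled_se = np.sqrt(1.0 / np.sum(w))    # precision adds across surveys
    return pooled, pooled_se

# Hypothetical proportions of a small subgroup from three surveys.
est, se = pool_estimates([0.011, 0.014, 0.009], [0.002, 0.003, 0.004])
```

The pooled standard error is always smaller than the smallest per-survey standard error, which is exactly why pooling helps for low-incidence groups no single survey can pin down.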
8

Statistical models for estimating the intake of nutrients and foods from complex survey data

Pell, David Andrew, January 2019
Background: The consequences of poor nutrition are well known and of wide concern. Governments and public health agencies utilise food and diet surveillance data to make decisions that lead to improvements in nutrition. These surveys often utilise complex sample designs for efficient data collection. There are several challenges in the statistical analysis of dietary intake data collected using complex survey designs, which have not been fully addressed by current methods. Firstly, the shape of the distribution of intake can be highly skewed due to the presence of outlier observations and a large proportion of zero observations arising from the inability of the food diary to capture consumption within the period of observation. Secondly, dietary data is subject to variability arising from day-to-day individual variation in food consumption and measurement error, to be accounted for in the estimation procedure for correct inferences. Thirdly, the complex sample design needs to be incorporated into the estimation procedure to allow extrapolation of results into the target population. This thesis aims to develop novel statistical methods to address these challenges, applied to the analysis of iron intake data from the UK National Diet and Nutrition Survey Rolling Programme (NDNS RP) and UK national prescription data of iron deficiency medication. Methods: 1) To assess the nutritional status of particular population groups a two-part model with a generalised gamma (GG) distribution was developed for intakes that show high frequencies of zero observations. The two-part model accommodated the sources of data variation of dietary intake with a random intercept in each component, which could be correlated to allow a correlation between the probability of consuming and the amount consumed. 
2) To identify population groups at risk of low nutrient intake, a linear quantile mixed-effects model was developed to model quantiles of the distribution of intake as a function of explanatory variables. The proposed approach was illustrated by comparing the quantiles of iron intake with Lower Reference Nutrient Intake (LRNI) recommendations using the NDNS RP. This thesis extended the estimation procedures of both the two-part model with GG distribution and the linear quantile mixed-effects model to incorporate the complex sample design in three steps: the likelihood function was multiplied by the sample weights; bootstrap methods were used for variance estimation; and, finally, the variance estimation of the model parameters was stratified by the survey strata. 3) To evaluate the allocation of resources to alleviate nutritional deficiencies, a linear quantile mixed-effects model was used to analyse the distribution of expenditure on iron-deficiency medication across health boards in the UK. Expenditure is likely to depend on the iron status of the region; therefore, for a fair comparison among health boards, iron status was estimated using the method developed in objective 2) and used in the specification of the median amount spent. Each health board is formed by a set of general practices (GPs); therefore, a random intercept was used to induce correlation between expenditures from two GPs in the same health board. Finally, the approaches in objectives 1) and 2) were compared with the traditional approach based on weighted linear regression modelling used in the NDNS RP reports. All analyses were implemented using SAS and R. Results: The two-part model with GG distribution, fitted to the amount of iron consumed from selected episodically consumed foods, showed that females tended to have greater odds of consuming iron from foods but consumed smaller amounts.
As age groups increased, consumption tended to increase relative to the reference group, though the odds of consumption varied. Iron consumption also appeared to depend on National Statistics Socio-economic Classification (NS-SEC) group, with lower social groups consuming less, in general. The quantiles of iron intake estimated using the linear quantile mixed-effects model showed that more than 25% of females aged 11-50y are below the LRNI, and that girls aged 11-18y are the group at highest risk of deficiency in the UK. Predictions of spending on iron medication in the UK based on the linear quantile mixed-effects model showed that areas of higher iron intake had lower spending on treating iron deficiency. In a geographical display of expenditure, Northern Ireland featured the lowest amount spent. Comparing the results from the methods proposed here showed that using the traditional approach based on weighted regression analysis could result in spurious associations. Discussion: This thesis developed novel approaches to the analysis of dietary complex survey data to address three important objectives of diet surveillance, namely the estimation of mean food intake by population groups, the identification of groups at high risk of nutrient deficiency, and the allocation of resources to alleviate nutrient deficiencies. The methods provided models of good fit to dietary data, accounted for the sources of data variability, and extended the estimation procedures to incorporate the complex sample survey design. The use of a GG distribution for modelling intake is an important improvement over existing methods, as it includes many distributions with different shapes and its domain takes non-negative values. The two-part model accommodated the sources of variation of dietary intake with a random intercept in each component, which could be correlated to allow a correlation between the probability of consuming and the amount consumed.
This also improves existing approaches that assume a zero correlation. The linear quantile mixed-effects model utilises the asymmetric Laplace distribution, which can also accommodate many different distributional shapes, and its likelihood-based estimation is robust to model misspecification. This method is an important improvement over existing methods used in nutritional research, as it explicitly models the quantiles in terms of explanatory variables using a novel quantile regression model with random effects. The application of these models to UK national data confirmed the association of poorer diets with lower social class, identified females aged 11-50y as a group at high risk of iron deficiency, and highlighted Northern Ireland as the region with the lowest expenditure on iron prescriptions.
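The two-part structure described above can be sketched compactly: one part models the probability of any consumption, the other the distribution of positive amounts, and the overall mean is their product. In this sketch an ordinary gamma distribution stands in for the generalised gamma, and the random effects and survey weights are omitted, so it is a deliberate simplification of the thesis's model; the intake data are simulated:

```python
import numpy as np
from scipy import stats

def two_part_fit(y):
    """Two-part model for intakes with many exact zeros:
    part 1 estimates P(intake > 0); part 2 fits a gamma to the
    positive amounts (a stand-in for the generalised gamma).
    Returns (p_positive, mean_given_positive, overall_mean)."""
    y = np.asarray(y, dtype=float)
    p_pos = float(np.mean(y > 0))
    pos = y[y > 0]
    shape, loc, scale = stats.gamma.fit(pos, floc=0)  # fix location at zero
    mean_pos = shape * scale                          # gamma mean = shape*scale
    return p_pos, mean_pos, p_pos * mean_pos

# Simulated intakes: a 70% chance of consuming, gamma(2, 3) amounts.
rng = np.random.default_rng(5)
consume = rng.binomial(1, 0.7, 1000)
amount = rng.gamma(shape=2.0, scale=3.0, size=1000)
y = consume * amount
p_pos, mean_pos, overall = two_part_fit(y)
```

The full model in the thesis additionally puts a random intercept in each part and lets the two intercepts correlate, so that people more likely to consume can also consume systematically more (or less).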
9

Comparing Model-based and Design-based Structural Equation Modeling Approaches in Analyzing Complex Survey Data

Wu, Jiun-Yu, August 2010
Conventional statistical methods that assume data sampled under simple random sampling are inadequate for complex survey data with a multilevel structure and non-independent observations. In the structural equation modeling (SEM) framework, a researcher can either use ad-hoc robust sandwich standard-error estimators to correct the standard-error estimates (the design-based approach) or perform multilevel analysis to model the multilevel data structure (the model-based approach) when analyzing dependent data. In a cross-sectional setting, the first study examines the differences between design-based single-level confirmatory factor analysis (CFA) and model-based multilevel CFA with respect to model-fit test statistics/fit indices and estimates of the fixed and random effects, with the corresponding statistical inference, when analyzing multilevel data. Several design factors were considered, including cluster number, cluster size, intra-class correlation, and the structural equality of the between- and within-level models. The performance of a maximum modeling strategy, with a saturated higher-level model and the true lower-level model, was also examined. The simulation study showed that the design-based approach provided adequate results only under equal between/within structures; in the unequal between/within structure scenarios, it produced biased fixed- and random-effect estimates. Maximum modeling generated consistent and unbiased within-level model parameter estimates across three different scenarios. Multilevel latent growth curve modeling (MLGCM) is a versatile tool for analyzing repeated measures collected under multi-stage sampling, yet researchers often adopt latent growth curve models (LGCM) without considering the multilevel structure. The second study examined the influence of different model specifications on model-fit test statistics/fit indices, between/within-level regression coefficient and random-effect estimates, and mean structures. Simulation suggested that a design-based MLGCM incorporating the higher-level covariates produces consistent parameter estimates and statistical inferences comparable to those from the model-based MLGCM, and maintains adequate statistical power even with a small number of clusters.
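The design-based correction discussed above amounts to a cluster-robust sandwich covariance estimator. A self-contained sketch for ordinary least squares makes the "bread and meat" structure explicit (the SEM case is analogous but heavier; the clustered data below are simulated):

```python
import numpy as np

def ols_cluster_se(X, y, cluster):
    """OLS point estimates with a cluster-robust (sandwich) covariance:
    V = (X'X)^-1 [ sum_g X_g' u_g u_g' X_g ] (X'X)^-1."""
    XtX_inv = np.linalg.inv(X.T @ X)           # the "bread"
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta                           # residuals
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(cluster):               # accumulate per-cluster scores
        Xg, ug = X[cluster == g], u[cluster == g]
        s = Xg.T @ ug
        meat += np.outer(s, s)
    V = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(V))

# Simulated data: 40 clusters of 10, with a shared cluster effect that
# makes observations within a cluster dependent.
rng = np.random.default_rng(6)
cluster = np.repeat(np.arange(40), 10)
x = rng.normal(size=400)
re = rng.normal(size=40)[cluster]              # cluster random effect
y = 1.0 + 2.0 * x + re + rng.normal(size=400)
X = np.column_stack([np.ones(400), x])
beta, se = ols_cluster_se(X, y, cluster)
```

The point estimates are the usual OLS ones; only the standard errors change, which is exactly the design-based philosophy: correct the inference, not the model.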
10

Impact of Ignoring Nested Data Structures on Ability Estimation

Shropshire, Kevin O'Neil, 3 June 2014
The literature is clear that intentional or unintentional clustering of data elements typically inflates the estimated standard errors of fixed parameter estimates. This study is unique in that it examines the impact of multilevel data structures on subject-ability estimates, which are random-effect predictions known as empirical Bayes estimates in the one-parameter IRT / Rasch model. The literature on the impact of complex survey design on latent trait models is mixed, and there is no established best practice for handling this situation. A simulation study was conducted to address two questions related to ability estimation. First, what impact does design-based clustering have on the desirable statistical properties of subject-ability estimates in the one-parameter IRT / Rasch model? Second, since empirical Bayes estimators have shrinkage properties, what impact does clustering of first-stage sampling units have on measurement validity: does the first-stage sampling unit affect the ability estimate, and if so, is this desirable and equitable? Two models were fit in a factorial experimental design in which data were simulated over various conditions. The first model, a Rasch model formulated as an HGLM, ignores the sample design (the incorrect model), while the second incorporates a first-stage sampling unit (the correct model). Study findings generally showed that the two models were comparable with respect to desirable statistical properties under a majority of the replicated conditions; more measurement error in ability estimation is found when the intra-class correlation is high and the item pool is small, which in practice is the exception rather than the norm. However, the empirical Bayes estimates were found to depend on the first-stage sampling unit, raising issues of equity and fairness in educational decision-making. A real-world complex survey design with binary outcome data was also fit with both models. Analysis of these data supported the simulation results, leading to the conclusion that modeling binary Rasch data may entail a policy tradeoff between desirable statistical properties and measurement validity. Ph.D.
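For known item difficulties, the one-parameter Rasch ability estimate referred to above can be computed by solving the score equation with Newton-Raphson; the empirical Bayes estimates studied in the dissertation additionally shrink these ML values toward a group mean, which is where the dependence on the first-stage sampling unit enters. A sketch of the ML step only (the item difficulties below are invented):

```python
import numpy as np

def rasch_theta_mle(raw_score, difficulties, iters=50):
    """ML ability estimate under the Rasch model with known item
    difficulties b_j: solve raw_score = sum_j sigmoid(theta - b_j)
    by Newton-Raphson. Defined only for non-extreme raw scores
    (0 and the maximum score have no finite MLE)."""
    b = np.asarray(difficulties, dtype=float)
    theta = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(theta - b)))   # item response probabilities
        grad = raw_score - p.sum()               # score equation residual
        info = np.sum(p * (1.0 - p))             # Fisher information
        theta += grad / info
    return theta

# Five hypothetical items with symmetric difficulties.
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
thetas = [rasch_theta_mle(s, b) for s in range(1, 5)]  # raw scores 1..4
```

In the Rasch model the raw score is sufficient for ability, so the estimates are strictly increasing in the score; by the symmetry of these difficulties, a half-maximum score of 2.5 maps to theta = 0.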
