11

A Monte Carlo Study of Single Imputation in Survey Sampling

Xu, Nuo January 2013 (has links)
Missing values in sample surveys can lead to biased estimation if left untreated, and imputation has been posited as a popular way to deal with them. In this paper, based on the research of Särndal (1994, 2005), a Monte Carlo simulation is conducted to study how the estimators behave in different situations and how different imputation methods perform for different response distributions.
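The abstract does not include the simulation design itself, so the following is only a minimal sketch of how a Monte Carlo comparison of single-imputation estimators of a mean could be set up; the variables, response mechanism, and imputation rules here are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(2013)

def simulate_once(n=200, beta=2.0):
    """One Monte Carlo replicate: draw a sample, impose non-response,
    and estimate the mean of y under different single-imputation rules."""
    x = rng.uniform(1, 3, size=n)                  # auxiliary variable, fully observed
    y = beta * x + rng.normal(0, 0.5, size=n)      # study variable
    respond = rng.uniform(size=n) < 1 / (1 + np.exp(-(x - 2)))  # response prob. depends on x
    y_obs = y[respond]

    estimates = {"full_data": y.mean()}

    # Mean imputation: fill every missing y with the respondent mean.
    y_mean_imp = np.where(respond, y, y_obs.mean())
    estimates["mean_imputation"] = y_mean_imp.mean()

    # Regression imputation: fill missing y using a fit on respondents.
    b1, b0 = np.polyfit(x[respond], y_obs, deg=1)
    y_reg_imp = np.where(respond, y, b0 + b1 * x)
    estimates["regression_imputation"] = y_reg_imp.mean()
    return estimates

reps = [simulate_once() for _ in range(2000)]
true_mean = 2.0 * 2.0   # E[y] = beta * E[x] for x ~ U(1, 3)
for method in reps[0]:
    est = np.array([r[method] for r in reps])
    print(f"{method:>22}: bias={est.mean() - true_mean:+.4f}, sd={est.std():.4f}")
```

Averaging each estimator over many replicates against the known population mean gives the empirical bias and variance that this kind of study reports.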
12

An Investigation of Methods for Missing Data in Hierarchical Models for Discrete Data

Ahmed, Muhamad Rashid January 2011 (has links)
Hierarchical models are applicable to modeling data from complex surveys or longitudinal studies when a clustered or multistage sample design is employed. The focus of this thesis is to investigate inference for discrete hierarchical models in the presence of missing data. The thesis is divided into two parts. In the first part, methods are developed to analyze discrete and ordinal response data from hierarchical longitudinal studies. Several approximation methods have been developed to estimate the parameters for the fixed and random effects in the context of generalized linear models. The thesis focuses on two likelihood-based estimation procedures, the pseudo likelihood (PL) method and the adaptive Gaussian quadrature (AGQ) method. The simulation results suggest that AGQ is preferable to PL when the goal is to estimate the variance of the random intercept in a complex hierarchical model: AGQ provides smaller biases for the estimate of the variance of the random intercept, and it permits greater flexibility in accommodating user-defined likelihood functions. In the second part, simulated data are used to develop a method for modeling longitudinal binary data when non-response depends on unobserved responses. The simulation study modeled three-level discrete hierarchical data with 30% and 40% missing data under a missing not at random (MNAR) mechanism, focusing on a monotone missing-data pattern. The imputation methods used in this thesis are: complete case analysis (CCA), last observation carried forward (LOCF), available case missing value (ACMVPM) restriction, complete case missing value (CCMVPM) restriction, neighboring case missing value (NCMVPM) restriction, selection model with predictive mean matching (SMPM), and a Bayesian pattern-mixture model. All three restriction methods and the selection model used predictive mean matching to impute missing data. Multiple imputation is used to impute the missing values: the m imputed values for each missing observation produce m complete datasets, each dataset is analyzed and the parameters are estimated, and the results from the m analyses are then combined using the method of Rubin (1987), from which inferences are made. Our results suggest that the restriction methods provide results superior to those of the other methods. The selection model provides smaller biases than LOCF, but as the proportion of missing data increases the selection model is no longer better than LOCF. Among the three restriction methods, the ACMVPM method performs best. The proposed method provides an alternative to standard selection and pattern-mixture modeling frameworks when data are not missing at random. The method is applied to data from the third Waterloo Smoking Project, a seven-year smoking prevention study with substantial non-response due to loss to follow-up.
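For context on the combination step described above, a minimal sketch of Rubin's (1987) combining rules is shown below; the numeric inputs are hypothetical, and the thesis's own pooling may include additional refinements.

```python
import numpy as np
from scipy import stats

def rubins_rules(estimates, variances, alpha=0.05):
    """Combine m per-imputation estimates and within-imputation variances
    using Rubin's (1987) rules; returns the pooled estimate, total variance,
    and a confidence interval based on the approximate t reference distribution."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)
    q_bar = q.mean()                       # pooled point estimate
    w_bar = u.mean()                       # average within-imputation variance
    b = q.var(ddof=1)                      # between-imputation variance
    t_var = w_bar + (1 + 1 / m) * b        # total variance
    # Degrees of freedom (Rubin 1987, without the small-sample correction).
    df = (m - 1) * (1 + w_bar / ((1 + 1 / m) * b)) ** 2
    half = stats.t.ppf(1 - alpha / 2, df) * np.sqrt(t_var)
    return q_bar, t_var, (q_bar - half, q_bar + half)

# Hypothetical estimates of a regression coefficient from m = 5 imputed datasets.
print(rubins_rules([0.42, 0.39, 0.45, 0.41, 0.44],
                   [0.010, 0.011, 0.009, 0.010, 0.012]))
```

The between-imputation term (1 + 1/m)B is what reflects the extra uncertainty due to the missing data; ignoring it would understate the standard error of the pooled estimate.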
13

The handling, analysis and reporting of missing data in patient reported outcome measures for randomised controlled trials

Rombach, Ines January 2016 (has links)
Missing data is a potential source of bias in the results of randomised controlled trials (RCTs), which can have a negative impact on guidance derived from them, and ultimately on patient care. This thesis aims to improve the understanding, handling, analysis and reporting of missing data in patient reported outcome measures (PROMs) for RCTs. A review of the literature provided evidence of discrepancies between recommended methodology and current practice in the handling and reporting of missing data. In particular, missed opportunities to minimise missing data, the use of inappropriate analytical methods and a lack of sensitivity analyses were noted. Missing data patterns were examined and found to vary between PROMs as well as across RCTs. Separate analyses illustrated difficulties in predicting missing data, resulting in uncertainty about the assumed underlying missing-data mechanisms. Simulation work was used to assess the comparative performance of statistical approaches for handling missing data that are available in standard statistical software. Multiple imputation (MI) at either the item, subscale or composite score level was considered for missing PROMs data at a single follow-up time point. The choice of an MI approach depended on a multitude of factors, with MI at the item level being more beneficial than its alternatives for high proportions of item missingness. The approaches performed similarly for high proportions of unit non-response; however, convergence issues were observed for MI at the item level. Maximum likelihood (ML), MI and inverse probability weighting (IPW) were evaluated for handling missing longitudinal PROMs data. MI was less biased than ML when additional post-randomisation data were available, while IPW introduced more bias compared to both ML and MI. A case study was used to explore approaches to sensitivity analyses that assess the impact of missing data. It was found that trial results could be susceptible to varying assumptions about missing data, and the importance of interpreting the results in this context was reiterated. This thesis provides researchers with guidance for the handling and reporting of missing PROMs data in order to decrease bias arising from missing data in RCTs.
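As background for the inverse probability weighting (IPW) comparison above, the sketch below illustrates the basic reweighted complete-case analysis for a single follow-up outcome; the trial variables and response model are hypothetical, and the thesis evaluates IPW in a richer longitudinal setting.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical trial data: `arm` and `baseline` are fully observed,
# `outcome` (a PROM score at follow-up) is missing for some participants.
rng = np.random.default_rng(7)
n = 500
df = pd.DataFrame({"arm": rng.integers(0, 2, n), "baseline": rng.normal(50, 10, n)})
df["outcome"] = 0.3 * df["baseline"] + 5 * df["arm"] + rng.normal(0, 8, n)
p_respond = 1 / (1 + np.exp(-(0.05 * (df["baseline"] - 50) + 0.5)))
df.loc[rng.uniform(size=n) > p_respond, "outcome"] = np.nan

# Step 1: model the probability of being observed, given fully observed covariates.
observed = df["outcome"].notna().astype(int)
exog = sm.add_constant(df[["arm", "baseline"]])
ps_model = sm.Logit(observed, exog).fit(disp=0)
weights = 1.0 / np.asarray(ps_model.predict(exog))

# Step 2: weighted complete-case analysis of the treatment effect.
cc = df["outcome"].notna().to_numpy()
wls = sm.WLS(df.loc[cc, "outcome"],
             sm.add_constant(df.loc[cc, ["arm", "baseline"]]),
             weights=weights[cc]).fit()
print(wls.params["arm"])   # IPW estimate of the arm effect
```

The weighting makes the complete cases stand in for participants with similar covariates who did not respond, which is valid only if the response model is correctly specified.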
14

Applying missing data methods to routine data using the example of a population-based register of patients with diabetes

Read, Stephanie Helen January 2015 (has links)
Background: Routinely collected data offer great potential for epidemiological research and could be used to make randomised controlled trials (RCTs) more efficient. The use of routine data for research has been limited by concerns surrounding data quality, particularly data completeness. To fully exploit these information-rich data sources it is necessary to identify approaches capable of overcoming high proportions of missing data. Using a 2008 extract of the Scottish Care Information – Diabetes Collaboration (SCI-DC) database, a population-based register of people with a diagnosis of diabetes in Scotland, I compared the findings of several methods for handling missing data in a retrospective cohort study investigating the association between body mass index (BMI) and all-cause mortality in patients with type 2 diabetes. Methods: Discussions with clinicians and logistic regression analyses were used to determine the likely mechanisms of missingness and the relative appropriateness of a selection of missing-data methods, such as multiple imputation. Sequentially more complicated imputation approaches were used to handle the missing data. Cox proportional hazards model coefficients for the association between BMI and all-cause mortality were compared for each missing-data method. Age-standardised mortality rates by category of BMI at around the time of diagnosis were also presented. Results: There were 66,472 patients diagnosed with type 2 diabetes between 2004 and 2008. Of these, 21% did not have a recording of BMI at the time of diagnosis. Amongst patients with complete BMI data, there were 5,491 deaths during 296,584 person-years of follow-up. Amongst patients with incomplete data, there were 2,090 deaths during 79,067 person-years of follow-up. Analyses indicated that the primary mechanism of missingness was missing at random, conditional on patient year of diagnosis and vital status. In particular, patients with missing data had considerably worse survival than patients without missing data. Regardless of the method for handling the missing data, a U-shaped relationship between BMI and mortality was observed. Compared to complete case analysis, the association between BMI and all-cause mortality was weaker using multiple imputation approaches, with estimates moving towards the null. Closest-observation imputation had the smallest effect on estimates compared to complete case analysis. Risk of mortality was consistently highest in the below 25 kg/m² BMI group. For example, estimates obtained using multiple imputation using chained equations indicated that patients with a BMI below 25 kg/m² had a 38% higher risk of mortality than patients in the 25 to less than 30 kg/m² BMI category. Conclusions: Alternative methods to complete case analysis can be computationally intensive, with many important practical considerations. However, it remains valuable to explore the robustness of estimates to departures from the assumptions made by complete case analysis. The use of these methods can preserve the sample size and therefore may be useful in developing risk prediction scores. Mortality was lowest amongst overweight or obese patients relative to normal-weight patients. Further work is required to identify optimal approaches to weight management amongst patients with diabetes.
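The abstract refers to multiple imputation using chained equations for the missing BMI values; a generic sketch of that workflow is given below, with hypothetical variable names and scikit-learn's IterativeImputer standing in for a full MICE implementation rather than reproducing the thesis's models.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (enables the estimator)
from sklearn.impute import IterativeImputer

# Hypothetical analysis dataset: BMI is partly missing; age and HbA1c are complete.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({"age": rng.normal(62, 10, n), "hba1c": rng.normal(7.5, 1.2, n)})
df["bmi"] = 22 + 0.05 * df["age"] + 0.8 * df["hba1c"] + rng.normal(0, 3, n)
df.loc[rng.uniform(size=n) < 0.21, "bmi"] = np.nan   # ~21% missing, as in the abstract

m = 10
imputed_sets = []
for i in range(m):
    # sample_posterior=True draws from the predictive distribution, so the
    # m completed datasets differ, mimicking chained-equations MI.
    imputer = IterativeImputer(sample_posterior=True, random_state=i, max_iter=10)
    completed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
    imputed_sets.append(completed)

# Each completed dataset would then be analysed (e.g. a survival model for mortality)
# and the m sets of coefficients combined with Rubin's rules, as sketched earlier.
print(len(imputed_sets), imputed_sets[0]["bmi"].isna().sum())
```

The key point is that every completed dataset is analysed with the same substantive model, and only the pooled results are reported.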
15

Study on a Hierarchy Model

Che, Suisui 23 March 2012 (has links)
Statistical inference about the parameters of a Binomial-Poisson hierarchical model is discussed. Starting from the estimators based on paired observations, we consider two further cases, with extra observations on the first and on the second layer of the model, respectively. The MLEs of lambda and p are derived, and it is proved that the MLE of lambda is also the UMVUE of lambda. Using the multivariate central limit theorem and large-sample theory, the estimators based on extra observations on the first and on the second layer are obtained, respectively. The performances of the estimators are compared numerically in extensive Monte Carlo simulations, which indicate that these estimators are more efficient than those based only on paired observations. Inference about the confidence interval for p is presented for both cases, and the efficiency of the estimators is compared under the condition that the same number of extra observations is provided.
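For reference, under the usual paired-observation formulation assumed here (that is, $N_i \sim \mathrm{Poisson}(\lambda)$ and $X_i \mid N_i \sim \mathrm{Binomial}(N_i, p)$ for $i = 1, \dots, n$), the maximum likelihood estimators take the standard closed forms below; the thesis's extra-observation cases extend these.

```latex
% Joint likelihood of the paired observations (N_i, X_i):
L(\lambda, p) = \prod_{i=1}^{n} \frac{e^{-\lambda}\lambda^{N_i}}{N_i!}
                \binom{N_i}{X_i} p^{X_i} (1-p)^{N_i - X_i},
\qquad
\hat{\lambda} = \frac{1}{n}\sum_{i=1}^{n} N_i,
\qquad
\hat{p} = \frac{\sum_{i=1}^{n} X_i}{\sum_{i=1}^{n} N_i}.
```

Since $\bar{N}$ is an unbiased function of the complete sufficient statistic $\sum_i N_i$, it is also the UMVUE of $\lambda$, which is the property the abstract cites.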
16

Longitudinal Data Analysis Using Generalized Linear Model with Missing Responses

Park, Jeanseong January 2015 (has links)
Longitudinal studies rely on data collected on several occasions from a set of selected individuals. The purpose of these studies is to use a regression-type model to express a response variable as a function of explanatory variables, or covariates. In this thesis, we use marginal models for the analysis of such data, which, coupled with the method of estimating equations, provide estimators of the main regression parameter. When some of the responses are missing or there is error in the recorded covariates, the original estimating equation may be biased. We use techniques available in the literature to modify it and regain the unbiasedness property. We prove the asymptotic normality of the regression estimator obtained under these more realistic circumstances, and provide theoretical and numerical examples to illustrate this approach.
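One common form of the correction for missing responses mentioned above is the inverse-probability-weighted estimating equation sketched below; the notation is assumed here for illustration and is not necessarily the exact estimator studied in the thesis, which also treats covariate measurement error.

```latex
% R_{ij} = 1 if the j-th response of subject i is observed,
% \pi_{ij} = P(R_{ij} = 1 \mid \text{observed data}), \quad
% \Delta_i = \mathrm{diag}(R_{i1}/\pi_{i1}, \dots, R_{im}/\pi_{im}).
\sum_{i=1}^{n} D_i^{\top} V_i^{-1} \Delta_i \bigl( Y_i - \mu_i(\beta) \bigr) = 0,
\qquad D_i = \frac{\partial \mu_i(\beta)}{\partial \beta^{\top}}.
```

When the response probabilities $\pi_{ij}$ are correctly modelled, the weighted score has expectation zero, which restores the unbiasedness lost by using only the observed responses.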
17

Comparative evaluation of methods that adjust for reporting biases in participatory surveillance systems

Baltrusaitis, Kristin 12 November 2019 (has links)
Over the past decade, the widespread proliferation of mobile devices and wearable technology has significantly changed the landscape of epidemiological data gathering and given rise to a field known as Digital Epidemiology. One source of active digital data collection is online participatory syndromic surveillance systems. These systems actively engage the general public in reporting health-related information and provide timely information about disease trends within the community. This dissertation comprehensively addresses how researchers can effectively use this type of data to answer questions about Influenza-like Illness (ILI) disease burden in the general population. We assess the representativeness and reporting habits of volunteers for these systems and use this information to develop statistically rigorous methods that adjust for potential biases. Specifically, we evaluate how different missing-data methods, such as complete case analysis and multiple imputation models, affect estimates of ILI disease burden using both simulated data and data from the Australian system, Flutracking.net. We then extend these methods to data from the American system, Flu Near You, which has different reporting patterns. Finally, we provide examples of how these data have been used to answer questions about ILI in the general community and to promote better understanding of disease surveillance and data literacy among volunteers.
18

New statistical methods for the evaluation of effectiveness and safety of a medical intervention using observational data

Zhan, Jia 05 December 2016 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Observational studies offer unique advantages over randomized clinical trials (RCTs) in many situations where RCTs are not feasible or suffer from major limitations such as insufficient sample sizes and narrowly focused populations. Because observational data are relatively easy and inexpensive to access, and contain rich and comprehensive demographic and medical information on large and representative populations, they have played a major role in the assessment of the effectiveness and safety of medical interventions. However, observational data also present the challenges of higher rates of missing data and confounding. This dissertation develops three statistical methods to address these challenges. The first method refines and extends a multiply robust (MR) estimation procedure that simultaneously accounts for confounding and the missing-covariate process: we derived the asymptotic variance estimator and extended the method to the scenario where the missing covariate is continuous. The second method focuses on improving estimation precision in an RCT by using a historical control cohort. This was achieved by augmenting the conventional effect estimator with an extra, approximately mean-zero term that is correlated with the conventional effect estimator. In the third method, we calibrated the hidden database bias of an electronic medical records database and used an empirical Bayes method to improve the accuracy of estimating the risk of acute myocardial infarction associated with a drug by borrowing information from other drugs.
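The augmentation described in the second method is in the spirit of the generic control-variate identity sketched below; this is offered only as orientation and is not necessarily the dissertation's exact construction.

```latex
\hat{\theta}_{\text{aug}} = \hat{\theta} + c\,Z, \qquad \mathbb{E}[Z] \approx 0, \qquad
c_{\text{opt}} = -\frac{\operatorname{Cov}(\hat{\theta}, Z)}{\operatorname{Var}(Z)}, \qquad
\operatorname{Var}\!\bigl(\hat{\theta}_{\text{aug}}\bigr)
  = \operatorname{Var}(\hat{\theta})\,\bigl(1 - \rho^{2}(\hat{\theta}, Z)\bigr),
```

where $Z$ would be built from the historical control cohort; the stronger the correlation between $Z$ and the conventional estimator, the larger the gain in precision.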
19

The Role of Missing Data Imputation in Clinical Studies

Peng, Zhimin January 2018 (has links)
No description available.
20

The wild bootstrap resampling in regression imputation algorithm with a Gaussian Mixture Model

Mat Jasin, A., Neagu, Daniel, Csenki, Attila 08 July 2018 (has links)
Unsupervised learning of a finite Gaussian mixture model (FGMM) is used to learn the distribution of the population data. This paper proposes the use of wild bootstrapping to create variability in the imputed data in single missing-data imputation. We compare the performance and accuracy of the proposed single-imputation method with multiple imputation from the R package Amelia II using RMSE, R-squared, MAE and MAPE. The proposed method shows better performance when compared with multiple imputation (MI), which is widely regarded as the gold standard among missing-data imputation techniques.
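The abstract describes wild-bootstrap perturbation of regression-imputed values; the sketch below is a simplified illustration only (a single imputation variable, donor residuals resampled and then perturbed by Rademacher weights), whereas the paper's exact scheme is built on an FGMM fitted to the joint distribution.

```python
import numpy as np

rng = np.random.default_rng(42)

def wild_bootstrap_regression_impute(x, y):
    """Impute missing y by a regression on x, then add a wild-bootstrap
    residual (a resampled observed residual times a random sign) so that
    imputed values are not all forced onto the regression line."""
    observed = ~np.isnan(y)
    b1, b0 = np.polyfit(x[observed], y[observed], deg=1)   # fit on complete cases
    fitted = b0 + b1 * x
    residuals = y[observed] - fitted[observed]

    y_imp = y.copy()
    n_miss = (~observed).sum()
    e_star = rng.choice(residuals, size=n_miss, replace=True)  # donor residuals
    v = rng.choice([-1.0, 1.0], size=n_miss)                   # Rademacher weights
    y_imp[~observed] = fitted[~observed] + e_star * v
    return y_imp

# Toy data with roughly 30% of y missing.
x = rng.normal(0, 1, 300)
y = 1.5 * x + rng.normal(0, 1, 300)
y[rng.uniform(size=300) < 0.3] = np.nan
print(np.nanstd(y), wild_bootstrap_regression_impute(x, y).std())
```

The random perturbation is what preserves the spread of the imputed variable, which plain regression imputation would otherwise understate.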
