Global ETD Search

61	Missing Data Methods for Clustered Longitudinal Data Modur, Sharada P. 30 August 2010 (has links) No description available. Statistics Longitudinal models multilevel models missing data analysis
62	A Comparison of Last Observation Carried Forward and Multiple Imputation in a Longitudinal Clinical Trial Carmack, Tara Lynn 25 June 2012 (has links) No description available. Biostatistics LOCF Multiple Imputation missing data randomized clinical trials
63	Statistical inferences for missing data/causal inferences based on modified empirical likelihood Sharghi, Sima 01 September 2021 (has links) No description available. Statistics
64	A phylogeny and revised classification of Squamata, including 4161 species of lizards and snakes Pyron, R., Burbrink, Frank, Wiens, John January 2013 (has links) BACKGROUND: The extant squamates (>9400 known species of lizards and snakes) are one of the most diverse and conspicuous radiations of terrestrial vertebrates, but no studies have attempted to reconstruct a phylogeny for the group with large-scale taxon sampling. Such an estimate is invaluable for comparative evolutionary studies, and to address their classification. Here, we present the first large-scale phylogenetic estimate for Squamata. RESULTS: The estimated phylogeny contains 4161 species, representing all currently recognized families and subfamilies. The analysis is based on up to 12896 base pairs of sequence data per species (average = 2497 bp) from 12 genes, including seven nuclear loci (BDNF, c-mos, NT3, PDC, R35, RAG-1, and RAG-2), and five mitochondrial genes (12S, 16S, cytochrome b, ND2, and ND4). The tree provides important confirmation for recent estimates of higher-level squamate phylogeny based on molecular data (but with more limited taxon sampling), estimates that are very different from previous morphology-based hypotheses. The tree also includes many relationships that differ from previous molecular estimates and many that differ from traditional taxonomy. CONCLUSIONS: We present a new large-scale phylogeny of squamate reptiles that should be a valuable resource for future comparative studies. We also present a revised classification of squamates at the family and subfamily level to bring the taxonomy more in line with the new phylogenetic hypothesis. This classification includes new, resurrected, and modified subfamilies within gymnophthalmid and scincid lizards, and boid, colubrid, and lamprophiid snakes. Amphisbaenia Lacertilia Likelihood support measures Missing data Serpentes Squamata Phylogenetics Reptilia Supermatrices Systematics
65	Statistical Approaches for Handling Missing Data in Cluster Randomized Trials Fiero, Mallorie H. January 2016 (has links) In cluster randomized trials (CRTs), groups of participants are randomized as opposed to individual participants. This design is often chosen to minimize treatment arm contamination or to enhance compliance among participants. In CRTs, we cannot assume independence among individuals within the same cluster because of their similarity, which leads to decreased statistical power compared to individually randomized trials. The intracluster correlation coefficient (ICC) is crucial in the design and analysis of CRTs, and measures the proportion of total variance due to clustering. Missing data are a common problem in CRTs and should be accommodated with appropriate statistical techniques because they can compromise the advantages created by randomization and are a potential source of bias. In three papers, I investigate statistical approaches for handling missing data in CRTs. In the first paper, I carry out a systematic review evaluating current practice of handling missing data in CRTs. The results show high rates of missing data in the majority of CRTs, yet handling of missing data remains suboptimal. Fourteen (16%) of the 86 reviewed trials reported carrying out a sensitivity analysis for missing data. Despite suggestions to weaken the missing data assumption from the primary analysis, only five of the trials weakened the assumption. None of the trials reported using missing not at random (MNAR) models. Due to the low proportion of CRTs reporting an appropriate sensitivity analysis for missing data, the second paper aims to facilitate performing a sensitivity analysis for missing data in CRTs by extending the pattern mixture approach for missing clustered data under the MNAR assumption.
I implement multilevel multiple imputation (MI) in order to account for the hierarchical structure found in CRTs, and multiply imputed values by a sensitivity parameter, k, to examine parameters of interest under different missing data assumptions. The simulation results show that estimates of parameters of interest in CRTs can vary widely under different missing data assumptions. A high proportion of missing data can occur among CRTs because missing data can be found at the individual level as well as the cluster level. In the third paper, I use a simulation study to compare missing data strategies to handle missing cluster level covariates, including the linear mixed effects model, single imputation, single level MI ignoring clustering, MI incorporating clusters as fixed effects, and MI at the cluster level using aggregated data. The results show that when the ICC is small (ICC ≤ 0.1) and the proportion of missing data is low (≤ 25%), the mixed model generates unbiased estimates of regression coefficients and ICC. When the ICC is higher (ICC > 0.1), MI at the cluster level using aggregated data performs well for missing cluster level covariates, though caution should be taken if the percentage of missing data is high. Dropout Missing data Multiple imputation Pattern mixture model Sensitivity analysis Biostatistics Cluster randomized trials
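As a rough illustration of two quantities named in the abstract above, the ICC as the share of total variance due to clustering and the rescaling of imputed values by a sensitivity parameter k, the following Python sketch may help. The function names, the toy numbers, and the use of a simple mean as the parameter of interest are assumptions made for illustration only; the multilevel multiple imputation used in the thesis is not reproduced here.

```python
def icc(between_var, within_var):
    """Proportion of total variance that is due to clustering."""
    return between_var / (between_var + within_var)


def mean_after_scaling(observed, imputed_mar, k):
    """Mean outcome after multiplying the imputed values by k.

    k = 1 leaves the imputations unchanged; other values of k move the
    imputed values away from what the imputation model produced.
    """
    values = list(observed) + [k * v for v in imputed_mar]
    return sum(values) / len(values)


# Hypothetical toy data: three observed outcomes, two imputed outcomes.
observed = [10.0, 12.0, 14.0]
imputed_mar = [12.0, 12.0]

print("ICC:", icc(between_var=1.0, within_var=9.0))
for k in (0.5, 1.0, 1.5):
    print("k =", k, "mean =", mean_after_scaling(observed, imputed_mar, k))
```

Running the loop over several values of k shows how far the estimate moves as the assumption about the missing values is varied, which is the basic idea behind this kind of sensitivity analysis.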
66	Multiple Imputation Methods for Nonignorable Nonresponse, Adaptive Survey Design, and Dissemination of Synthetic Geographies Paiva, Thais Viana January 2014 (has links) This thesis presents methods for multiple imputation that can be applied to missing data and data with confidential variables. Imputation is useful for missing data because it results in a data set that can be analyzed with complete data statistical methods. The missing data are filled in by values generated from a model fit to the observed data. The model specification will depend on the observed data pattern and the missing data mechanism. For example, when the reason why the data is missing is related to the outcome of interest, that is nonignorable missingness, we need to alter the model fit to the observed data to generate the imputed values from a different distribution. Imputation is also used for generating synthetic values for data sets with disclosure restrictions. Since the synthetic values are not actual observations, they can be released for statistical analysis. The interest is in fitting a model that approximates well the relationships in the original data, keeping the utility of the synthetic data, while preserving the confidentiality of the original data. We consider applications of these methods to data from social sciences and epidemiology. The first method is for imputation of multivariate continuous data with nonignorable missingness. Regular imputation methods have been used to deal with nonresponse in several types of survey data. However, in some of these studies, the assumption of missing at random is not valid since the probability of missing depends on the response variable. We propose an imputation method for multivariate data sets when there is nonignorable missingness. We fit a truncated Dirichlet process mixture of multivariate normals to the observed data under a Bayesian framework to provide flexibility.
With the posterior samples from the mixture model, an analyst can alter the estimated distribution to obtain imputed data under different scenarios. To facilitate that, I developed an R application that allows the user to alter the values of the mixture parameters and visualize the imputation results automatically. I demonstrate this process of sensitivity analysis with an application to the Colombian Annual Manufacturing Survey. I also include a simulation study to show that the correct complete data distribution can be recovered if the true missing data mechanism is known, thus validating that the method can be meaningfully interpreted to do sensitivity analysis. The second method uses the imputation techniques for nonignorable missingness to implement a procedure for adaptive design in surveys. Specifically, I develop a procedure that agencies can use to evaluate whether or not it is effective to stop data collection. This decision is based on utility measures to compare the data collected so far with potential follow-up samples. The options are assessed by imputation of the nonrespondents under different missingness scenarios considered by the analyst. The variation in the utility measures is compared to the cost induced by the follow-up sample sizes. We apply the proposed method to the 2007 U.S. Census of Manufactures. The third method is for imputation of confidential data sets with spatial locations using disease mapping models. We consider data that include fine geographic information, such as census tract or street block identifiers. This type of data can be difficult to release as public use files, since fine geography provides information that ill-intentioned data users can use to identify individuals. We propose to release data with simulated geographies, so as to enable spatial analyses while reducing disclosure risks.
We fit disease mapping models that predict areal-level counts from attributes in the file, and sample new locations based on the estimated models. I illustrate this approach using data on causes of death in North Carolina, including evaluations of the disclosure risks and analytic validity that can result from releasing synthetic geographies. Dissertation Statistics Confidential data Missing data Multiple Imputation Nonignorable missingness Spatial model Synthetic data
67	Estimation of Aerodynamic Parameters in Real-Time : Implementation and Comparison of a Sequential Frequency Domain Method and a Batch Method Nyman, Lina January 2016 (has links) The flight testing and evaluation of collected data must be efficient during intensive flight-test programs such as the ones conducted during development of new aircraft. The aim of this thesis has thus been to produce a first version of an aerodynamic derivative estimation program that is to be used during real-time flight tests. The program is to give a first estimate of the aerodynamic derivatives as well as check the quality of the data collected and thus serve as a decision support during tests. The work that has been performed includes processing of data in order to use it in computations, comparing a batch and a sequential estimation method using real-time data, and programming a user interface. All computations and programming have been done in Matlab. The estimation methods that have been compared are both built on transforming data to the frequency domain using a Chirp z-transform and then estimating the aerodynamic derivatives using complex least squares with instrumental variables. The sequential frequency domain method performs estimates at a given interval while the batch method performs one estimation at the end of the maneuver. Both methods compared in this thesis produce equal results. The continuous updates of the sequential method were, however, found to be better suited for a real-time application than the single estimation of the batch method. The telemetric data received from the aircraft must be synchronized to a common frequency of 60 Hz. Missing samples of the data stream must be linearly interpolated and different units of measured parameters must be corrected in order to be able to perform these estimations in the real-time test environment. Aerodynamic coefficients least squares chirp z-transform instrumental variables missing data
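The linear interpolation of missing samples mentioned in the abstract above can be sketched as follows. This is a minimal Python illustration, not the Matlab code from the thesis; the function name, the use of `None` to mark a missing sample, and the toy 60 Hz time stamps are assumptions.

```python
def fill_missing_linear(times, values):
    """Fill interior gaps (marked None) by linear interpolation.

    Gaps before the first or after the last valid sample are left as None,
    because there is no neighbour on one side to interpolate from.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        span = times[right] - times[left]
        for i in range(left + 1, right):
            weight = (times[i] - times[left]) / span
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


# Hypothetical stream sampled at 60 Hz with one dropped sample.
times = [n / 60 for n in range(4)]
values = [0.0, None, 2.0, 3.0]
print(fill_missing_linear(times, values))
```

Interpolating on the time stamps rather than on the sample index keeps the sketch valid when the gaps are of unequal length.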
68	LATENT VARIABLE MODELS GIVEN INCOMPLETELY OBSERVED SURROGATE OUTCOMES AND COVARIATES Ren, Chunfeng 01 January 2014 (has links) Latent variable models (LVMs) are commonly used in the scenario where the outcome of main interest is an unobservable measure, associated with multiple observed surrogate outcomes, and affected by potential risk factors. This thesis develops an approach for efficiently handling missing surrogate outcomes and covariates in two- and three-level latent variable models. Statistical methodologies and computational software for efficiently analyzing LVMs given surrogate outcomes and covariates subject to missingness are otherwise lacking. We analyze the two-level LVMs for longitudinal data from the National Growth of Health Study where surrogate outcomes and covariates are subject to missingness at any of the levels. A conventional method for efficient handling of missing data is to reexpress the desired model as a joint distribution of variables, including the surrogate outcomes that are subject to missingness conditional on all of the covariates that are completely observable, and estimate the joint model by maximum likelihood, which is then transformed to the desired model. The joint model, however, identifies more parameters than desired, in general. The over-identified joint model produces biased estimates of LVMs, so it is necessary to describe how to impose constraints on the joint model so that it has a one-to-one correspondence with the desired model for unbiased estimation. The constrained joint model handles missing data efficiently under the assumption of ignorable missing data and is estimated by a modified application of the expectation-maximization (EM) algorithm. Biostatistics
69	A Comparison for Longitudinal Data Missing Due to Truncation Liu, Rong 01 January 2006 (has links) Many longitudinal clinical studies suffer from patient dropout. Often the dropout is nonignorable and the missing mechanism needs to be incorporated in the analysis. The methods handling missing data make various assumptions about the missing mechanism, and their utility in practice depends on whether these assumptions apply in a specific application. Ramakrishnan and Wang (2005) proposed a method (MDT) to handle nonignorable missing data, where missing is due to the observations exceeding an unobserved threshold. Assuming that the observations arise from a truncated normal distribution, they suggested an EM algorithm to simplify the estimation. In this dissertation the EM algorithm is implemented for the MDT method when data may include missing at random (MAR) cases. A data set where the missing data occur due to clinical deterioration and/or improvement is considered for illustration. The missing data are observed at both ends of the truncated normal distribution. A simulation study is conducted to compare the performance of other relevant methods. The factors chosen for the simulation study included the missing data mechanisms, the forms of response functions, missing at one or two time points, dropout rates, sample sizes and different correlations with AR(1) structure. It was found that the choice of the method for dealing with the missing data is important, especially when a large proportion is missing. The MDT method seems to perform the best when there is reason to believe that the assumption of truncated normal distribution is appropriate. A multiple imputation (MI) procedure under the MDT method to accommodate the uncertainty introduced by imputation is also proposed. The proposed method combines the MDT method with Rubin's (1987) MI method. A procedure to implement the MI method is described.
missing data data sets patient dropout algorithm simulation MDT Biostatistics Physical Sciences and Mathematics Statistics and Probability
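For readers unfamiliar with the combining step in Rubin's MI method referred to in the abstract above, the standard pooling rules can be sketched in Python as below. The function name and the toy inputs are assumptions, and the sketch covers only the generic pooling of per-imputation estimates, not the MDT-specific imputation step proposed in the dissertation.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and their variances.

    Returns the pooled estimate and the total variance
    T = W + (1 + 1/m) * B, where W is the average within-imputation
    variance and B is the between-imputation (sample) variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, denominator m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical results from m = 3 imputed data sets.
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, total_var)
```

The between-imputation term is what carries the extra uncertainty introduced by imputation into the final variance.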
70	Bushing diagnosis using artificial intelligence and dissolved gas analysis Dhlamini, Sizwe Magiya 20 June 2008 (has links) This dissertation is a study of artificial intelligence for diagnosing the condition of high voltage bushings. The techniques include neural networks, genetic algorithms, fuzzy set theory, particle swarm optimisation, multi-classifier systems, factor analysis, principal component analysis, multidimensional scaling, data-fusion techniques, automatic relevance determination and autoencoders. The classification is done using Dissolved Gas Analysis (DGA) data based on field experience together with criteria from IEEE C57.104 and IEC 60599. A review of current literature showed that common methods for the diagnosis of bushings are: partial discharge, DGA, tan-δ (dielectric dissipation factor), water content in oil, dielectric strength of oil, acidity level (neutralisation value), visual analysis of sludge in suspension, colour of the oil, furanic content, degree of polymerisation (DP), strength of the insulating paper, interfacial tension or oxygen content tests. All the methods have limitations in terms of time and accuracy in decision making. The fact that decisions made using each of these methods individually are highly subjective, together with the huge size of the database of historical data and the loss of skills due to the retirement of experienced technical staff, highlights the need for an automated diagnosis tool that integrates information from the many sensors, recalls the historical decisions and learns from new information. Three classifiers that are compared in this analysis are radial basis functions (RBF), multiple layer perceptrons (MLP) and support vector machines (SVM). In this work 60699 bushings were classified based on ten criteria. Classification was done based on a majority vote.
The work proposes the application of neural networks with particle swarm optimisation (PSO) and genetic algorithms (GA) to compensate for missing data in classifying high voltage bushings. The work also proposes the application of fuzzy set theory (FST) to diagnose the condition of high voltage bushings. The relevance and redundancy detection methods were able to prune the redundant measured variables and accurately diagnose the condition of the bushing with fewer variables. Experimental results from bushings that were evaluated in the field verified the simulations. The results of this work can help to develop real-time monitoring and decision making tools that combine information from chemical, electrical and mechanical measurements taken from bushings. Bushings Oil impregnated paper neural networks fuzzy set theory missing data relevance determination
