91 |
Model-based Multiple Imputation by Chained-equations for Multilevel Data below the Limit of Detection
Xu, Peixin 24 May 2022
No description available.
|
92 |
Placebo response characteristic in sequential parallel comparison design studies
Rybin, Denis V. 13 February 2016
The placebo response can affect inference in the analysis of data from clinical trials. It can bias the estimate of the treatment effect, jeopardize the effort of all involved in a clinical trial, and ultimately deprive patients of potentially efficacious treatment. The Sequential Parallel Comparison Design (SPCD) is one of the novel approaches addressing placebo response in clinical trials. The analysis of SPCD clinical trial data typically involves classification of subjects as ‘placebo responders’ or ‘placebo non-responders’. This classification is done using a specific criterion, and placebo response is treated as a measurable characteristic. However, the use of a criterion may lead to subject misclassification due to measurement error or incorrect criterion selection. Misclassification can, in turn, directly affect the SPCD treatment effect estimate. We propose to view placebo response as an unknown random characteristic that can be estimated based on information collected during the trial. Two strategies are presented here. The first strategy is to model placebo response using criterion classification as a starting point or the observed data, and to include the placebo response estimate in the treatment effect estimation. The second strategy is to jointly model latent placebo response and the observed data, and to estimate the treatment effect from the joint model. We evaluate both strategies on a wide range of simulated data scenarios in terms of type I error control, mean squared error, and power. We then evaluate the strategies in the presence of missing data and propose a method for missing data imputation under the non-informative missingness assumption. Data from a recent SPCD clinical trial are used to compare the results of the proposed methods with the reported results of the trial. / 2018-01-01T00:00:00Z
|
93 |
Impact de l’échantillonnage sur l’inférence de structures dans les réseaux : application aux réseaux d’échanges de graines et à l’écologie / Impact of sampling on structure inference in networks: application to seed exchange networks and to ecology
Tabouy, Timothée 30 September 2019
In this thesis we study the stochastic block model (SBM) in the presence of missing data. We propose a classification of missing data into two categories, Missing At Random and Not Missing At Random, for latent variable models, following the framework described by D. Rubin. In addition, we describe several network sampling strategies and their distributions. Inference for SBMs with missing data is carried out through an adaptation of the EM algorithm: EM with variational approximation. The identifiability of several SBMs with missing data is demonstrated, as well as the consistency and asymptotic normality of the maximum likelihood estimators and of the variational approximation estimators in the case where each dyad (pair of nodes) is sampled independently and with equal probability. We also consider SBMs with covariates, their inference in the presence of missing data, and how to proceed when covariates are not available to conduct the inference. Finally, all our methods were implemented in an R package available on CRAN. Complete documentation on the use of this package has also been written.
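The sampling scheme singled out in this abstract, where each dyad is observed independently and with equal probability, can be illustrated with a small simulation. The following is only a minimal sketch in plain Python, not the thesis's R package; the function names `sample_sbm` and `observe_dyads` and the parameter `rho` are hypothetical and chosen here for illustration.

```python
import random


def sample_sbm(n, block_props, conn, rng):
    """Draw block labels and an undirected adjacency matrix from an SBM.

    block_props: block proportions; conn: K x K connection probabilities.
    """
    z = rng.choices(range(len(block_props)), weights=block_props, k=n)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            edge = 1 if rng.random() < conn[z[i]][z[j]] else 0
            adj[i][j] = adj[j][i] = edge
    return z, adj


def observe_dyads(adj, rho, rng):
    """Keep each dyad independently with probability rho; None marks missing.

    Missingness does not depend on the edge value, so this is a
    Missing At Random design. The diagonal is left as None (no self-loops).
    """
    n = len(adj)
    obs = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < rho:
                obs[i][j] = obs[j][i] = adj[i][j]
    return obs


rng = random.Random(0)
labels, network = sample_sbm(20, [0.5, 0.5], [[0.8, 0.1], [0.1, 0.8]], rng)
observed = observe_dyads(network, rho=0.5, rng=rng)
```

A Not Missing At Random design could be sketched the same way by letting the keep probability depend on `adj[i][j]`, which is what makes inference harder in that case.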
|
94 |
Assessing Internationalization of Higher Education Research: Mixed Methods Research Quality and Missing Data Reporting Practices
McKinley, Keanen January 2021
No description available.
|
95 |
An Approach to Estimation and Selection in Linear Mixed Models with Missing Data
Lee, Yi-Ching 07 August 2019
No description available.
|
96 |
Complications In Clinical Trials: Bayesian Models For Repeated Measures And Simulators For Nonadherence
Ahmad Hakeem Abdul Wahab (11186256) 28 July 2021
Clinical trials are the gold standard for inferring the causal effects of treatments or interventions. This thesis is concerned with the development of methodologies for two problems in modern clinical trials. The first is analyzing binary repeated measures in clinical trials using models that reflect the complicated autocorrelation patterns in the data, so as to obtain high power when inferring treatment effects. The second is simulating realistic outcomes and subject nonadherence mechanisms in Phase III pharmaceutical clinical trials under the Tripartite Framework.
Bayesian Models for Binary Repeated Data: The Bayesian General Logistic Autoregressive Model and the Polya-Gamma Logistic Autoregressive Model
Autoregressive processes in generalized linear mixed effects regression models are convenient for the analysis of clinical trials that have a moderate to large number of binary repeated measurements, collected across a fixed set of structured time points, for each subject. However, much of the existing literature and methods for autoregressive processes on repeated binary measurements permit only one order and only one autoregressive process in the model. This limits the flexibility of the resulting generalized linear mixed effects regression model to fully capture the dynamics in the data, which can result in decreased power for testing treatment effects. Nested autoregressive structures enable more holistic modeling of clinical trials that can lead to increased power for testing effects.
We introduce the Bayesian General Logistic Autoregressive Model (BGLAM) for the analysis of repeated binary measures in clinical trials. The BGLAM extends previous Bayesian models for binary repeated measures by accommodating flexible and nested autoregressive processes with non-informative priors. We describe methods for selecting the order of the autoregressive process in the BGLAM based on the Deviance Information Criterion (DIC) and marginal log-likelihood, and develop an importance sampling-weighted posterior predictive p-value to test for treatment effects in BGLAM. The frequentist properties of BGLAM compared to existing likelihood- and non-likelihood-based statistical models are evaluated by means of extensive simulation studies involving different data generation mechanisms.
Two features of BGLAM that can limit its application in practice are the computational effort involved in executing it and the inability to integrate added heterogeneity across time in its autoregressive processes. We develop the Polya-Gamma Logistic Autoregressive Model (PGLAM) to address these limiting features of the BGLAM. This new model enables the integration of additional layers of variability through random effects and heterogeneity across time in nested autoregressive processes. Furthermore, PGLAM is computationally more efficient than BGLAM because it eliminates the need for the complex types of samplers for truncated latent variables that are involved in the Markov Chain Monte Carlo algorithm for BGLAM.
Data Generating Model for Phase III Clinical Trials With Intercurrent Events
Although clinical trials are designed with strict controls, complications will inevitably arise during the course of the trials. One significant type of complication is missing subject outcomes due to subject drop-out or nonadherence during the trial, which are referred to in general as intercurrent events. This complication can arise from, among other causes, adverse reactions, lack of efficacy of the assigned treatment, administrative reasons, and excess efficacy from the assigned treatment. Intercurrent events typically confound causal inferences on the effects of the treatments under investigation because the resulting missingness corresponds to a Missing Not at Random missing data mechanism. The pharmaceutical industry is therefore increasingly focused on developing methods for obtaining valid causal inferences on the receipt of treatment in clinical trials with intercurrent events. However, it is extremely difficult to compare the frequentist properties and performance of these competing methods, as real-life clinical trial data cannot be easily accessed or shared, and as the different methods make distinct assumptions about the underlying data generating mechanism in the clinical trial. We develop a novel simulation model for clinical trials with intercurrent events. Our simulator operates under the Rubin Causal Model. We implement the simulator by means of an R Shiny application. This app enables users to control patient compliance through different sources of discontinuity with varying functional trends, and to understand the frequentist properties of treatment effect estimators obtained by different models for various estimands.
|
97 |
New Technique for Imputing Missing Item Responses for an Ordinal Variable: Using Tennessee Youth Risk Behavior Survey as an Example
Ahmed, Andaleeb Abrar 15 December 2007 (PDF)
Surveys ordinarily ask questions on an ordinal scale and often result in missing data. We suggest a regression-based technique for imputing missing ordinal data. A multilevel cumulative logit model was used, with the assumption that observed responses to certain key variables can serve as covariates in predicting missing item responses of an ordinal variable. Individual predicted probabilities at each response level were obtained. The average individual predicted probabilities for each response level were used to randomly impute the missing responses using a uniform distribution. Finally, the likelihood ratio chi-square statistic was used to compare the imputed and observed distributions. Two other forms of multiple imputation algorithms were performed for comparison. The performance of our imputation technique was comparable to that of the two established algorithms. Our method is simpler, does not involve any complex algorithms, and with further research can potentially be used as an imputation technique for missing ordinal variables.
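The random imputation step described above, drawing a response level from averaged predicted probabilities by means of a uniform random number, can be sketched as follows. This is an illustrative reading of the abstract, not the author's code: it assumes the predicted probabilities have already been produced by some fitted model, and the helper names `average_level_probs` and `impute_ordinal` are hypothetical.

```python
import random
from itertools import accumulate


def average_level_probs(individual_probs):
    """Average per-respondent predicted probabilities, level by level."""
    n = len(individual_probs)
    return [sum(col) / n for col in zip(*individual_probs)]


def impute_ordinal(level_probs, n_missing, rng=None):
    """Impute ordinal levels 1..K by comparing uniform draws with the
    cumulative distribution implied by level_probs."""
    rng = rng or random.Random()
    total = sum(level_probs)
    cumulative = list(accumulate(p / total for p in level_probs))
    imputed = []
    for _ in range(n_missing):
        u = rng.random()  # uniform draw on [0, 1)
        # Pick the first level whose cumulative probability exceeds u;
        # fall back to the top level to guard against rounding error.
        level = next(
            (k for k, c in enumerate(cumulative, start=1) if u < c),
            len(cumulative),
        )
        imputed.append(level)
    return imputed


# Hypothetical predicted probabilities for three respondents, four levels.
predicted = [
    [0.10, 0.20, 0.30, 0.40],
    [0.25, 0.25, 0.25, 0.25],
    [0.05, 0.15, 0.40, 0.40],
]
probs = average_level_probs(predicted)
values = impute_ordinal(probs, n_missing=10, rng=random.Random(42))
```

Passing a seeded `random.Random` makes the imputations reproducible, which helps when the imputed and observed distributions are later compared.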
|
98 |
Autologous Stem Cell Transplant: Factors Predicting the Yield of CD34+ Cells
Lawson, Elizabeth Anne 02 December 2005 (PDF)
Stem cell transplant is often considered the last hope for survival for many cancer patients. The CD34+ cell content of a collection of stem cells has emerged as the most reliable indicator of the quantity of desired cells in a peripheral blood stem cell harvest and is used as a surrogate measure of sample quality. Factors predicting the yield of CD34+ cells in a collection are not yet fully understood. Throughout the literature, there has been conflicting evidence with regard to age, gender, disease status, and prior radiation. In addition to the factors that have already been explored, we are interested in finding a cancer-chemotherapy interaction and in developing a predictive model to better identify which patients will be good candidates for this procedure. Because the amount of CD34+ cells is highly skewed, most traditional statistical methods are inappropriate without some transformation. A Bayesian generalized regression model was used to explain the variation in CD34+ cells collected from the sample by the cancer-chemotherapy interaction. Missing data were modeled as unknown parameters so that the entire data set could be included in the analysis. Posterior estimates were obtained using Markov chain methods. Posterior distributions identified weight and gender, as well as some cancer-chemotherapy interactions, as significant factors. Posterior predictive distributions can be used to identify which patients are good candidates for this procedure.
|
99 |
TOWARD ROBUST AND INTERPRETABLE GRAPH AND IMAGE REPRESENTATION LEARNING
Juan Shu (14816524) 27 April 2023
Although deep learning models continue to gain momentum, their robustness and interpretability have always been a major concern because of the complexity of such models. In this dissertation, we study several topics on the robustness and interpretability of convolutional neural networks (CNNs) and graph neural networks (GNNs). First, we identify the structural problem of deep convolutional neural networks that leads to adversarial examples and define DNN uncertainty regions. We also argue that the generalization error, the large-sample theoretical guarantee established for DNNs, cannot adequately capture the phenomenon of adversarial examples. Second, we study dropout in GNNs, which is an effective regularization approach for preventing overfitting. Unlike CNNs, GNNs usually have a shallow structure because deep GNNs normally suffer performance degradation. We study different dropout schemes and establish a connection between dropout and over-smoothing in GNNs. We therefore develop layer-wise compensation dropout, which allows GNNs to go deeper without suffering performance degradation. We also develop a heteroscedastic dropout that effectively deals with a large number of missing node features due to heavy experimental noise or privacy issues. Lastly, we study the interpretability of graph neural networks. We develop a self-interpretable GNN structure that denoises useless edges or features, leading to a more efficient message-passing process. GNN prediction and explanation accuracy were boosted compared with baseline models.
|
100 |
Is It More Advantageous to Administer LibQUAL+® Lite Over LibQUAL+®? An Analysis of Confidence Intervals, Root Mean Square Errors, and Bias
Ponce, Hector F. 08 1900
The Association of Research Libraries (ARL) provides an option for librarians to administer a combination of LibQUAL+® and LibQUAL+® Lite to measure users' perceptions of library service quality. LibQUAL+® Lite is a shorter version of LibQUAL+® that uses planned missing data in its design. The present study investigates the loss of information in commonly administered proportions of LibQUAL+® and LibQUAL+® Lite when compared to administering LibQUAL+® alone. Data from previous administrations of the LibQUAL+® protocol (2005, N = 525; 2007, N = 3,261; and 2009, N = 2,103) were used to create simulated datasets representing various proportions of LibQUAL+® versus LibQUAL+® Lite administration (0.2:0.8, 0.4:0.6, 0.5:0.5, 0.6:0.4, and 0.8:0.2). Statistics (i.e., means, adequacy and superiority gaps, standard deviations, Pearson product-moment correlation coefficients, and polychoric correlation coefficients) from simulated and real data were compared. Confidence intervals captured the original values. Root mean square errors and absolute and relative biases of correlations showed that the accuracy of the estimates decreased as the percentage of planned missing data increased. The recommendation is to avoid using combinations with more than 20% planned missing data.
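The accuracy measures named in this abstract (root mean square error, absolute bias, and relative bias) can be computed from a set of simulated estimates and a reference value as in the sketch below. It uses common textbook definitions, which are an assumption here since the abstract does not spell out its formulas; the function name `accuracy_metrics` and the example numbers are hypothetical.

```python
import math


def accuracy_metrics(estimates, true_value):
    """Summarize how far simulated estimates fall from a reference value.

    Assumed definitions (not taken from the study itself):
      absolute bias = mean(estimates) - true_value
      relative bias = absolute bias / true_value
      RMSE          = sqrt(mean((estimate - true_value)^2))
    """
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    absolute_bias = mean_estimate - true_value
    relative_bias = absolute_bias / true_value
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return {
        "absolute_bias": absolute_bias,
        "relative_bias": relative_bias,
        "rmse": rmse,
    }


# Hypothetical correlations estimated from simulated datasets,
# compared with the correlation in the complete data.
simulated_correlations = [0.48, 0.52, 0.55, 0.45]
metrics = accuracy_metrics(simulated_correlations, true_value=0.50)
```

Running this once per administration proportion would give one row of metrics per design, which is the kind of comparison the abstract summarizes.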
|