Global ETD Search

1	Informative censoring with an imprecise anchor event: estimation of change over time and implications for longitudinal data analysis Collins, Jamie Elizabeth 22 January 2016 A number of methods have been developed to analyze longitudinal data with dropout. However, there is no uniformly accepted approach. Model performance, in terms of the bias and accuracy of the estimator, depends on the underlying missing data mechanism, and it is unclear how existing methods will perform when little is known about that mechanism. Here we evaluate methods for estimating change over time in longitudinal studies with informative dropout in three settings: using a linear mixed effect (LME) estimator in the presence of multiple types of dropout; proposing an update to the pattern mixture modeling (PMM) approach in the presence of imprecision in identifying informative dropouts; and utilizing this new approach in the presence of prognostic factor by dropout interaction. We demonstrate that the amount of dropout, the proportion of dropout that is informative, and the variability in outcome all affect the performance of an LME estimator in data with a mixture of informative and non-informative dropout. When the amount of dropout is moderate to large (>20% overall) the potential for relative bias greater than 10% increases, especially with large variability in the outcome measure, even under scenarios where only a portion of the dropouts are informative. Under conditions where LME models do not perform well, it is necessary to take the missing data mechanism into account. We develop a method that extends the PMM approach to account for uncertainty in identifying informative dropouts. In scenarios with this uncertainty, the proposed method outperformed the traditional method in terms of bias and coverage. 
In the presence of interaction between dropout and a prognostic factor, the LME model performed poorly, in terms of bias and coverage, in estimating prognostic factor-specific slopes and the interaction between the prognostic factor and time. The update to the PMM approach, proposed here, outperformed both the LME and traditional PMM. Our work suggests that investigators must be cautious with any analysis of data with informative dropout. We found that particular attention must be paid to the model assumptions when the missing data mechanism is not well understood. Biostatistics Informative censoring Longitudinal study Missing data Pattern mixture model
2	Statistical Approaches for Handling Missing Data in Cluster Randomized Trials Fiero, Mallorie H. January 2016 In cluster randomized trials (CRTs), groups of participants are randomized as opposed to individual participants. This design is often chosen to minimize treatment arm contamination or to enhance compliance among participants. In CRTs, we cannot assume independence among individuals within the same cluster because of their similarity, which leads to decreased statistical power compared to individually randomized trials. The intracluster correlation coefficient (ICC) is crucial in the design and analysis of CRTs, and measures the proportion of total variance due to clustering. Missing data are a common problem in CRTs and should be accommodated with appropriate statistical techniques because they can compromise the advantages created by randomization and are a potential source of bias. In three papers, I investigate statistical approaches for handling missing data in CRTs. In the first paper, I carry out a systematic review evaluating current practice of handling missing data in CRTs. The results show high rates of missing data in the majority of CRTs, yet handling of missing data remains suboptimal. Fourteen (16%) of the 86 reviewed trials reported carrying out a sensitivity analysis for missing data. Despite suggestions to weaken the missing data assumption from the primary analysis, only five of the trials weakened the assumption. None of the trials reported using missing not at random (MNAR) models. Due to the low proportion of CRTs reporting an appropriate sensitivity analysis for missing data, the second paper aims to facilitate performing a sensitivity analysis for missing data in CRTs by extending the pattern mixture approach for missing clustered data under the MNAR assumption. 
I implement multilevel multiple imputation (MI) in order to account for the hierarchical structure found in CRTs, and multiply the imputed values by a sensitivity parameter, k, to examine parameters of interest under different missing data assumptions. The simulation results show that estimates of parameters of interest in CRTs can vary widely under different missing data assumptions. A high proportion of missing data can occur among CRTs because missing data can be found at the individual level as well as the cluster level. In the third paper, I use a simulation study to compare missing data strategies to handle missing cluster level covariates, including the linear mixed effects model, single imputation, single level MI ignoring clustering, MI incorporating clusters as fixed effects, and MI at the cluster level using aggregated data. The results show that when the ICC is small (ICC ≤ 0.1) and the proportion of missing data is low (≤ 25%), the mixed model generates unbiased estimates of regression coefficients and ICC. When the ICC is higher (ICC > 0.1), MI at the cluster level using aggregated data performs well for missing cluster level covariates, though caution should be taken if the percentage of missing data is high. Dropout Missing data Multiple imputation Pattern mixture model Sensitivity analysis Biostatistics Cluster randomized trials
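The idea of multiplying imputed values by a sensitivity parameter k, as described in entry 2, can be illustrated with a minimal sketch. This is not the thesis's implementation: it ignores clustering entirely and uses simple normal draws as a stand-in for multilevel MI. The function name `sensitivity_scaled_means` and all of its parameters are hypothetical, chosen only for illustration.

```python
import random
import statistics


def sensitivity_scaled_means(observed, n_missing, k_values,
                             n_imputations=20, seed=0):
    """Pooled outcome mean when MAR-style imputations are multiplied by k.

    observed   : list of observed outcome values
    n_missing  : number of missing outcomes to impute
    k_values   : sensitivity multipliers; k = 1 leaves imputations unchanged
    """
    rng = random.Random(seed)
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)

    # Assumption: imputations are plain normal draws around the observed
    # mean, a simplified stand-in for a multilevel imputation model.
    # The same draws are reused for every k so that differences between
    # results reflect only the sensitivity parameter.
    draws = [[rng.gauss(mu, sd) for _ in range(n_missing)]
             for _ in range(n_imputations)]

    results = {}
    for k in k_values:
        estimates = []
        for imputed in draws:
            # Scale only the imputed values, then analyse the completed data.
            completed = observed + [k * v for v in imputed]
            estimates.append(statistics.mean(completed))
        # Pooled point estimate: average over the imputed data sets.
        results[k] = statistics.mean(estimates)
    return results


if __name__ == "__main__":
    observed = [10, 12, 11, 9, 13, 10, 12, 11]
    for k, est in sensitivity_scaled_means(observed, 4, [0.8, 1.0, 1.2]).items():
        print(f"k = {k}: pooled mean = {est:.3f}")
```

Reporting the estimate across a grid of k values, rather than at a single value, is what turns this into a sensitivity analysis: k = 1 corresponds to the unmodified imputations, and values away from 1 represent increasingly strong departures from that assumption.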
3	Examining Random-Coefficient Pattern-Mixture Models for Longitudinal Data with Informative Dropout Bishop, Brenden 07 December 2017 No description available. Psychology Pattern-Mixture Model Longitudinal Dropout Missing Data NMAR Nonignorable Missingness
4	Analysis of survey data in the presence of non-ignorable missing-data and selection mechanisms Hammon, Angelina 04 July 2023 This thesis deals with methods for the appropriate handling of non-ignorable missing data and sample selection, which are two common challenges of survey data analysis. Both issues can dramatically affect the quality of analysis results and lead to misleading inferences about the population. Therefore, in three research articles, I treat methods for performing so-called sensitivity analyses with regard to the missing-data and selection mechanisms that are usable with typical survey data. In the first and second articles, I provide novel procedures for the multiple imputation of binary and ordinal multilevel data that may be Missing Not at Random (MNAR). The methods' suitability to produce unbiased and efficient estimates was demonstrated in various simulation studies considering different data scenarios. Moreover, I show their applicability to empirical data. In the third article, I investigate a measure to quantify and adjust for non-ignorable selection bias in proportions estimated from non-probability data. In doing so, I provide the first application of the suggested index to a real non-probability sample outside its original research group. In addition, I derive general guidelines for its use in practice and validate the measure's performance in properly detecting selection bias. 
The three presented articles highlight the necessity of assessing the sensitivity of estimates to different assumptions about the missing-data and selection mechanisms if it seems realistic that the ignorability assumption might be violated, and provide first solutions to enable such robustness checks for specific data situations. Missing Not at Random Multiple imputation Fully conditional specification Multilevel data Selection model Selection Not at Random Selection bias Non-probability sample Pattern-mixture model Sensitivity analysis 300 Social sciences ddc:300 ddc:519
5	Regression modeling with missing outcomes: competing risks and longitudinal data / Contributions aux modèles de régression avec réponses manquantes : risques concurrents et données longitudinales Moreno Betancur, Margarita 05 December 2013 Missing data are a common occurrence in medical studies. In regression modeling, missing outcomes limit our capability to draw inferences about the covariate effects of medical interest, which are those describing the distribution of the entire set of planned outcomes. Beyond the loss of precision, the validity of any method used to draw inferences from the observed data requires that some assumption about the mechanism leading to missing outcomes holds. Rubin (1976, Biometrika, 63:581-592) called the missingness mechanism MAR (for “missing at random”) if the probability of an outcome being missing does not depend on missing outcomes when conditioning on the observed data, and MNAR (for “missing not at random”) otherwise. This distinction has important implications regarding the modeling requirements to draw valid inferences from the available data, but generally it is not possible to assess from these data whether the missingness mechanism is MAR or MNAR. Hence, sensitivity analyses should be routinely performed to assess the robustness of inferences to assumptions about the missingness mechanism. In the field of incomplete multivariate data, in which the outcomes are gathered in a vector for which some components may be missing, MAR methods are widely available and increasingly used, and several MNAR modeling strategies have also been proposed. On the other hand, although some sensitivity analysis methodology has been developed, this is still an active area of research. 
The first aim of this dissertation was to develop a sensitivity analysis approach for continuous longitudinal data with drop-outs, that is, continuous outcomes that are ordered in time and completely observed for each individual up to a certain time-point, at which the individual drops out so that all the subsequent outcomes are missing. The proposed approach consists of assessing the inferences obtained across a family of MNAR pattern-mixture models indexed by a so-called sensitivity parameter that quantifies the departure from MAR. The approach was prompted by a randomized clinical trial investigating the benefits of a treatment for sleep-maintenance insomnia, from which 22% of the individuals had dropped out before the study end. The second aim was to build on the existing theory for incomplete multivariate data to develop methods for competing risks data with missing causes of failure. The competing risks model is an extension of the standard survival analysis model in which failures from different causes are distinguished. Strategies for modeling competing risks functionals, such as the cause-specific hazards (CSH) and the cumulative incidence function (CIF), generally assume that the cause of failure is known for all patients, but this is not always the case. Some methods for regression with missing causes under the MAR assumption have already been proposed, especially for semi-parametric modeling of the CSH. But other useful models have received little attention, and MNAR modeling and sensitivity analysis approaches have never been considered in this setting. We propose a general framework for semi-parametric regression modeling of the CIF under MAR using inverse probability weighting and multiple imputation ideas. Also under MAR, we propose a direct likelihood approach for parametric regression modeling of the CSH and the CIF. Furthermore, we consider MNAR pattern-mixture models in the context of sensitivity analyses. 
In the competing risks literature, a starting point for methodological developments for handling missing causes was a stage II breast cancer randomized clinical trial in which 23% of the deceased women had a missing cause of death. We use these data to illustrate the practical value of the proposed approaches. Missing data Longitudinal data Competing risks Regression Missing outcomes Drop-out Missing cause of failure Multiple imputation Inverse probability weighting Direct likelihood Pattern-mixture model Sensitivity analysis Linear mixed model Cumulative incidence function Cause-specific hazard Pseudo-values
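Rubin's MAR/MNAR distinction quoted in entry 5 can be written compactly. The notation below is assumed for illustration and is not taken from the thesis: R is the missingness indicator (R = 1 for observed, R = 0 for dropped out), and Y_obs and Y_mis are the observed and missing outcomes. The last line shows one common form of a pattern-mixture sensitivity parameter Δ, given here only as a generic sketch; the thesis's own parameterization may differ.

```latex
\begin{align*}
\text{MAR:}  \quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid Y_{\mathrm{obs}}) \\
\text{MNAR:} \quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) \text{ depends on } Y_{\mathrm{mis}} \\
\text{Shift-type sensitivity model (assumed form):} \quad
  & E[Y_{\mathrm{mis}} \mid Y_{\mathrm{obs}}, R = 0]
    = E[Y_{\mathrm{mis}} \mid Y_{\mathrm{obs}}, R = 1] + \Delta
\end{align*}
```

In this assumed form, Δ = 0 corresponds to no difference between dropouts and completers given the observed data, and inferences are examined as Δ moves away from zero.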
