41

New matching algorithm -- Outlier First Matching (OFM) and its performance on Propensity Score Analysis (PSA) under new Stepwise Matching Framework (SMF)

Sun, Yi 14 October 2014
An observational study is an empirical investigation of a treatment effect when randomized experimentation is not ethical or feasible (Rosenbaum 2009). Observational studies are common for the following reasons: a) randomization is not feasible for ethical or financial reasons; b) data are collected from surveys or other sources for which the objective and design of the study were not determined in advance (e.g. a retrospective study using administrative records); c) little is known about the area under study, so preliminary studies of observational data are conducted to formulate hypotheses to be tested in subsequent experiments. When statistical analyses are based on observational studies, the following issues need to be considered: a) the lack of randomization may lead to selection bias; b) the sample may not be representative with respect to the problem under consideration (e.g. a study of factors influencing a rare disease that uses a survey nationally representative with respect to race, income, and gender but not with respect to the rare disease condition). We will use the following example to illustrate the challenges of observational studies and possible mitigation measures.
Our example is based on the study by Lalonde (1986), which evaluated the impact of job training on the earnings of low-skilled workers in the 1970s (Paper 1, section 1.5.2, discusses this data set in more detail). The treatment effect estimated from the observational study was quite different from the one obtained from the baseline randomized "National Supported Work (NSW) Experiment" carried out in the mid-1970s. Here the treatment effect is the impact of job training. Selection bias may contaminate the estimated treatment effect; in other words, workers who receive job training may be fundamentally different from those who do not. Furthermore, the control group selected for the observational study by Lalonde may not represent the control group from the original NSW experiment.
In this study, we address the lack of randomization by applying a new matching algorithm, Outlier First Matching (OFM), which can be used in conjunction with Propensity Score Analysis (PSA) or similar methods to obtain convincing treatment effect estimates in observational studies.
This dissertation consists of three papers.
Paper 1 proposes a new "Stepwise Matching Framework (SMF)" and justifies its use in causal inference studies (especially PSA studies using observational data). Under the SMF, a new matching algorithm (Outlier First Matching, or OFM for short) is introduced. Its performance, along with that of other well-known matching algorithms, is studied using cross-sectional data.
Paper 2 extends the methods of Paper 1 to correlated data (especially longitudinal data). With correlated data, in addition to the selection bias present in cross-sectional observational data, the repeated measures introduce between-subject and within-subject correlation. Repeated measures can also introduce missing values and rolling enrollment. All of these challenges complicate the data structure and need to be addressed with more complex models and methodology. Our methodology calculates the time-varying p-score of each control subject at each time point and computes the p-score difference between each control subject and every treatment subject at the treatment subject's time point. These p-score differences are then summarized to create the distance matrix for the next step of the analysis. Once again, the performance of OFM and other well-established matching algorithms is compared side by side, and conclusions are drawn from simulation and real data applications.
Paper 3 handles the missing value problem in longitudinal data. As mentioned in Paper 2, the complex structure of longitudinal data often comes with missing data. Traditional imputation methods tend to ignore between-subject and within-subject correlation, which may lead to biased or inefficient imputation of missing data. We adopt the missing value imputation strategy introduced by Schafer and Yucel (2002), implemented in the R package "pan", to handle these two correlations. The "imputed complete data" are then analyzed with a methodology similar to that of Paper 2, and the multiple imputation (MI) results are combined using Rubin's rules (1987). Conclusions are drawn from a simulation study and compared with the findings for complete longitudinal data in Paper 2.
In the last section, we conclude the dissertation with a discussion of preliminary results and of the strengths and limitations of the present research. We also point out directions for future study and provide suggestions for practice.
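The pooling step mentioned for Paper 3 can be illustrated with a minimal sketch of Rubin's rules for combining estimates across imputed data sets. This is not the dissertation's code: the function name `pool_rubin` and the input numbers are hypothetical, and only the point estimate and total variance are computed.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation results with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # total variance
    return q_bar, t


# Hypothetical treatment-effect estimates from m = 3 imputed data sets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what distinguishes the pooled variance from a simple average of the per-imputation variances: it carries the extra uncertainty due to the missing values.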
42

A recursive Polya tree mixture model: computationally efficient Bayesian nonparametric modelling

Li, Shujie January 2014
This thesis describes a flexible and computationally efficient Bayesian nonparametric modelling approach based on a recursive Polya tree mixture model. This approach is motivated by the need to capture the heterogeneity observed in many areas of biostatistics such as meta-analysis of clinical trials, survival analysis and recurrent event data analysis. Let $Y_1, \ldots, Y_N$ be mutually independent observations such that $Y_i$ has distribution $h(\cdot \mid \theta_i)$. It is assumed that the parameters $\theta_1, \ldots, \theta_N$ arise from an unknown distribution $F$ and that the prior on $F$ is a Polya tree distribution. An empirical Bayesian approach is adopted for the choice of the prior's base distribution. As the parameters $\theta_1, \ldots, \theta_N$ are latent, a data augmentation algorithm is used to simulate pseudo values iteratively. The empirical distribution of these pseudo values can then guide the choice of the base distribution of the Polya tree prior. The theoretical properties of this procedure are explored. Despite its simplicity, the proposed model is practical and computationally efficient. In addition to providing a good approximation for more complicated Bayesian nonparametric models, it can be used to handle difficult problems in classical Bayesian nonparametric modelling. In this thesis, the use of the model is illustrated using the famous data of Brown (2008) and Liu (1996), which are often viewed as test cases for Bayesian nonparametric modelling. It is also shown that the proposed approach can be applied to density estimation (including in the bivariate case) and meta-analysis in biostatistics. Moreover, a Bayesian semi-parametric accelerated failure time (AFT) model based on the proposed approach is considered, and an extension of the AFT model to recurrent event data analysis is introduced.
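The model set out in the abstract can be written compactly as a hierarchy. The display below only restates that setup; the symbols $\Pi$ (a sequence of nested partitions), $\mathcal{A}$ (the Beta parameters of the tree) and $G_0$ (the base distribution) are conventional Polya tree notation assumed here and are not taken from the thesis.

```latex
\begin{align*}
Y_i \mid \theta_i &\overset{\text{ind}}{\sim} h(\cdot \mid \theta_i), \qquad i = 1, \ldots, N, \\
\theta_1, \ldots, \theta_N \mid F &\overset{\text{iid}}{\sim} F, \\
F &\sim \mathrm{PT}(\Pi, \mathcal{A}), \qquad \text{centred on a base distribution } G_0 .
\end{align*}
```

In this notation, the empirical Bayesian step described in the abstract amounts to choosing $G_0$ from the simulated pseudo values of the $\theta_i$.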
43

Causal effects in randomized trials in the presence of partial compliance: breastfeeding on infant growth

Guo, Tong January 2009
There has been considerable growth in the statistics literature on methods for estimating causal effects from randomized controlled trials in which non-compliance occurs. However, the focus has been limited to all-or-none compliance. This thesis develops new methodology to estimate causal effects in a randomized trial setting in which non-compliance is better classified as "full-partial-none" compliance and where subjects in both the experimental and control arms could receive the experimental treatment to varying degrees regardless of treatment assignment. The new approach is based on principal stratification theory. We define compliance stratification effects as a special case of principal stratification and use dual propensity scores (propensity scores estimated under both possible treatment assignments) to estimate compliance principal effects. We demonstrate that dual propensity scores have many of the attractive properties of the ordinary propensity score and that compliance stratification effects become estimable by adjusting for the estimated dual propensity scores using stratification, matching or regression. We apply our methodology to a breastfeeding promotion intervention trial and assess the causal effects of prolonged and exclusive breastfeeding on infant growth (weight or length) at one year of age.
44

Comparison of small n statistical tests of differential expression applied to oligonucleotide arrays

Murie, Carl. January 2005
DNA microarrays provide data on genome-wide patterns of expression between varying conditions. Microarray studies often have small sample sizes, however, due to cost constraints or specimen availability. This can lead to poor random error estimates and inaccurate statistical tests of differential expression. We compare the performance of the standard t-test, simple fold change, and three small n statistical test methods designed to circumvent these problems, by applying them to simulated and experimental microarray data. The Empirical Bayes t-statistic was the most robust and effective method across simulated and experimental data. Overall, the Empirical Bayes methodology provided the best balance between specificity and sensitivity in detecting differential expression.
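The difference between the statistics compared in this abstract can be sketched for a single gene. The code below is an illustration under stated assumptions, not the thesis's implementation: expression values are taken to be on the raw (not log) scale, and the shrinkage step only follows the general idea of a moderated t-statistic. The values `prior_var` and `prior_df` are hypothetical; in practice they would be estimated from all genes on the array.

```python
from math import log2, sqrt
from statistics import mean, variance


def fold_change_log2(a, b):
    """Log2 ratio of mean expression in condition a over condition b."""
    return log2(mean(a) / mean(b))


def pooled_t(a, b, prior_var=None, prior_df=0.0):
    """Two-sample t-statistic with pooled variance.

    If prior_var is given, the pooled variance is shrunk toward it with
    weight prior_df, in the spirit of a moderated (Empirical Bayes) t.
    """
    df = len(a) + len(b) - 2
    s2 = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    if prior_var is not None:
        s2 = (prior_df * prior_var + df * s2) / (prior_df + df)
    se = sqrt(s2 * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / se


# Hypothetical intensities for one gene, three arrays per condition.
treated = [10.0, 12.0, 14.0]
control = [4.0, 6.0, 8.0]

lfc = fold_change_log2(treated, control)
t_plain = pooled_t(treated, control)
t_mod = pooled_t(treated, control, prior_var=1.0, prior_df=4.0)
```

With only three arrays per condition the per-gene variance estimate is unstable, which is why borrowing information through a prior variance changes the statistic noticeably.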
45

Bayesian estimation of diagnostic test parameters in the presence of verification bias

Lu, Ying, 1968- January 2006
The statistical evaluation of diagnostic tests may be affected by several potential biases. These biases include those caused by a study design in which only a non-representative sub-sample is further verified by the reference test (verification bias), and those caused by the absence of a definitive diagnostic test (gold standard test) for many diseases and conditions. In practice, an imperfect reference test is often assumed to be a perfect gold standard, potentially resulting in a large bias. Both Bayesian and frequentist methods have been proposed to adjust for each of these biases independently. To our knowledge, there is no Bayesian solution that adjusts for both of these biases simultaneously. The objective of this thesis is to present a Bayesian method for the evaluation of diagnostic tests when both of these potential biases may be operating simultaneously. We develop a likelihood function that models both sources of bias, and suggest convenient prior distributions that simplify deriving posterior distributions. The models are based on dichotomous test results and the parameters of interest are estimated using a Gibbs sampler. Using both simulated and real data examples, we demonstrate that the method presented here can correct the verification bias even when a perfect gold standard test does not exist.
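To see why verification bias matters, a small numerical sketch may help. It is not the Bayesian method of the thesis: it is a simple frequentist correction that assumes verification depends only on the test result and that the reference test is perfect, which is exactly the second assumption the thesis relaxes. All function names and counts are hypothetical.

```python
def naive_sensitivity(v_pos_d, v_neg_d):
    """Sensitivity computed from verified diseased subjects only."""
    return v_pos_d / (v_pos_d + v_neg_d)


def corrected_sensitivity(n_pos, n_neg, v_pos, v_pos_d, v_neg, v_neg_d):
    """Sensitivity corrected for partial verification.

    n_pos, n_neg: all subjects testing positive / negative
    v_pos, v_neg: how many of them were verified by the reference test
    v_pos_d, v_neg_d: verified subjects found to be diseased
    Assumes verification depends only on the test result.
    """
    d_pos = n_pos * v_pos_d / v_pos   # expected diseased among all test positives
    d_neg = n_neg * v_neg_d / v_neg   # expected diseased among all test negatives
    return d_pos / (d_pos + d_neg)


# Hypothetical study: 200 test positives, all verified, 160 diseased;
# 800 test negatives, only 80 verified, 8 diseased.
naive = naive_sensitivity(v_pos_d=160, v_neg_d=8)
corrected = corrected_sensitivity(
    n_pos=200, n_neg=800, v_pos=200, v_pos_d=160, v_neg=80, v_neg_d=8
)
```

Because test negatives are verified far less often in this made-up study, the naive figure overstates sensitivity relative to the corrected one.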
46

Population, time and medication histories in research on NSAIDs

Van Staa, T. P. (Tjeerd-Pieter) January 1991
This thesis addressed the value of prescription information in research on NSAIDs. Three related studies were conducted using virtually complete longitudinal information, drawn from 36 Dutch pharmacies (1988, 83,000 NSAID prescriptions). / Firstly, studies that use prescription information must assign a number of days of therapy to each prescription, traditionally a fixed period. If this window is made unduly long, then validity is compromised. Fixed windows could also confound comparisons between age-gender strata and drugs. / Secondly, analyses based on two assumptions for estimating the duration of use after dispensing showed substantive differences between NSAIDs in the distribution of patterns and histories of exposure and various other risk factors. / Thirdly, switching from one NSAID to another occurred rather frequently within a proxy duration of use, especially in people with a history of unstable usage patterns of NSAIDs. / In conclusion, this thesis demonstrated the importance of patterns and histories of exposure in pharmaco-epidemiology.
47

Seasonal variation in risk of Parkinson's disease

Postuma, Ronald B. January 2006
Parkinson's disease (PD) is a common neurodegenerative condition characterized by progressive motor, speech, swallowing, and gait difficulties. Risk factors for PD include male sex, pesticide exposure, head trauma, and rural living, but these account for only a small amount of the variation in risk. Recent studies suggest that for many neurological diseases, people born at a certain time of year are at higher risk of developing the disease. Small-scale studies have also suggested that persons born in the spring may be at higher risk of developing PD late in life. We examined the birth dates of 8168 PD patients collected from subspecialty movement disorder clinics across Canada. Patterns of seasonality in births or clusters of birth dates were examined and compared with the general Canadian population (from the 2001 census). We found no evidence of seasonal variation in PD incidence by birth date or clustering of birth dates in PD patients.
48

Non-linear effects and clustering in estimation of propensity scores

Mahmud, Mamun. January 2006
In observational studies, propensity scores are used to reduce bias and to increase the precision of treatment effect estimates. Previous studies of propensity score methodology, and its applications in epidemiological studies, have consistently relied on conventional multiple logistic regression, which assumes independent outcomes. Yet in most epidemiologic studies, several patients are likely to have been treated by the same physician, so that treatments prescribed to patients of the same physician are correlated. In such situations, one may expect an important violation of the independence assumption, and failure to account for such correlations will lead to incorrect statistical inferences. Another limitation is that conventional logistic regression assumes that the relationship between a continuous covariate and the logit of the outcome is linear, which may be incorrect in many applications. All these problems may affect the estimation of propensity scores, influencing the selection of covariates to be included in the model and the estimated effects of these covariates. / In this thesis, I investigated these two issues to evaluate the potential benefits of using more refined methods, such as Generalized Additive Models (GAM), Generalized Estimating Equations (GEE) and Generalized Linear Mixed Models (GLMM), at the stage of estimating the propensity scores. First, I analyzed an empirical database and then ran a small-scale simulation. / GAM modeling revealed a statistically significant non-linear effect of age and Charlson Comorbidity Index in predicting the choice of benzodiazepine (Lorazepam vs. Oxazepam or Lorazepam vs. Flurazepam). However, these results did not have any impact on the estimates of the propensity score, as a very high correlation (>0.99) was detected between the propensity scores from different models. 
/ In a simulation study, I investigated the clustering issue in particular. The results show that, in contrast to the conventional logistic model, both GEE and GLMM models account for variance inflation due to clustering and therefore yield reasonably accurate coverage rates. However, as in the empirical analysis, propensity scores from different models were almost perfectly correlated (>0.99), even though these estimates were obtained under different assumptions and with different methods. / Overall, my results confirm that non-linearity and, to a lesser extent, clustering require more advanced statistical methods to be applied in epidemiological studies. However, both the empirical results and a small simulation suggest that the failure of most previous pharmaco-epidemiological studies to account for such methodological complexities likely had only a minimal impact on the accuracy of the estimated propensity scores.
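The variance inflation due to clustering mentioned above can be made concrete with the usual design-effect approximation, 1 + (m - 1) * rho, for clusters of equal size m and intracluster correlation rho. This is a textbook simplification offered as a rough guide, not the GEE or GLMM machinery used in the thesis; the function names and numbers are hypothetical.

```python
from math import sqrt


def design_effect(cluster_size, icc):
    """Approximate variance inflation for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc


def inflated_se(naive_se, cluster_size, icc):
    """Standard error after allowing for within-cluster correlation."""
    return naive_se * sqrt(design_effect(cluster_size, icc))


# Hypothetical setting: 20 patients per physician, intracluster correlation 0.05.
deff = design_effect(20, 0.05)
se_adjusted = inflated_se(0.10, 20, 0.05)
```

Even a small correlation between patients of the same physician nearly doubles the variance in this made-up setting, which is why a model that ignores clustering tends to produce confidence intervals that are too narrow.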
49

Risk time-window specification and its impact on the assessment of medication-related adverse events

Cournoyer, Daniel. January 2006
Post-marketing studies using medical administrative databases are often conducted to assess medication-related adverse events (AEs). The determination of the risk time-window, defined as the period of time during which a medication-related AE could occur, is a crucial and challenging step toward the correct assessment of these AEs. In general, the unknown risk time-window consists of the number of days supplied, TS, for the medication and of a time-window, TW, that starts after TS and during which the medication could still produce AEs. Arguments have been made in favor of both short and long TW durations. The aim of this thesis is to determine the impact of varying TW values on the assessment of the rate of cardiovascular AEs, using simulated data and a real-life example based on administrative databases from Quebec. Results indicated that longer TW values tended to bias results toward the null.
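The definition of the risk time-window as TS followed by TW can be sketched in a few lines. The code is an illustration only: the function names, the dates and the half-open interval convention (start day included, end day excluded) are assumptions, not details from the thesis.

```python
from datetime import date, timedelta


def risk_window(dispensed, days_supplied, tw_days):
    """Return (start, end) of the risk window: TS supply days plus TW extra days."""
    end = dispensed + timedelta(days=days_supplied + tw_days)
    return dispensed, end


def event_in_window(event, dispensed, days_supplied, tw_days):
    """True if the event date falls in the half-open window [start, end)."""
    start, end = risk_window(dispensed, days_supplied, tw_days)
    return start <= event < end


# Hypothetical prescription: dispensed 1 January 2006 with TS = 30 days,
# followed by an adverse event on 15 February 2006.
dispensed = date(2006, 1, 1)
event = date(2006, 2, 15)

short_tw = event_in_window(event, dispensed, days_supplied=30, tw_days=7)
long_tw = event_in_window(event, dispensed, days_supplied=30, tw_days=30)
```

The same event is attributed to the medication under the longer TW but not under the shorter one, so the choice of TW directly changes which events and how much person-time count as exposed.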
50

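Use of multiple correspondence analysis and hierarchical clustering to model the prognostic value of markers present in patients with recent-onset polyarthritis [Utilisation de l'analyse des correspondances multiples et de la classification hierarchique pour modeliser la valeur pronostique des marqueurs presents chez les patients avec polyarthrite d'installation recente].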

Carrier, Nathalie. Unknown Date
Thesis (M.Sc.), Université de Sherbrooke (Canada), 2008. / Title from title screen (viewed 1 February 2007). In ProQuest Dissertations and Theses. Also published in print.
