21

Population-averaged models for diagnostic accuracy studies and meta-analysis

Powers, James Murray 10 August 2013 (has links)
Modern medical decision making often involves one or more diagnostic tools (such as laboratory tests or radiographic images) that must be evaluated for their discriminatory ability to detect the presence (or absence) of a current health state. The first paper of this dissertation extends regression model diagnostics to the Receiver Operating Characteristic (ROC) curve generalized linear model (ROC-GLM) in the setting of individual-level data from a single study, through application of generalized estimating equations (GEE) within a correlated binary data framework (Alonzo and Pepe, 2002). Motivated by the need for model diagnostics for the ROC-GLM model (Krzanowski and Hand, 2009), GEE cluster-deletion diagnostics (Preisser and Qaqish, 1996) are applied in an example data set to identify cases that have undue influence on the model parameters describing the ROC curve. In addition, deletion diagnostics are applied at an earlier stage in the estimation of the ROC-GLM, when a linear model is chosen to represent the relationship between the test measurement and covariates in the control subjects. The second paper presents a new model for diagnostic test accuracy meta-analysis. The common analysis framework for the meta-analysis of diagnostic studies is the generalized linear mixed model, in particular the bivariate logistic-normal random effects model. Considering that such cluster-specific models are most appropriately used when the model for a given cluster (i.e., study) is of interest, a population-averaged (PA) model may be appropriate in diagnostic test meta-analysis settings where mean estimates of sensitivity and specificity are desired. A PA model for correlated binomial outcomes is estimated with GEE in the meta-analysis of two data sets and compared to an indirect method of estimating PA parameters based on transformations of bivariate random effects model parameters. The third paper presents an analysis guide for a new SAS macro, PAMETA (Population-averaged meta-analysis), for fitting population-averaged diagnostic accuracy models with GEE as described in the second paper. The impact of covariates, influential clusters and observations is investigated in the analysis of two example data sets.
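For readers who want a concrete picture of the population-averaged GEE approach summarized above, the following Python sketch (statsmodels) fits a marginal logistic model for test positivity with study as the cluster. The studies, counts and test properties are simulated for illustration only; this is not the thesis data and not the PAMETA macro.

```python
# A minimal sketch (not the PAMETA macro): a population-averaged logistic
# model for test positivity fitted with GEE, clustering on study.
# All studies, sample sizes and test properties below are simulated.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.special import expit

rng = np.random.default_rng(0)
rows = []
for study in range(10):
    se, sp = rng.uniform(0.7, 0.9), rng.uniform(0.8, 0.95)   # per-study accuracy
    for diseased, p, n in [(1, se, 50), (0, 1 - sp, 80)]:
        for y in rng.binomial(1, p, size=n):                 # one row per subject
            rows.append({"study": study, "diseased": diseased, "positive": y})
df = pd.DataFrame(rows)

# Marginal mean model: logit P(positive) = b0 + b1 * diseased.
model = sm.GEE.from_formula(
    "positive ~ diseased", groups="study", data=df,
    family=sm.families.Binomial(), cov_struct=sm.cov_struct.Exchangeable(),
)
res = model.fit()
b = res.params
print(res.summary())
print("PA sensitivity:   ", expit(b["Intercept"] + b["diseased"]))
print("PA 1-specificity: ", expit(b["Intercept"]))
```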
22

SURVIVORSHIP MODELS FOR SEXUALLY TRANSMISSIBLE DISEASES

SMITH, MELVYN LEE January 1980 (has links)
Motivated by the type of data naturally generated at Venereal Disease clinics, the author develops a general theory of survivorship analysis of sexually transmissible diseases (STD), with specific reference to estimating the survival distribution of incubation times and comparing groups with respect to incubation period distributions. Data pertaining to STD are often right-censored (known only to be greater than a certain value) or left-censored (known only to be less than a certain value), and samples are often truncated (potential sample points less than or greater than certain values are excluded from the sample, with no knowledge of how many values are excluded); hence a theory of censorship (observation censoring and sample truncation) is presented, permitting a logical classification of the statistical literature on censorship and greater insight into the relationship between event and censoring times. A generalized Wilcoxon test for k samples with double-censoring (mixed right- and left-censoring) and unequal censoring distributions is derived. This test may be used to compare survival experiences (distributions of incubation periods) in groups with double-censoring where censoring patterns may not be equal.

Survival techniques are applied to incubation periods of gonorrheal and Chlamydial urethral and eye infections. The parametric form of the survival curve is shown to relate to the effectiveness of the immunologic process in altering the course of the disease: (1) mucosal infections with no effective immunity (gonorrheal urethritis and newborn conjunctivitis; Chlamydial newborn conjunctivitis and pneumonitis) are lognormal; (2) mucosal infections with some effective immunity (Chlamydial urethritis and Chlamydial adult eye disease) tend to be Weibull; (3) non-mucosal infections (neonatal gonococcal arthritis) are not necessarily lognormal. A mathematical derivation of the lognormal distribution incorporating the notion of no effective immunologic response is given.

The symptomatic incubation period of male gonorrheal urethritis is shown to be lognormal. Further, it is shown that the mean, median, and geometric dispersion of this distribution have progressively increased since the introduction of sulfa in the 1930s and penicillin in the 1940s. It is proposed that this increase in the gonorrhea incubation period is the prime reason for failure to control the epidemic. Comprehensive reviews of the statistical censorship literature and of the gonorrhea-Chlamydia literature are included.
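The incubation-time analyses described above rest on maximum-likelihood fitting of a lognormal survival distribution under mixed (double) censoring. The sketch below shows one way this can be set up in Python with scipy; the incubation times and censoring pattern are simulated, not clinic data, and the parameterization is an assumption of the example.

```python
# A hedged sketch of maximum-likelihood lognormal fitting with double
# censoring (right- and left-censored incubation times). The data below are
# simulated, not clinic data.
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
true_mu, true_sigma = np.log(5.0), 0.6                 # median 5 days
t = rng.lognormal(true_mu, true_sigma, size=200)       # latent incubation times

# status: 0 = observed exactly, 1 = right-censored (> obs), 2 = left-censored (< obs)
status = rng.choice([0, 1, 2], size=200, p=[0.6, 0.25, 0.15])
obs = np.where(status == 1, t * rng.uniform(0.3, 1.0, 200),
      np.where(status == 2, t * rng.uniform(1.0, 3.0, 200), t))

def negloglik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    z = (np.log(obs) - mu) / sigma
    ll = np.where(status == 0, stats.norm.logpdf(z) - np.log(sigma * obs),  # density
         np.where(status == 1, stats.norm.logsf(z),                        # P(T > obs)
                               stats.norm.logcdf(z)))                      # P(T < obs)
    return -ll.sum()

fit = optimize.minimize(negloglik, x0=[np.log(np.median(obs)), 0.0])
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
print("estimated median incubation (days):", round(float(np.exp(mu_hat)), 2))
print("estimated geometric dispersion:    ", round(float(np.exp(sigma_hat)), 2))
```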
23

A STATISTICAL INVESTIGATION OF THE PULMONARY EFFECTS OF EXPOSURE TO ASBESTOS

ATKINSON, EDWARD NEELY January 1981 (has links)
In this study, data collected on a group of 1112 workers formerly employed in the production of insulation containing amosite asbestos at a plant in Tyler, Texas are examined and a subset of continuously employed workers with no other known exposure to asbestos is selected for detailed analysis. The response variables examined are scores for tests of pulmonary mechanics and the explanatory variables are age, height, weight, race, employment history and smoking behavior; the method of analysis is linear least squares regression. The numerical problems associated with the presence of strong correlations in the data matrix are considered, and a method for the choice of a subset regression which attempts to alleviate these problems is proposed. Equations for each lung function test are selected and examined in detail. Regression techniques are used to attempt to detect an interactive effect between smoking and asbestos exposure with respect to lung function; such an effect has been reported with respect to carcinoma of the lung. No interactive effect is detected. Finally, maximum likelihood estimation is used to identify a subset of workers who seem particularly susceptible to the effects of asbestos.
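The collinearity problems mentioned above are usually diagnosed before choosing a subset regression. The short Python sketch below (statsmodels) shows typical checks (condition number and variance inflation factors), followed by a least squares fit with a smoking-by-exposure interaction; all variable names and values are hypothetical stand-ins, not the Tyler cohort data.

```python
# A small sketch of collinearity checks before settling on a subset regression:
# condition number, variance inflation factors, then a least squares fit with a
# smoking-by-exposure interaction. Variable names and values are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
n = 300
age = rng.normal(45, 10, n)
height = rng.normal(175, 8, n)
weight = 0.9 * height + rng.normal(0, 10, n)            # strongly tied to height
years_exposed = rng.gamma(2, 4, n)
pack_years = rng.gamma(2, 6, n)
fvc = (6.0 - 0.02 * age - 0.03 * years_exposed - 0.01 * pack_years
       + rng.normal(0, 0.4, n))                         # simulated lung function score

X = pd.DataFrame({"age": age, "height": height, "weight": weight,
                  "years_exposed": years_exposed, "pack_years": pack_years})
Xc = sm.add_constant(X)
print("condition number:", round(np.linalg.cond(Xc.values), 1))
for i, col in enumerate(Xc.columns):
    if col != "const":
        print(col, "VIF:", round(variance_inflation_factor(Xc.values, i), 2))

# Probe for a smoking-by-asbestos interaction, as in the analysis described above.
Xc["smoke_x_exposure"] = X["pack_years"] * X["years_exposed"]
print(sm.OLS(fvc, Xc).fit().summary())
```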
24

Statistical analysis of a high-content screening assay of microtubule polymerization status

Lo, Ernest January 2010 (has links)
The present work describes the analysis of the first high-content, double-immunofluorescence assay of microtubule polymerization status. Two novel features of the work are the extraction of a new class of cell metrics that target fiber-based cell phenotypes (using the Fiberscore algorithm) in a high-content assay, and the development of a non-uniformity correction algorithm that allows the unbiased analysis of dim cells.

Findings relevant to HCS data analysis in general include: (1) spatial plate biases are significant in HCS data and manifest differently for different cell-level metrics; (2) individual plates are separate statistical entities, so cellular data in HCS cannot in general be pooled before proper normalization procedures are applied; (3) inter-plate variance is significant in HCS data, such that inter-plate replicates are a necessity; however, HCS data also appear to be amenable to empirical Bayes methods for improving sensitivity to 'hit' compounds; (4) cell populations respond heterogeneously to treatment compounds, although initial tests of an alternative cell-population summary statistic (the AUC), thought to be suited to the detection of cell subpopulations, did not indicate significant improvements in sensitivity over conventional measures such as the population median.

The correlation texture metric was identified as showing greatly increased sensitivity when used on the Tyr-tubulin-specific channel. The identification of this cell-level metric provides a preliminary demonstration that high-content assays have the potential to outperform conventional whole-well HTS assays. An image-processing issue related to image segmentation, termed the 'area-intensity confound', was also identified as a possible major source of variability that limited the performance of the alternative cell-level metrics that were developed. A resolution to this issue is proposed.

Many open questions and avenues of further investigation remain; the current study represents only a preliminary step in the ongoing analysis of the HCS microtubule polymerization status assay and in the development of pertinent statistical inference methods.
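Point (2) above, that plates must be normalized before pooling, can be illustrated with per-plate robust z-scores (median/MAD). The Python sketch below uses simulated plate intensities; the plate sizes and hit threshold are arbitrary assumptions of the example.

```python
# A minimal sketch of per-plate normalization before pooling: robust z-scores
# (median/MAD) computed within each plate. All intensities are simulated.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
plates, wells = 4, 384
df = pd.DataFrame({
    "plate": np.repeat(np.arange(plates), wells),
    # each plate gets its own offset and scale, mimicking inter-plate variance
    "intensity": np.concatenate([
        rng.normal(loc=100 + 20 * p, scale=5 + 2 * p, size=wells)
        for p in range(plates)
    ]),
})

def robust_z(x):
    med = x.median()
    mad = 1.4826 * (x - med).abs().median()   # scaled to match SD under normality
    return (x - med) / mad

df["z"] = df.groupby("plate")["intensity"].transform(robust_z)
# After per-plate normalization the plates are comparable and can be pooled;
# wells with |z| above an (arbitrary) threshold are candidate 'hits'.
print(df.groupby("plate")["z"].agg(["median", "std"]).round(2))
print("candidate hits at |z| > 3:", int((df["z"].abs() > 3).sum()))
```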
25

Estimating and modelling rates of evolution with applications to phylogenetics and codon selection

Bevan, Rachel Bronwen. January 2006 (has links)
This thesis addresses two problems with applications in evolution and phylogenetics: (i) estimating and accounting for evolutionary rate heterogeneity in a phylogenetic context (Chapters 2 and 3); (ii) detecting synonymous selection upon a set of codons (Chapter 4).

Chapter 2 presents a fast algorithm (DistR) to estimate gene/protein evolutionary rates from pairwise distances between taxa derived from gene/protein sequence data. Simulation studies indicate that this algorithm estimates rates accurately and is robust to missing data. Moreover, including the evolutionary rates estimated by the DistR algorithm as additional parameters in a phylogenetic model yields a significantly improved fit over the concatenated model, as measured by the Akaike Information Criterion (AIC).

However, allowing every gene/protein to have its own evolutionary rate - termed the n-parameter approach - is only one method of accounting for gene rate heterogeneity in phylogenetic inference. Under the alpha-parameter approach, a Gamma distribution is fit to the gene rates to account for rate heterogeneity, a method that is much slower than the n-parameter approach. Comparison of the n-parameter and alpha-parameter approaches (Chapter 3) indicates that the n-parameter method provides a better fit over the concatenated model than the alpha-parameter approach. Interestingly, improved model fit over the concatenated model is highly correlated with the presence of a gene that has a slow relative rate.

Chapter 4 addresses the question of detecting synonymous selection on sets of codons using parametric codon models. Parametric codon models are used to simulate data under the null hypothesis that there is no synonymous selection on a particular codon; codons that have unexpected synonymous usage in empirical data, compared to the null distribution, are classified as Highly Selected Codons (HSCs). Two data sets are analyzed to identify HSCs: nuclear genes of various Saccharomyces species that are well known to undergo translational selection, and mitochondrial genes of several Reclinomonas species that are highly A+T biased. Eleven Saccharomyces codons are determined to be under synonymous selection (HSCs), nine of which were previously identified as undergoing translational selection. Similarly, 10 Reclinomonas codons are identified as undergoing synonymous selection. Comparison to traditional nonparametric approaches shows that these methods do not identify any Reclinomonas codons as under synonymous selection, owing to the high A+T bias of the genes.
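The idea behind estimating relative gene rates from pairwise distances can be conveyed with a toy alternating least-squares fit in which each gene's distance matrix is modelled as a gene-specific rate times a shared divergence matrix. The Python sketch below is an illustration of that idea only, not the DistR algorithm; all matrices and rates are simulated.

```python
# A toy illustration (not the DistR algorithm): each gene's pairwise distance
# matrix is modelled as a gene-specific rate times a shared divergence matrix,
# and the rates are recovered by alternating least squares. Data are simulated.
import numpy as np

rng = np.random.default_rng(4)
n_taxa, n_genes = 6, 5
true_T = np.abs(rng.normal(0.3, 0.1, (n_taxa, n_taxa)))
true_T = (true_T + true_T.T) / 2
np.fill_diagonal(true_T, 0)
true_r = np.array([0.5, 1.0, 1.5, 2.0, 3.0])             # true relative rates
D = np.array([r * true_T * rng.uniform(0.9, 1.1, true_T.shape) for r in true_r])

r = np.ones(n_genes)
for _ in range(50):                                      # alternating least squares
    T = (D / r[:, None, None]).mean(axis=0)              # shared divergence estimate
    r = (D * T).sum(axis=(1, 2)) / (T * T).sum()         # per-gene rates, closed form
    r = r / r.mean()                                     # rates are only relative

print("estimated relative rates:", np.round(r, 2))
print("true relative rates:     ", np.round(true_r / true_r.mean(), 2))
```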
26

Instrumental variables in observational studies of drug effects with application to the treatment of atrial fibrillation

Ionescu-Ittu, Raluca Dacina January 2011 (has links)
Unmeasured confounding is one of the main challenges of database studies evaluating drug effects. The instrumental variable (IV) method can remove bias due to unmeasured confounding if certain conditions apply. In this thesis, I studied the comparative effectiveness of rhythm versus rate control therapy in reducing mortality in patients with atrial fibrillation (AF) and assessed the use of the IV methodology in this context.

In the 1st manuscript I conducted a database comparative effectiveness study of rhythm versus rate control treatment and mortality in patients with AF. The study population included 30,664 elderly patients hospitalized with an AF diagnosis who did not have AF-related drug prescriptions in the year prior to the admission, but received an AF prescription within 7 days of discharge. Using multivariable Cox regression, we found that mortality was similar in the two treatment groups over the first 3 years of follow-up but was 23% lower after 8 years of follow-up for those who initiated rhythm control therapy (HR 0.77, 0.68-0.86). This analysis was adjusted for many potential confounders, but its results may have been affected by residual bias due to unobserved individual-level confounders.

To address this limitation, the data were reanalyzed in the 2nd manuscript using 9 alternative provider prescribing preference IVs. Although all 9 IVs met the conventional criteria for IV strength and reduced the covariate imbalance between the treatment groups, there were large variations in the IV-based point estimates and confidence intervals. We show that this variation is correlated with IV strength and discuss how the estimates based on the weakest alternative IV may substantially amplify even a very small residual bias due to unobserved provider-level confounders.

In the 3rd manuscript I used simulations that varied the strength of the correlation between the treatment and a provider-based IV to assess how, in the presence of unobserved confounding, the performance of the IV estimates is affected by the strength of the IV. The results indicated that the IV estimates were less biased than conventional regression estimates but had greater variance, and that the bias/variance trade-off between the conventional and IV analyses depends critically on the strength of the IV.

Because IV analyses of binary outcomes using 2-stage least squares regression rely on a linearity assumption, in the 4th manuscript I evaluated in simulations 2 criteria to assess model misspecification. Results suggested that both criteria are likely to identify model misspecification in many realistic scenarios; furthermore, in the simulations, the bias due to model misspecification was often negligible.

Overall, I concluded that (1) the IV methodology is a promising approach that can provide less biased risk estimates when specific conditions are met, and (2) a considered decision on whether to use conventional or IV analyses should be based on a thorough and systematic assessment of the bias/variance trade-offs likely to affect each type of estimate.
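The provider-preference IV logic described above can be illustrated with a small simulation: a naive regression of the outcome on treatment is biased by an unmeasured confounder, while a two-stage least squares fit using the instrument recovers the true effect at the cost of more variance. The Python sketch below (statsmodels) uses invented data and a hand-rolled two-stage fit; it is a sketch of the general technique, not the thesis analyses.

```python
# A hedged simulation sketch of the IV idea: prescriber preference Z shifts
# treatment but affects the outcome only through treatment, so a two-stage fit
# removes the bias from an unmeasured confounder U. All values are invented.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 20000
z = rng.binomial(1, 0.5, n)                      # instrument: prescriber preference
u = rng.normal(size=n)                           # unmeasured confounder
treat = (0.8 * z + u + rng.normal(size=n) > 0.4).astype(float)
y = 0.5 * treat + u + rng.normal(size=n)         # true treatment effect = 0.5

# Naive regression of outcome on treatment is confounded by U (biased upward here).
naive = sm.OLS(y, sm.add_constant(treat)).fit()

# Hand-rolled two-stage least squares: stage 1 predicts treatment from Z
# (a linear probability model), stage 2 regresses the outcome on that prediction.
# Note: a real IV analysis would also correct the stage-2 standard errors.
stage1 = sm.OLS(treat, sm.add_constant(z)).fit()
stage2 = sm.OLS(y, sm.add_constant(stage1.fittedvalues)).fit()

print("naive estimate:", round(float(naive.params[1]), 3))
print("2SLS estimate: ", round(float(stage2.params[1]), 3))   # near 0.5, wider CI
```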
27

Towards a coherent framework for the multi-scale analysis of spatial observational data: linking concepts, statistical tools and ecological understanding

Larocque, Guillaume January 2009 (has links)
Recent technological advances facilitating the acquisition of spatial observational data, and an increasing awareness of issues of spatial pattern and scale, have fostered the development and use of statistical methods for multi-scale analysis. These methods can be useful tools for improving our understanding of natural systems, but their use must be guided by a good comprehension of the statistics and their assumptions. This thesis is an effort to develop a coherent framework for multi-scale analysis and to identify theoretical, statistical and practical issues and solutions.

After defining terminology and concepts, several methods are compared using a common dataset in Chapter 2. The geostatistical method of regionalized multivariate analysis is identified as possessing several advantages, but shortcomings are identified, discussed and addressed in two manuscripts. In the first (Chapter 3), a mathematical formalism is presented to characterize the spatial uncertainty of cokriged regionalized components, and an approach is proposed for the conditional Gaussian co-simulation of regionalized components. In the second manuscript (Chapter 4), the theory underlying coregionalization analysis is discussed and its robustness and limits are assessed through a theoretical and mathematical framework. The assumptions underlying the method and the high levels of uncertainty associated with its use highlight problems with the interpretation of results, and issues with the application of probabilistic models in a spatial context (Chapter 5). Coregionalization analysis with a drift (CRAD), presented in detail in two co-authored publications, is proposed as a sensible alternative.
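The geostatistical methods discussed above (cokriging, coregionalization analysis) are built on variograms fitted to spatial data. As a minimal illustration of that building block, the Python sketch below computes an empirical semivariogram from simulated coordinates and values; the lag bins and spatial trend are arbitrary choices of the example, not the thesis methods.

```python
# A minimal sketch of an empirical semivariogram, the quantity modelled before
# (co)kriging or coregionalization analysis. Coordinates, values and lag bins
# are simulated or arbitrary choices of this example.
import numpy as np

rng = np.random.default_rng(6)
n = 400
xy = rng.uniform(0, 100, (n, 2))                               # sample locations
z = np.sin(xy[:, 0] / 15) + np.cos(xy[:, 1] / 20) + rng.normal(0, 0.3, n)

d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)   # pairwise distances
sqdiff = (z[:, None] - z[None, :]) ** 2                        # squared differences
iu = np.triu_indices(n, k=1)                                   # each pair once

bins = np.linspace(0, 50, 11)
lag = np.digitize(d[iu], bins)
for b in range(1, len(bins)):
    mask = lag == b
    if mask.any():
        gamma = 0.5 * sqdiff[iu][mask].mean()                  # semivariance in bin
        print(f"lag {bins[b-1]:4.0f}-{bins[b]:4.0f}: gamma = {gamma:.3f}")
```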
28

Sample size determination for prevalence estimation in the absence of a gold standard diagnostic test

Rahme, Elham H. January 1996 (has links)
A common problem in medical research is the estimation of the prevalence of a disease in a given population. This is usually accomplished by applying a diagnostic test to a sample of subjects from the target population. In this thesis, we investigate the sample size requirements for the accurate estimation of disease prevalence for such experiments. When a gold standard diagnostic test is available, estimating the prevalence of a disease can be viewed as a problem in estimating a binomial proportion. In this case, we discuss some anomalies in the classical sample size criteria for binomial parameter estimation. These are especially important with small sample sizes. When a gold standard test is not available, one must take into account misclassification errors in order to avoid misleading results. When the sensitivity and the specificity of the diagnostic test are both known, a new adjustment to the maximum likelihood estimator of the prevalence is suggested, and confidence intervals and sample size estimates that arise from this estimator are given. A Bayesian approach is taken when the sensitivity and specificity of the diagnostic test are not exactly known. Here, a method to determine the sample size needed to satisfy a Bayesian sample size criterion that averages over the preposterior marginal distribution of the data is provided. Exact methods are given in some cases, and a sampling importance resampling algorithm is used for more complex situations. A main conclusion is that the degree to which the properties of a diagnostic test are known can have a very large effect on the sample size requirements.
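When sensitivity and specificity are assumed known, the standard misclassification adjustment for an apparent prevalence is the Rogan-Gladen correction, and a normal-approximation sample size follows from the inflated variance of the corrected estimator. The short Python sketch below works through those formulas with invented numbers; it illustrates the classical adjustment, not the new estimator or the Bayesian criterion proposed in the thesis.

```python
# A worked sketch of the classical misclassification adjustment (Rogan-Gladen)
# and a normal-approximation sample size when Se and Sp are assumed known.
# Numbers are invented; the thesis proposes its own adjustment and a Bayesian
# criterion, neither of which is reproduced here.
import numpy as np

se, sp = 0.90, 0.95           # assumed sensitivity and specificity
n, test_pos = 500, 80         # hypothetical sample
p_obs = test_pos / n          # apparent (test-positive) prevalence

# The apparent prevalence satisfies  p_obs = Se*p + (1 - Sp)*(1 - p), hence
p_adj = (p_obs + sp - 1) / (se + sp - 1)
p_adj = min(max(p_adj, 0.0), 1.0)                 # truncate to [0, 1]
print("apparent prevalence:", p_obs, " adjusted:", round(p_adj, 3))

# Sample size for a 95% CI half-width d around the adjusted estimator:
# Var(p_adj_hat) = q(1-q) / (n (Se+Sp-1)^2)  with  q = Se*p + (1-Sp)*(1-p).
d, zcrit = 0.05, 1.96
q = se * p_adj + (1 - sp) * (1 - p_adj)
n_needed = zcrit**2 * q * (1 - q) / (d**2 * (se + sp - 1) ** 2)
print("approximate n for half-width", d, ":", int(np.ceil(n_needed)))
```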
29

Methodologic issues in the analysis of data from a population-based osteoporosis study: adjusting for selection bias and measurement error

Kmetic, Andrew Martin January 2004 (has links)
Two issues related to osteoporosis are addressed. (1) To estimate the prevalence of osteoporosis in Canada. (2) To estimate the effect of initial bone mass on bone mineral density (BMD) decline rates over a period of three years. We employ data from the Canadian Multicenter Osteoporosis Study (Cantos), a prospective, fixed cohort study comprised of 9,423 randomly selected subjects from nine different regions in Canada.

For the first objective, Cantos had a relatively low participation rate (42%), so that selection bias is a concern. The Cantos study design, however, included a brief risk factor questionnaire for those invitees that declined further participation. These risk factors were then used to estimate the missing osteoporosis status for non-participants using Bayesian multiple imputation, thus adjusting for nonresponse bias. Both ignorable and non-ignorable imputation models are considered.

Complicating study of the relationship between the initial BMD and rate of decline is the issue of measurement error, which can cause a spurious negative association between the rate of change in a variable and the initial value of the variable. A novel variation on the Bayesian methods of Richardson and Gilks (1993b) is used to adjust for measurement error in both the initial and year three BMD values.

After adjusting for selection bias, prevalence increased from negligible in the youngest age group to 38.9% for women and 15.4% for men in the 80+ age group. Selection bias was estimated to be relatively small, except in the oldest age group, where the bias was 2.4% for women and 5.4% for men.

For women the unadjusted relationship between the rate of decline of BMD and initial BMD is negative, -0.040 (95% CI = -0.053; -0.028). Correcting for measurement error results in estimates closer to zero. For example, using a hierarchical model with measurement error correction yields an estimate of -0.030 (95% CI = -0.050; -0.009). For men the unadjusted relationship between rate of decline and initial BMD was -0.026 (95% CI = -0.043; -0.009). Correcting for measurement error using the hierarchical model yields a much lower estimate of -0.013 (95% CI = -0.041; 0.016). It is clear that ignoring measurement error results in a heavily biased estimate of the effect of initial BMD on rate of decline of BMD for both women and men. The measurement error adjusted estimates were of a lesser magnitude than the unadjusted estimate, with reductions of approximately 25% to 50%.

The results from this thesis can be used to make decisions about osteoporosis treatment in Canada. Knowledge of the prevalence rates in Canada is useful to public health planners in allocating resources. Knowledge of the relationship between initial levels of BMD and subsequent decline in BMD allows better public health decision making, for example, in deciding whether to focus on bone health early in life, or whether it is more important to prevent decline later in life.
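The measurement-error problem described above (a spurious negative association between the rate of change and the initial value) is easy to reproduce in a small simulation. The Python sketch below uses invented BMD-like values and error variances; it demonstrates the artefact, not the Richardson and Gilks adjustment used in the thesis.

```python
# A simulation sketch of the artefact described above: measurement error in the
# baseline induces a spurious negative association between baseline BMD and its
# subsequent change, even when the true change is unrelated to the true baseline.
# All values are invented.
import numpy as np

rng = np.random.default_rng(7)
n = 5000
true_baseline = rng.normal(1.0, 0.12, n)      # true BMD at year 0
true_change = rng.normal(-0.01, 0.01, n)      # true 3-year change, independent of baseline
meas_sd = 0.03                                # measurement error SD

obs0 = true_baseline + rng.normal(0, meas_sd, n)
obs3 = true_baseline + true_change + rng.normal(0, meas_sd, n)

def slope(x, y):
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print("slope, true change on true baseline:        ",
      round(float(slope(true_baseline, true_change)), 3))     # about 0
print("slope, observed change on observed baseline:",
      round(float(slope(obs0, obs3 - obs0)), 3))              # spuriously negative
```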
30

Statistical contributions to data analysis for high-throughput screening of chemical compounds

Malo, Nathalie. January 2006 (has links)
High-throughput screening (HTS) is a relatively new process that allows several thousand chemical compounds to be tested rapidly in order to identify their potential as drug candidates. Despite increasing numbers of promising candidates, however, the number of new compounds that ultimately reach the market has declined. One way to improve this situation is to develop efficient and accurate data processing and statistical testing methods tailored for HTS. Human, biological or mechanical errors may arise across the several days it takes to run an entire screen and cause unwanted variation or "noise". Consequently, HTS data need to be preprocessed in order to reduce the effect of systematic errors. Robust statistical methods for outlier detection can then be applied to identify the most promising compounds. Current practice typically uses only single measurements, which precludes the use of standard statistical methods and forces scientists to rely on strong untested assumptions and on arbitrary choices of significance thresholds.

The broad objectives of this research are to develop and evaluate robust and reliable statistical methods for both data preprocessing and statistical inference. This thesis is divided into three papers. The first manuscript is a critical review of current practices in HTS data analysis and includes several recommendations for improving the sensitivity and specificity of screens. The second manuscript compares the performance of different robust preprocessing methods applied to replicated two-way data with respect to the detection of outlying cells. The third manuscript evaluates some of the statistical methods described in the first manuscript with respect to their performance when applied to several empirical data sets.
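One widely used robust preprocessing step for two-way plate data is Tukey's median polish, which removes row and column effects before hits are called. The Python sketch below applies it to a simulated plate; it is an illustrative stand-in and not necessarily one of the methods compared in the thesis.

```python
# A short sketch of a robust two-way fit (Tukey's median polish) of the kind used
# to remove row and column plate effects before hit detection; an illustrative
# stand-in, with a simulated 16 x 24 plate and one injected 'hit'.
import numpy as np

rng = np.random.default_rng(8)
rows, cols = 16, 24
row_eff = rng.normal(0, 2, (rows, 1))            # systematic row bias
col_eff = rng.normal(0, 2, (1, cols))            # systematic column bias
plate = 100 + row_eff + col_eff + rng.normal(0, 1, (rows, cols))
plate[5, 7] += 25                                # one true 'hit'

def median_polish_residuals(x, n_iter=10):
    resid = x.astype(float).copy()
    for _ in range(n_iter):
        resid -= np.median(resid, axis=1, keepdims=True)   # sweep out row medians
        resid -= np.median(resid, axis=0, keepdims=True)   # sweep out column medians
    return resid

resid = median_polish_residuals(plate)
z = resid / (1.4826 * np.median(np.abs(resid)))  # MAD-scaled residuals (B-score-like)
well = np.unravel_index(np.argmax(np.abs(z)), z.shape)
print("largest |z| at well", well, "=", round(float(z[well]), 1))
```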
