1 |
A simulation comparison of two methods for controlling the experiment-wise Type I error rate of correlated tests for contrasts in one-way completely randomized designs
Jiao, Yuanfang, January 1900
Master of Science / Department of Statistics / Paul I. Nelson
A Bonferroni and an ordered P-value solution to the problem of controlling the experiment-wise Type I error rate are studied and compared in terms of actual size and power when carrying out correlated tests. Although both of these solutions can be used in a wide variety of settings, they are investigated here only in the context of multiple testing of the hypotheses that specified pairwise differences of means, selected before data are collected, are all equal to zero in a completely randomized, balanced, one-factor design where the data are independent random samples from normal distributions all having the same variance. Simulations indicate that both methods are very similar and effective in controlling the experiment-wise Type I error rate at a nominal level of 0.05. Because the ordered P-value method has, almost uniformly, slightly greater power, it is my recommendation for use in the setting of this report.
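For illustration, a minimal R sketch of the two adjustments applied to all pairwise tests in one simulated null data set; the report's ordered P-value procedure is not named in the abstract, so Hochberg's step-up method stands in as a representative ordered p-value adjustment, and the design sizes are assumptions.

```r
# Sketch: Bonferroni vs. an ordered (step-up) p-value adjustment on
# correlated pairwise t-tests; Hochberg's method is an assumed
# stand-in for the report's ordered P-value procedure.
set.seed(1)
k <- 4; n <- 10                             # assumed design sizes
y <- matrix(rnorm(k * n, mean = 0, sd = 1), ncol = k)   # H0 true
pairs <- combn(k, 2)                        # all pairwise comparisons
pvals <- apply(pairs, 2, function(ij) {
  t.test(y[, ij[1]], y[, ij[2]], var.equal = TRUE)$p.value
})
p.adjust(pvals, method = "bonferroni")      # Bonferroni adjustment
p.adjust(pvals, method = "hochberg")        # ordered p-value adjustment
```

Rejecting only hypotheses whose adjusted p-value falls below 0.05 controls the experiment-wise rate in both cases; the step-up adjustment is never more conservative than Bonferroni, consistent with the small power advantage reported.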
|
2 |
The impact of sample size re-estimation on the type I error rate in the analysis of a continuous end-point
Zhao, Songnian, January 1900
Master of Science / Department of Statistics / Christopher Vahl
Sample size estimation is generally based on assumptions made during the planning stage of a clinical trial. Often, there is limited information available to estimate the initial sample size, which may result in a poor estimate. For instance, an insufficient sample size may be unable to produce statistically significant results, while an over-sized study wastes resources and even raises ethical issues in that too many patients are exposed to potentially ineffective treatments. Therefore, an interim analysis in the middle of a trial may be worthwhile to assure that the significance level is at the nominal level and/or the power is adequate to detect a meaningful treatment difference. In this report, the impact of sample size re-estimation on the type I error rate for a continuous end-point in a clinical trial with two treatments is evaluated through a simulation study. Two sample size re-estimation methods are considered: blinded and partially unblinded. For the blinded method, all collected data from the two groups are used to estimate the variance, while only data from the control group are used for the partially unblinded method. The simulation study is designed with different combinations of assumed variance, assumed difference in treatment means, and re-estimation method. The end-point is assumed to follow a normal distribution, and the variances of the two groups are assumed to be identical. In addition, equal sample size is required for each group. According to the simulation results, the type I error rate is preserved in all settings.
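As a concrete illustration, a minimal R sketch of one interim look under each re-estimation method; the planning values, the interim timing at half the planned size, and the use of power.t.test as the re-estimation rule are illustrative assumptions rather than the report's exact setup.

```r
# Sketch: blinded vs. partially unblinded variance re-estimation at an
# interim look; planning values and re-estimation rule are assumed.
set.seed(2)
delta <- 0.5; sd_plan <- 1; sd_true <- 1.3   # planned vs. true SD
n_plan <- ceiling(power.t.test(delta = delta, sd = sd_plan,
                               sig.level = 0.05, power = 0.9)$n)
n1 <- ceiling(n_plan / 2)                    # interim at half the plan
ctrl <- rnorm(n1, 0, sd_true)                # H0 true: equal means
trt  <- rnorm(n1, 0, sd_true)
sd_blind <- sd(c(ctrl, trt))                 # blinded: pool both arms
sd_unbl  <- sd(ctrl)                         # partially unblinded: control only
n_blind <- ceiling(power.t.test(delta = delta, sd = sd_blind,
                                sig.level = 0.05, power = 0.9)$n)
n_unbl  <- ceiling(power.t.test(delta = delta, sd = sd_unbl,
                                sig.level = 0.05, power = 0.9)$n)
c(planned = n_plan, blinded = n_blind, unblinded = n_unbl)  # n per group
```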
|
3 |
The effect of sample size re-estimation on type I error rates when comparing two binomial proportions
Cong, Danni, January 1900
Master of Science / Department of Statistics / Christopher I. Vahl
Estimation of sample size is a critical procedure in the design of clinical trials. A trial with an inadequate sample size may not produce a statistically significant result. On the other hand, an unnecessarily large sample size increases the expenditure of resources and may cause an ethical problem by exposing an excessive number of human subjects to an inferior treatment. A poor estimate of the necessary sample size is often due to the limited information available at the planning stage. Hence, adjusting the sample size mid-trial has recently become a popular strategy. In this work, we introduce two methods for sample size re-estimation for trials with a binary endpoint utilizing the interim information collected from the trial: a blinded method and a partially unblinded method. The blinded method recalculates the sample size based on the first stage's overall event proportion, while the partially unblinded method performs the calculation based only on the control event proportion from the first stage. We performed simulation studies with different combinations of expected proportions based on fixed ratios of response rates. In this study, equal sample size per group was considered. The study shows that for both methods, the type I error rates were preserved satisfactorily.
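A minimal R sketch of the two re-estimation rules under a fixed treatment-to-control ratio of response rates; the planning values, first-stage size, and use of power.prop.test as the recalculation rule are illustrative assumptions.

```r
# Sketch: blinded vs. partially unblinded sample size re-estimation
# for a binary endpoint under a fixed ratio of response rates.
set.seed(3)
ratio <- 1.5                                 # assumed trt/control ratio
n1 <- 60                                     # assumed first-stage n per group
ctrl <- rbinom(1, n1, 0.30)                  # first-stage event counts
trt  <- rbinom(1, n1, 0.30 * ratio)
p_pool <- (ctrl + trt) / (2 * n1)            # blinded: overall proportion
p1_blind <- 2 * p_pool / (1 + ratio)         # implied control rate
p1_unbl  <- ctrl / n1                        # partially unblinded: control only
n_blind <- ceiling(power.prop.test(p1 = p1_blind, p2 = ratio * p1_blind,
                                   sig.level = 0.05, power = 0.9)$n)
n_unbl  <- ceiling(power.prop.test(p1 = p1_unbl, p2 = ratio * p1_unbl,
                                   sig.level = 0.05, power = 0.9)$n)
c(blinded = n_blind, unblinded = n_unbl)     # re-estimated n per group
```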
|
4 |
A Monte Carlo Analysis of Experimentwise and Comparisonwise Type I Error Rate of Six Specified Multiple Comparison Procedures When Applied to Small k's and Equal and Unequal Sample Sizes
Yount, William R., 12 1900
The problem of this study was to determine the differences in experimentwise and comparisonwise Type I error rate among six multiple comparison procedures when applied to twenty-eight combinations of normally distributed data. These were the Least Significant Difference, the Fisher-protected Least Significant Difference, the Student-Newman-Keuls Test, the Duncan Multiple Range Test, the Tukey Honestly Significant Difference, and the Scheffé Significant Difference. The Spjøtvoll-Stoline and Tukey-Kramer HSD modifications were used for unequal-n conditions. A Monte Carlo simulation was used for twenty-eight combinations of k and n. The scores were normally distributed (μ = 100, σ = 10). Specified multiple comparison procedures were applied under two conditions: (a) all experiments and (b) experiments in which the F-ratio was significant (0.05). Error counts were maintained over 1000 repetitions. The FLSD held the experimentwise Type I error rate to nominal alpha for the complete null hypothesis. The FLSD was more sensitive to sample mean differences than the HSD while protecting against experimentwise error. The unprotected LSD was the only procedure to yield a comparisonwise Type I error rate at nominal alpha. The SNK and MRT error rates fell between the FLSD and HSD rates. The SSD error rate was the most conservative. Use of the harmonic mean of the two unequal sample n's (HSD-TK) yielded uniformly better results than use of the minimum n (HSD-SS). Bernhardson's formulas controlled the experimentwise Type I error rate of the LSD and MRT to nominal alpha, but pushed the HSD below the 0.95 confidence interval. Use of the unprotected HSD produced fewer significant departures from nominal alpha. The formulas had no effect on the SSD.
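A minimal R sketch of the experimentwise error count under the complete null for two of the six procedures, using one assumed small-k, equal-n setting; base R's TukeyHSD and unadjusted pairwise t-tests stand in for the HSD and the unprotected LSD.

```r
# Sketch: experimentwise Type I error of Tukey's HSD vs. the
# unprotected LSD under the complete null (k = 4, n = 10 assumed).
set.seed(4)
k <- 4; n <- 10; reps <- 1000
hits_hsd <- hits_lsd <- 0
for (r in 1:reps) {
  d <- data.frame(y = rnorm(k * n, mean = 100, sd = 10),
                  g = factor(rep(1:k, each = n)))
  hsd <- TukeyHSD(aov(y ~ g, data = d))$g
  if (any(hsd[, "p adj"] < 0.05)) hits_hsd <- hits_hsd + 1
  lsd <- pairwise.t.test(d$y, d$g, p.adjust.method = "none")$p.value
  if (any(lsd < 0.05, na.rm = TRUE)) hits_lsd <- hits_lsd + 1
}
c(HSD = hits_hsd / reps, LSD = hits_lsd / reps)  # experimentwise rates
```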
|
5 |
A study on the type I error rate and power for generalized linear mixed model containing one random effect
Wang, Yu, January 1900
Master of Science / Department of Statistics / Christopher Vahl
In animal health research, it is quite common for a clinical trial to be designed to demonstrate the efficacy of a new drug where a binary response variable is measured on an individual experimental animal (i.e., the observational unit). However, the investigational treatments are applied to groups of animals instead of individual animals. This means the experimental unit is the group of animals, and the response variable can be modeled with the binomial distribution. The responses of animals within the same experimental unit may then be statistically dependent on each other, whereas the usual logit model for a binary response assumes that all observations are independent. In this report, a logit model with a random error term representing the group of animals is considered. This model belongs to a class of models referred to as generalized linear mixed models and is commonly fit using the SAS System procedure PROC GLIMMIX. Furthermore, practitioners often adjust the denominator degrees of freedom of the test statistic produced by PROC GLIMMIX using one of several different methods. In this report, a simulation study was performed over a variety of different parameter settings to compare the effects on the type I error rate and power of two methods for adjusting the denominator degrees of freedom, namely "DDFM = KENWARDROGER" and "DDFM = NONE". Despite its reputation for fine performance in linear mixed models with normally distributed errors, the "DDFM = KENWARDROGER" option tended to perform poorly more often than the "DDFM = NONE" option in the logistic regression model with one random effect.
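To make the data structure concrete, a minimal sketch of the model in R; the report itself fits it with SAS PROC GLIMMIX, so lme4::glmer is only an assumed analogue here (and offers no DDFM-style denominator degrees-of-freedom adjustment), with design sizes invented for illustration.

```r
# Sketch: binomial logit model with a random effect for the pen (the
# group of animals); lme4::glmer stands in for SAS PROC GLIMMIX.
library(lme4)
set.seed(5)
pens_per_trt <- 10; animals_per_pen <- 8     # assumed design sizes
n_pens <- 2 * pens_per_trt
trt <- factor(rep(c("A", "B"), each = pens_per_trt))
pen_effect <- rnorm(n_pens, sd = 0.5)        # random pen-to-pen variation
p <- plogis(-0.5 + pen_effect)               # H0 true: no treatment effect
success <- rbinom(n_pens, animals_per_pen, p)
d <- data.frame(pen = factor(1:n_pens), trt, success,
                failure = animals_per_pen - success)
fit <- glmer(cbind(success, failure) ~ trt + (1 | pen),
             family = binomial, data = d)
summary(fit)$coefficients                    # Wald z-test for treatment
```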
|
6 |
Detecting rater effects in trend scoring
Abdalla, Widad, 01 May 2019
Trend scoring is often used in large-scale assessments to monitor for rater drift when the same constructed response items are administered in multiple test administrations. In trend scoring, a set of responses from Time A are rescored by raters at Time B. The purpose of this study is to examine the ability of trend-monitoring statistics to detect rater effects in the context of trend scoring. The present study examines the percent of exact agreement and Cohen’s kappa as interrater agreement measures, and the paired t-test and Stuart’s Q as marginal homogeneity measures. Data that contains specific rater effects is simulated under two frameworks: the generalized partial credit model and the latent-class signal detection theory model.
The findings indicate that the percent of exact agreement, the paired t-test, and Stuart's Q showed inflated Type I error rates under a rescore design in which half of the rescore papers had a uniform score distribution and the other half had a score distribution proportional to the population of papers at Time A. These Type I error rates were reduced under a rescore design in which all rescore papers had a score distribution proportional to the population of papers at Time A. For this second rescore design, results indicate that the ability of the percent of exact agreement, Cohen's kappa, and the paired t-test to detect various effects varied across items, sample sizes, and types of rater effect. The only statistic that detected every level of rater effect across items and frameworks was Stuart's Q.
Although advances have been made in the automated scoring field, the fact is that many testing programs require humans to score constructed response items. Previous research indicates that rater effects are common in constructed response scoring. In testing programs that keep trends in data across time, changes in scoring across time confound the measurement of change in student performance. Therefore, the study of methods to ensure rating consistency across time, such as trend scoring, is important and needed to ensure fairness and validity.
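A minimal R sketch of the four trend-monitoring statistics computed on a Time A by Time B cross-tabulation of rescored papers; the simulated ratings and agreement level are assumptions, and Stuart's Q is computed directly as the Stuart-Maxwell marginal homogeneity statistic.

```r
# Sketch: percent exact agreement, Cohen's kappa, paired t-test, and
# Stuart's Q on simulated Time A vs. Time B ratings (k score levels).
set.seed(6)
k <- 4; n <- 200
a <- sample(1:k, n, replace = TRUE)                    # Time A scores
b <- ifelse(runif(n) < 0.7, a, sample(1:k, n, TRUE))   # Time B rescores
tab <- table(factor(a, levels = 1:k), factor(b, levels = 1:k))
p_agree <- sum(diag(tab)) / n                  # percent exact agreement
p_chance <- sum(rowSums(tab) * colSums(tab)) / n^2
kappa <- (p_agree - p_chance) / (1 - p_chance) # Cohen's kappa
t.test(a, b, paired = TRUE)$p.value            # paired t-test on means
# Stuart's Q (Stuart-Maxwell marginal homogeneity), df = k - 1
d <- (rowSums(tab) - colSums(tab))[1:(k - 1)]
S <- -(tab + t(tab))[1:(k - 1), 1:(k - 1)]
diag(S) <- (rowSums(tab) + colSums(tab) - 2 * diag(tab))[1:(k - 1)]
Q <- drop(t(d) %*% solve(S) %*% d)
pchisq(Q, df = k - 1, lower.tail = FALSE)
```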
|
7 |
Score Test and Likelihood Ratio Test for Zero-Inflated Binomial Distribution and Geometric Distribution
Dai, Xiaogang, 01 April 2018
The main purpose of this thesis is to compare the performance of the score test and the likelihood ratio test by computing type I and type II errors when the tests are applied to the geometric distribution and the zero-inflated binomial distribution. We first derive the test statistics of the score test and the likelihood ratio test for both distributions. We then use the software package R to perform a simulation study of the behavior of the two tests, writing R code to calculate the two types of error for each distribution. We generate many samples to approximate the probabilities of type I and type II errors under varying parameter values.
In the first chapter, we discuss the motivation behind the work presented in this thesis and introduce the definitions used throughout the paper. In the second chapter, we derive test statistics for the likelihood ratio test and the score test for the geometric distribution. For the score test, we consider versions based on both the observed information matrix and the expected information matrix, obtaining the score test statistics zO and zI.
Chapter 3 discusses the likelihood ratio test and the score test for the zero-inflated binomial distribution. The main parameter of interest is w, so p is a nuisance parameter in this case. We derive the likelihood ratio test statistics and the score test statistics for testing w. In both tests, the nuisance parameter p is estimated by its maximum likelihood estimator p̂. We again consider the score test using both the observed and the expected information matrices.
Chapter 4 focuses on the score test for the zero-inflated binomial distribution. We generate data following the zero-inflated binomial distribution in R and plot the ratio of the two score test statistics, zI/zO, against different values of n0, the number of zero values in the sample.
In chapter 5, we discuss and compare the score tests based on the two types of information matrices. We perform a simulation study to estimate the two types of error when applying the tests to the geometric distribution and the zero-inflated binomial distribution, plotting the rates of the two error types while fixing different parameters, such as the probability p and the number of trials m.
Finally, we conclude by briefly summarizing the results in chapter 6.
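For the geometric case, a minimal R sketch of the two tests of H0: p = p0 under the parameterization used by R's dgeom (number of failures before the first success); only the expected-information score statistic zI is shown, and the sample size and p0 are assumptions.

```r
# Sketch: likelihood ratio test and expected-information score test
# of H0: p = p0 for i.i.d. geometric data.
set.seed(7)
n <- 50; p0 <- 0.3
x <- rgeom(n, prob = p0)                   # simulate under H0
phat <- n / (n + sum(x))                   # MLE of p
loglik <- function(p) n * log(p) + sum(x) * log(1 - p)
LRT <- 2 * (loglik(phat) - loglik(p0))     # ~ chi-squared(1) under H0
U <- n / p0 - sum(x) / (1 - p0)            # score function at p0
I <- n / (p0^2 * (1 - p0))                 # expected Fisher information
zI <- U / sqrt(I)                          # score statistic, ~ N(0, 1)
c(p_LRT = pchisq(LRT, df = 1, lower.tail = FALSE),
  p_score = 2 * pnorm(-abs(zI)))
```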
|
8 |
A meta-analysis of Type I error rates for detecting differential item functioning with logistic regression and Mantel-Haenszel in Monte Carlo studies
Van De Water, Eva, 12 August 2014
Differential item functioning (DIF) occurs when individuals from different groups who have equal levels of a latent trait fail to earn commensurate scores on a testing instrument. Type I error occurs when DIF-detection methods result in unbiased items being excluded from the test while a Type II error occurs when biased items remain on the test after DIF-detection methods have been employed. Both errors create potential issues of injustice amongst examinees and can result in costly and protracted legal action. The purpose of this research was to evaluate two methods for detecting DIF: logistic regression (LR) and Mantel-Haenszel (MH).
To accomplish this, meta-analysis was employed to summarize Monte Carlo simulation studies that used these methods in the published and unpublished literature. The criteria employed for comparing the two methods were Type I error rates; the Type I error proportion, which also served as the Type I error effect size measure; deviation scores; and power rates. Monte Carlo simulation studies meeting the inclusion criteria, typically contributing 15 Type I error effect sizes per study, were compared to assess how the LR and MH statistical methods function to detect DIF.
Studied variables included DIF magnitude, nature of DIF (uniform or non-uniform), number of DIF items, and test length. I found that MH was better at Type I error control while LR was better at controlling Type II error. This study also provides a valuable summary of existing DIF methods and a summary of the types of variables that have been manipulated in DIF simulation studies with LR and MH. Consequently, this meta-analysis can serve as a resource for practitioners to help them choose between LR and MH for DIF detection with regard to Type I and Type II error control, and can provide insight for parameter selection in the design of future Monte Carlo DIF studies.
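For reference, a minimal R sketch of the two DIF-detection methods applied to one simulated item; the data-generating model, score bands, and DIF size are illustrative assumptions.

```r
# Sketch: logistic regression and Mantel-Haenszel DIF detection for
# a single dichotomous item with simulated uniform DIF.
set.seed(8)
n <- 1000
group <- rbinom(n, 1, 0.5)                        # 0 = reference, 1 = focal
theta <- rnorm(n)                                 # latent trait
score <- round(pmin(pmax(3 * theta + 15, 0), 30)) # matching total score
item <- rbinom(n, 1, plogis(theta - 0.3 * group)) # item with uniform DIF
# Logistic regression DIF: test the group term given the matching score
m0 <- glm(item ~ score, family = binomial)
m1 <- glm(item ~ score + group, family = binomial)
anova(m0, m1, test = "LRT")                       # uniform-DIF test
# Mantel-Haenszel: item-by-group tables stratified into score bands
band <- cut(score, breaks = 5)
mantelhaen.test(table(item, group, band))
```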
|
9 |
A Geometry-Based Multiple Testing Correction for Contingency Tables by Truncated Normal Distribution / 切断正規分布を用いた分割表の幾何学的マルチプルテスティング補正法
Basak, Tapati, 24 May 2021
Kyoto University / New-system doctoral course / Doctor of Medical Science / Degree No. Kō 23367 / Med. Doc. No. 4736 / 新制||医||1051 (University Library) / Medical Science Major, Graduate School of Medicine, Kyoto University / Examiners: Professor Satoshi Morita (chief), Professor Koji Kawakami, Professor Tosiya Sato / Qualifies under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Medical Science / Kyoto University / DFAM
|
10 |
Biomarker informed adaptive clinical trial designs
Wang, Jing, 22 January 2016
In adaptive design clinical trials, an endpoint that takes a long time to observe cannot feasibly be used for decision-making at the interim analysis. For example, overall survival (OS) in oncology trials usually cannot be used to make interim decisions. However, biomarkers correlated with the final clinical endpoint can be used. Hence, considerable interest has been drawn to biomarker informed adaptive clinical trial designs.
Shun et al. (2008) proposed a "biomarker informed two-stage winner design" with 2 active treatment arms and a control arm, along with a normal approximation method to preserve the type I error rate. However, their method cannot be extended to designs with more than 2 active treatment arms. In this dissertation, we propose a novel statistical approach for the biomarker informed two-stage winner design that can accommodate multiple active arms while controlling the type I error rate. We further propose another biomarker informed adaptive design, the "biomarker informed add-arm design for unimodal response". This design utilizes existing knowledge about the shape of the dose-response relationship to optimize the procedure of selecting the best candidate treatment for a larger trial. Its key element is that some inferior treatments need not be explored at all, and the design is shown mathematically to be more efficient than the biomarker informed two-stage winner design.
Another important component in the study of biomarker informed adaptive designs is modeling the relationship between the two endpoints. The conventional approach uses a one-level correlation model, which may be inappropriate when there is no solid historical knowledge of the two endpoints. A two-level correlation model is developed in this dissertation. In the new model, a new variable describing the mean-level correlation is introduced, so that the uncertainty in the historical knowledge is more accurately reflected. We use this new model to study the "biomarker informed two-stage winner design" and the "biomarker informed add-arm design for unimodal response", and show via simulations that it performs better than the conventional model.
The concordance of inference based on the biomarker and the primary endpoint is further studied in a real case.
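A minimal R sketch of one replicate of a two-stage winner design with two active arms and a control, where the interim winner is chosen on the biomarker and then tested naively on the final endpoint; stage sizes, the biomarker-endpoint correlation, and the unadjusted final t-test are illustrative assumptions, and the inflation of the naive test's size is exactly the problem the dissertation's methods address.

```r
# Sketch: one replicate of a biomarker informed two-stage winner
# design under the global null; the naive final test ignores the
# interim selection and is therefore anti-conservative.
set.seed(10)
n1 <- 30; n2 <- 70; rho <- 0.6          # assumed stage sizes, correlation
sim_arm <- function(n) {
  b <- rnorm(n)                              # biomarker (H0 true)
  e <- rho * b + sqrt(1 - rho^2) * rnorm(n)  # correlated final endpoint
  data.frame(b = b, e = e)
}
ctrl <- sim_arm(n1 + n2)
a1 <- sim_arm(n1); a2 <- sim_arm(n1)
winner <- if (mean(a1$b) > mean(a2$b)) a1 else a2  # biomarker selection
winner <- rbind(winner, sim_arm(n2))               # stage-2 data
t.test(winner$e, ctrl$e)$p.value                   # naive final test
```

Repeating this over many replicates and counting p-values below 0.05 gives the unadjusted size of the procedure, which a valid design must bring back to the nominal level.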
|