DETERMINING A SUFFICIENT LEVEL OF INTER-RATER RELIABILITY (POWER ANALYSIS, MISCLASSIFICATION, SAMPLE SIZE)

The reliability of a test or measurement procedure is, generally speaking, an index of the consistency of its results. Inter-rater reliability assesses the consistency of judgements among a set of raters. We model the observation taken on a subject by an unreliable procedure as the sum of a true score with mean $\mu$ and variance $\sigma_T^2$ and an error term with mean 0 and variance $\sigma_E^2$. The reliability coefficient then is $\rho = \sigma_T^2 / (\sigma_T^2 + \sigma_E^2)$.

The reliability of an instrument or rating procedure is generally evaluated in an initial experiment (or series of experiments) known as a "reliability study." Once an instrument is established as having some degree of reliability, it is then used as a measurement tool in subsequent research, known as "decision studies."

An unreliable procedure measures imperfectly. The impact of the error in measurement is investigated as it relates to three broad areas of statistical procedures: estimation, hypothesis testing, and decision-making.

An unreliable measurement decreases the precision of estimates. The effect of an unreliable measurement on the width of a confidence interval for the population mean is examined. Also, an expression is developed to facilitate estimation of the reliability of a test or measurement in a decision study when the populations of interest may differ from those in the reliability study.

An unreliable instrument weakens hypothesis tests. The extent to which lack of reliability attenuates the power of the two-sample t-test, the F-test in the analysis of variance, and the t-test for statistically significant correlation between two variables is investigated.

An unreliable measurement engenders false classifications. A dichotomous decision is considered, and expressions for the probability of misclassifying a subject by a rating procedure with a given reliability are developed. Overall as well as directional misclassification rates are found under the model of true scores and errors distributed as independent normals. Effects of departures from this model, by heavy-tailed and skewed true score and error distributions, and by errors whose variance is a function of the true score, are considered. A general expression for this misclassification probability is found. A confidence interval for the misclassification probability is developed.

These results provide tools for a researcher better to make decisions concerning the design of an experiment. They permit the costs of increased reliability to be more knowledgeably compared with the consequences of using an unreliable measurement procedure in a given situation.

Source: Dissertation Abstracts International, Volume: 45-04, Section: B, page: 1232.

Thesis (Ph.D.)--The Florida State University, 1984.
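
As a rough illustration of the normal true-score-plus-error model described in the abstract, the sketch below computes a reliability coefficient from the two variance components and numerically approximates directional misclassification rates for a dichotomous decision. It is not taken from the dissertation. The function names, the standardization of the true score to mean 0 and variance 1, and the definition of misclassification (the observed score falls on the opposite side of the cutoff from the true score) are all assumptions made for this example.

```python
import math
from statistics import NormalDist


def reliability(var_true, var_error):
    """Reliability coefficient: true-score variance over total variance."""
    return var_true / (var_true + var_error)


def misclassification(rho, cutoff=0.0, steps=20000, span=8.0):
    """Approximate (false_positive, false_negative) rates for reliability rho.

    Assumptions for this sketch (not from the source):
      * true score T ~ N(0, 1), error E ~ N(0, (1 - rho) / rho), independent,
        so that var(T) / (var(T) + var(E)) equals rho;
      * a subject is "positive" when the score exceeds `cutoff`, and is
        misclassified when T and the observed score T + E disagree.
    The integral over T uses a simple midpoint rule on [-span, span].
    """
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")
    if rho == 1.0:
        return (0.0, 0.0)  # no measurement error, so no misclassification

    true_dist = NormalDist(0.0, 1.0)
    error_dist = NormalDist(0.0, math.sqrt((1.0 - rho) / rho))
    h = 2.0 * span / steps
    false_pos = 0.0
    false_neg = 0.0
    for i in range(steps):
        t = -span + (i + 0.5) * h          # midpoint of the i-th slice
        weight = true_dist.pdf(t) * h      # probability mass of the slice
        p_obs_above = 1.0 - error_dist.cdf(cutoff - t)  # P(t + E > cutoff)
        if t > cutoff:
            false_neg += weight * (1.0 - p_obs_above)
        else:
            false_pos += weight * p_obs_above
    return (false_pos, false_neg)


if __name__ == "__main__":
    for rho in (0.5, 0.75, 0.9):
        fp, fn = misclassification(rho, cutoff=0.0)
        print(f"rho={rho:.2f}  false+={fp:.4f}  false-={fn:.4f}  total={fp + fn:.4f}")
```

With the cutoff at the true-score mean, the total under these assumptions should agree with the bivariate-normal sign-disagreement probability $\arccos(\sqrt{\rho})/\pi$, which is 0.25 for $\rho = 0.5$. Moving `cutoff` away from 0 makes the two directional rates unequal.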

Identifier: oai:union.ndltd.org:fsu.edu/oai:fsu.digital.flvc.org:fsu_75324
Contributors: RASP, JOHN M., Florida State University
Source Sets: Florida State University
Detected Language: English
Type: Text
Format: 103 p.
Rights: On campus use only.
Relation: Dissertation Abstracts International
