
INVESTIGATION OF JUDGES' ERRORS IN ANGOFF AND CONTRASTING-GROUPS CUT-OFF SCORE METHODS (STANDARD SETTING, MASTERY TESTING, CRITERION-REFERENCED TESTING)

Methods for specifying cut-off scores for a criterion-referenced test usually rely on judgments about item content and/or examinees. Comparisons of cut-off score methods have found that different methods result in different cut-off scores. This dissertation focuses on understanding why and how cut-off score methods differ. This understanding matters because practitioners need to choose appropriate cut-off score methods, and to understand and control inappropriate factors that may influence the cut-off scores.

First, a taxonomy of cut-off score methods was developed; it identifies the generic categories of setting cut-off scores. Second, the research focused on three approaches for estimating the errors associated with setting cut-off scores: generalizability theory, item response theory, and bootstrap estimation. These approaches were applied to the Angoff and Contrasting-groups cut-off score methods. For the Angoff method, the IRT index of consistency and analyses of the differences between judges' ratings and expected test item difficulty provided useful information for reviewing specific test items that judges rated inconsistently. In addition, the generalizability theory and bootstrap estimates were useful for overall estimates of the errors in judges' ratings. For the Contrasting-groups method, the decision accuracy of the classroom cut-off scores was useful for identifying classrooms in which the classification of students may need to be reviewed by teachers. The bootstrap estimate from the pooled sample of students provided a useful overall estimate of the errors in the resulting cut-off score.

Several extensions of this investigation can be made. For example, there is a need to understand the magnitude of errors in relation to the precision with which judges are able to rate test items or classify examinees; better ways of reporting and dealing with judges' inconsistencies need to be developed; and the analysis of errors needs to be extended to other cut-off score methods. Finally, these procedures can provide the operational criterion against which improvements and comparisons of cut-off score procedures can be evaluated.
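To make the Angoff procedure and the bootstrap error estimate discussed above concrete, here is a minimal sketch in Python. All ratings, names, and the number of bootstrap replications are hypothetical illustrations, not values from the dissertation: each judge rates the probability that a minimally competent examinee answers each item correctly, the cut-off is the sum over items of the mean rating across judges, and the standard error is estimated by resampling judges with replacement.

```python
import random
import statistics

# Hypothetical ratings (judges x items): each entry is a judge's estimate of
# the probability that a minimally competent examinee answers the item correctly.
ratings = [
    [0.7, 0.5, 0.8, 0.6],  # judge 1
    [0.6, 0.4, 0.9, 0.5],  # judge 2
    [0.8, 0.6, 0.7, 0.6],  # judge 3
]

def angoff_cutoff(judge_ratings):
    """Angoff cut-off: sum over items of the mean rating across judges."""
    n_items = len(judge_ratings[0])
    return sum(
        statistics.mean(j[i] for j in judge_ratings) for i in range(n_items)
    )

def bootstrap_se(judge_ratings, n_boot=2000, seed=0):
    """Bootstrap standard error of the cut-off, resampling judges with replacement."""
    rng = random.Random(seed)
    cutoffs = []
    for _ in range(n_boot):
        sample = [rng.choice(judge_ratings) for _ in judge_ratings]
        cutoffs.append(angoff_cutoff(sample))
    return statistics.stdev(cutoffs)

cutoff = angoff_cutoff(ratings)   # in raw-score units on a 4-item test
se = bootstrap_se(ratings)
```

The same resampling idea carries over to the Contrasting-groups method by resampling students from the pooled sample rather than judges.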
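The Contrasting-groups method and the per-classroom decision accuracy mentioned above can likewise be sketched. In this hypothetical version (the dissertation does not specify this exact rule), teachers classify students as masters or non-masters, and the cut-off is the test score that minimizes total misclassifications; decision accuracy is then the proportion of students the cut-off classifies consistently with the teachers.

```python
def contrasting_groups_cutoff(master_scores, nonmaster_scores):
    """Choose the cut-off minimizing total misclassifications: masters who
    score below the cut-off plus non-masters who score at or above it."""
    candidates = sorted(set(master_scores) | set(nonmaster_scores))
    best_cut, best_errors = None, None
    for c in candidates:
        errors = (sum(s < c for s in master_scores)
                  + sum(s >= c for s in nonmaster_scores))
        if best_errors is None or errors < best_errors:
            best_cut, best_errors = c, errors
    return best_cut

def decision_accuracy(master_scores, nonmaster_scores, cutoff):
    """Proportion of students classified consistently with the teachers."""
    correct = (sum(s >= cutoff for s in master_scores)
               + sum(s < cutoff for s in nonmaster_scores))
    return correct / (len(master_scores) + len(nonmaster_scores))

# Hypothetical classroom: teacher-identified masters and non-masters.
masters = [8, 9, 10, 7, 9]
nonmasters = [4, 5, 6, 5, 7]
cut = contrasting_groups_cutoff(masters, nonmasters)
acc = decision_accuracy(masters, nonmasters, cut)
```

Computing this accuracy classroom by classroom flags classrooms where the teachers' classifications may need review, as the abstract describes.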

Identifier: oai:union.ndltd.org:UMASS/oai:scholarworks.umass.edu:dissertations-5454
Date: 01 January 1986
Creators: ARRASMITH, DEAN GORDON
Publisher: ScholarWorks@UMass Amherst
Source Sets: University of Massachusetts, Amherst
Language: English
Type: text
Source: Doctoral Dissertations Available from Proquest
