Comparison of kernel equating and item response theory equating methods

The kernel method of test equating is a unified approach to test equating with some advantages over traditional equating methods. It is therefore important to evaluate comprehensively the usefulness and appropriateness of the kernel equating (KE) method, as well as its advantages and disadvantages compared with several popular item response theory (IRT) equating techniques. The purpose of this study was to evaluate the accuracy and stability of KE and IRT true score equating by manipulating several common factors that are known to influence equating results. Three equating methods (kernel post-stratification equating, Stocking-Lord, and Mean/Sigma) were compared against an established equating criterion. A wide variety of conditions were simulated to match realistic situations that reflected differences in sample size, anchor test length, and group ability. The systematic error and random error of equating were summarized with bias statistics and the standard error of equating (SEE), respectively, and compared across the methods. The better-performing equating methods under specific conditions were recommended based on the root mean squared error (RMSE). As expected, equating error generally decreased across all methods as the number of anchor items and the sample size increased. Aside from method effects, group differences in ability had the greatest impact on equating error in this study. The accuracy and stability of each equating method depended on the portion of the score scale range where comparisons were being made. Overall, kernel equating was shown to be more stable in most situations but not as accurate as IRT equating for the conditions studied. The interactions between pairs of factors investigated in this study seemed to be more influential for, and more beneficial to, IRT equating than KE.
Practical recommendations for future study were also offered, for example, using alternative methods of data simulation to remove the advantage of the IRT equating methods.
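
The three error summaries named in the abstract (bias, SEE, and RMSE) can be illustrated with a minimal sketch. The abstract does not give the study's exact formulas, so this assumes the conventional definitions used in simulation studies: at each score point, bias is the mean equated score over replications minus the criterion, SEE is the standard deviation of the equated scores over replications, and RMSE is the root of the mean squared deviation from the criterion. The function name `equating_error_summary` and the toy numbers are hypothetical and not taken from the dissertation.

```python
import math
from statistics import fmean, pstdev


def equating_error_summary(equated, criterion):
    """Summarize equating error at each score point.

    equated   -- list of replications; each replication is a list of
                 equated scores, one per score point (hypothetical layout)
    criterion -- list of criterion equated scores, one per score point
    Returns three lists: bias, SEE, and RMSE per score point.
    """
    bias, see, rmse = [], [], []
    for k, target in enumerate(criterion):
        # Equated scores at score point k across all replications
        column = [replication[k] for replication in equated]
        # Systematic error: mean equated score minus the criterion
        bias.append(fmean(column) - target)
        # Random error: population SD across replications (assumed convention)
        see.append(pstdev(column))
        # Total error: root mean squared deviation from the criterion
        rmse.append(math.sqrt(fmean([(x - target) ** 2 for x in column])))
    return bias, see, rmse


# Toy data (made up): two replications, two score points
replications = [[10.0, 21.0], [12.0, 19.0]]
criterion_scores = [10.0, 20.0]
bias, see, rmse = equating_error_summary(replications, criterion_scores)
```

With the population standard deviation used here, the squared RMSE at each score point equals the squared bias plus the squared SEE, which is why RMSE serves as a single overall index combining accuracy and stability.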

Identifier: oai:union.ndltd.org:UMASS/oai:scholarworks.umass.edu:dissertations-6616
Date: 01 January 2012
Creators: Meng, Yu
Publisher: ScholarWorks@UMass Amherst
Source Sets: University of Massachusetts, Amherst
Language: English
Detected Language: English
Type: text
Source: Doctoral Dissertations Available from Proquest