
A comparison of item parameter estimates and ICCs produced with TESTGRAF and BILOG under different test lengths and sample sizes.

Many procedures are used to estimate IRT parameters; among the most popular are those implemented in the LOGIST and BILOG computer programs. LOGIST requires large numbers of examinees and items (on the order of 1000 or more examinees and 40 or more items) for stable 3PL model parameter estimates. BILOG is a more recent estimation program and, in general, requires smaller numbers of examinees and items than LOGIST for stable 3PL model parameter estimates. It has also been found that, regardless of sample size and test length, BILOG estimates tend to be uniformly more accurate than, or at least as accurate as, LOGIST estimates. For this reason, BILOG is now used as the standard to which new estimation programs are compared. The purpose of this study was to examine the effects of varying sample size (N = 100, 250, 500, and 1000) and test length (20- and 40-item tests) on the accuracy and consistency of 3PL model item parameter estimates and ICCs obtained from TESTGRAF and BILOG. Overall, TESTGRAF performed as well as or better than BILOG. Where large bias effect sizes existed, in all but one case TESTGRAF was more accurate than BILOG; TESTGRAF was slightly less accurate than BILOG in estimating the $P(\theta)$ values at high ability levels. Where large efficiency effect sizes existed, in all but two cases TESTGRAF was more consistent than BILOG; TESTGRAF was slightly less consistent than BILOG in estimating the $a$ parameter with a sample size of 1000 and in estimating the $c$ parameter at all sample sizes. (Abstract shortened by UMI.)
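For context, the three-parameter logistic (3PL) model referred to in the abstract expresses the probability of a correct response, $P(\theta)$, in terms of an examinee's ability $\theta$ and an item's discrimination ($a$), difficulty ($b$), and pseudo-guessing ($c$) parameters. A commonly used parameterization (the scaling constant $D = 1.7$ is a convention and is not stated in the abstract) is:

$$P(\theta) = c + (1 - c)\,\frac{1}{1 + e^{-Da(\theta - b)}}$$

The item characteristic curve (ICC) compared in the study is this function plotted over the ability range, so differences between TESTGRAF and BILOG estimates of $a$, $b$, and $c$ translate directly into differences between the estimated curves.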

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/9889
Date: January 1995
Creators: Patsula, Liane.
Contributors: Gessaroli, M.
Publisher: University of Ottawa (Canada)
Source Sets: Université d'Ottawa
Detected Language: English
Type: Thesis
Format: 75 p.
