This study aimed to compare the efficiency of multi-dimensional CAT with that of uni-dimensional CAT based on the multi-dimensional graded response model, and to provide information about the optimal size of the item pool. Item selection and ability estimation methods based on multi-dimensional graded response models were developed, and two studies were conducted, one based on simulated data and the other on real data. Five design factors were manipulated: correlation between dimensions, item pool size, test length, ability level, and number of estimated dimensions.
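As a rough illustration of the model family named above, the following Python sketch computes category probabilities for one item under a graded response model with several ability dimensions. The abstract does not state the parametrization used in the study, so the form assumed here (a linear combination of abilities minus ordered thresholds inside a logistic function) is one common choice, and the function name `grm_category_probs` and its arguments are hypothetical.

```python
import math

def grm_category_probs(theta, a, b):
    """Category probabilities for one graded-response item (illustrative).

    theta: ability values, one per dimension
    a:     discrimination parameters, one per dimension (assumed form)
    b:     increasing thresholds, one per boundary between adjacent categories
    """
    # Linear predictor combining all ability dimensions.
    z = sum(ai * ti for ai, ti in zip(a, theta))
    # Cumulative probabilities P(X >= k), bracketed by 1 (lowest) and 0 (above highest).
    cum = [1.0] + [1.0 / (1.0 + math.exp(-(z - bk))) for bk in b] + [0.0]
    # Probability of each category is the difference of adjacent cumulative terms.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

probs = grm_category_probs(theta=[0.5, -0.2], a=[1.2, 0.8], b=[-1.0, 0.0, 1.0])
```

An item with three thresholds has four response categories, and the returned probabilities sum to one. Setting all but one entry of `a` to zero reduces this to the uni-dimensional case.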
The correlation between dimensions had a modest effect on the outcome measures, although the effect was found primarily for correlations of 0 versus 0.4. Based on a comparison of the correlation condition equal to zero with correlation conditions greater than zero, the multi-dimensional CAT was more efficient than the uni-dimensional CAT. As expected, ability level had an impact on the outcome measures. The multi-dimensional CAT provided more accurate estimates for examinees with average true ability values than for those with true ability values in the extreme range. It over-estimated ability for examinees with negative true ability values and under-estimated ability for examinees with positive true ability values. This is consistent with Bayesian estimation methods, which shrink estimates toward the mean of the prior distribution. As the number of estimated dimensions increased, more accurate estimates were achieved. This supports the idea that ability on one dimension can be used to augment the information available to estimate ability on another dimension. Finally, larger item pools and longer tests yielded more accurate and reliable ability estimation, although greater differences in efficiency were realized when comparing shorter tests and smaller item pools.
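The shrinkage and the borrowing of information across dimensions described above can be illustrated with a simplified two-dimensional sketch. It is not the estimation method developed in the study: it assumes a standard bivariate normal prior and replaces the graded response likelihood with a normal approximation, and the function name `posterior_2d` and its arguments are hypothetical.

```python
def posterior_2d(rho, y, precision):
    """Posterior means and variances for two abilities (illustrative).

    Assumes a standard bivariate normal prior with correlation rho and
    normal-approximate measurements y with the given precisions
    (a precision of 0 means no items were administered on that dimension).
    """
    # Prior precision matrix is the inverse of [[1, rho], [rho, 1]].
    det = 1.0 - rho * rho
    a = 1.0 / det + precision[0]
    b = -rho / det
    d = 1.0 / det + precision[1]
    # Posterior covariance is the inverse of [[a, b], [b, d]].
    post_det = a * d - b * b
    cov = [[d / post_det, -b / post_det],
           [-b / post_det, a / post_det]]
    # Posterior mean is cov @ (precision * y), because the prior mean is zero.
    w = [precision[0] * y[0], precision[1] * y[1]]
    mean = [cov[0][0] * w[0] + cov[0][1] * w[1],
            cov[1][0] * w[0] + cov[1][1] * w[1]]
    return mean, [cov[0][0], cov[1][1]]

# Dimension 1 is measured at -2.0 with precision 4; dimension 2 has no items.
mean_uncorr, var_uncorr = posterior_2d(0.0, [-2.0, 0.0], [4.0, 0.0])
mean_corr, var_corr = posterior_2d(0.4, [-2.0, 0.0], [4.0, 0.0])
```

In both calls the posterior mean for dimension 1 is -1.6 rather than -2.0, that is, pulled toward the prior mean of zero. With a correlation of 0 the posterior variance for dimension 2 stays at the prior value of 1, whereas with a correlation of 0.4 it drops to about 0.872 even though no items measured that dimension directly.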
Information on the optimal item pool size was provided by plotting the outcome measures against the item pool size. The plots indicated that, for short tests, the optimal item pool size was 20 items; for longer tests, the optimal item pool size was 50 items. However, if item exposure control or content balancing were an issue, a larger item pool would be needed to achieve the same efficiency in ability estimates.
Identifier | oai:union.ndltd.org:PITT/oai:PITTETD:etd-06132007-134603 |
Date | 27 September 2007 |
Creators | Liu, Jingyu |
Contributors | James J. Irrgan, Feifei Ye, Susan Lane, Clement A. Stone |
Publisher | University of Pittsburgh |
Source Sets | University of Pittsburgh |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.library.pitt.edu/ETD/available/etd-06132007-134603/ |
Rights | unrestricted, I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to University of Pittsburgh or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report. |