Global ETD Search

A comparison of the performance of testlet-based computer adaptive tests and multistage tests

Computer adaptive testing (CAT) has grown both in research and implementation. Test construction and security issues, however, have led many to reconsider the merits of CAT. Multistage testing (MST) is an alternative adaptive test design that purportedly addresses CAT's shortcomings. Yet considerably less research has been conducted on MST. Also, most research in adaptive testing has been based on item response theory (IRT). Many tests now make use of testlets -- bundles of items administered together, often based on a common stimulus. The use of testlets violates local independence, a fundamental assumption of IRT. Testlet response theory (TRT) is a relatively new measurement model designed to measure testlet-based tests. Few studies, though, have examined its use in testlet-based CAT and MST designs. This dissertation investigated the performance of testlet-based CATs and MSTs measured using the TRT model. The test designs compared included a CAT that is adaptive at the testlet level only (testlet-level CAT), a CAT that is adaptive at both the testlet and item levels (item-level CAT), and an MST design (MST). Test conditions manipulated included test length, item pool size, and examinee ability distribution. Examinee data were generated using TRT-calibrated item parameters based on data from a large-scale reading assessment. The three test designs were evaluated based on measurement effectiveness and exposure control properties. The study found that all three adaptive test designs yielded similar and good measurement accuracy. Overall, the item-level CAT produced better measurement precision, followed by the MST design. However, the MST and CAT designs yielded better measurement precision at different areas of the ability scale. All three test designs yielded acceptable exposure control properties at the testlet level. At the item level, the testlet-level CAT produced the best overall result.
The item-level CAT had less than ideal pool utilization, but was able to meet its pre-specified maximum exposure control rate and maintain low item exposure rates. The MST had excellent pool utilization, but a higher percentage of items with high exposure rates. Skewing the underlying ability distribution also had a particularly notable negative effect on the exposure control properties of the MST.

Computer adaptive testing

Bayesian statistical decision theory

Identifier	oai:union.ndltd.org:UTEXAS/oai:repositories.lib.utexas.edu:2152/3862
Date	29 August 2008
Creators	Keng, Leslie, 1974-
Source Sets	University of Texas
Language	English
Detected Language	English
Type	Thesis
Format	electronic
Rights	Copyright is held by the author. Presentation of this material on the Libraries' web site by University Libraries, The University of Texas at Austin was made possible under a limited license grant from the author who has retained all copyrights in the works.
