• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • Tagged with
  • 2
  • 2
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

An investigation of the optimal test design for multi-stage test using the generalized partial credit model

Chen, Ling-Yin 27 January 2011 (has links)
Although the design of Multistage testing (MST) has received increasing attention, previous studies mostly focused on comparison of the psychometric properties of MST with CAT and paper-and-pencil (P&P) test. Few studies have systematically examined the number of items in the routing test, the number of subtests in a stage, or the number of stages in a test design to achieve accurate measurement in MST. Given that none of the studies have identified an ideal MST test design using polytomously-scored items, the current study conducted a simulation to investigate the optimal design for MST using generalized partial credit model (GPCM). Eight different test designs were examined on ability estimation across two routing test lengths (short and long) and two total test lengths (short and long). The item pool and generated item responses were based on items calibrated from a national test consisting of 273 partial credit items. Across all test designs, the maximum information routing method was employed and the maximum likelihood estimation was used for ability estimation. Ten samples of 1,000 simulees were used to assess each test design. The performance of each test design was evaluated in terms of the precision of ability estimates, item exposure rate, item pool utilization, and item overlap. The study found that all test designs produced very similar results. Although there were some variations among the eight test structures in the ability estimates, results indicate that the performance overall of these eight test structures in achieving measurement precision did not substantially deviate from one another with regard to total test length and routing test length. However, results from the present study suggest that routing test length does have a significant effect on the number of non-convergent cases in MST tests. Short routing tests tended to result in more non-convergent cases, and the presence of fewer stage tests yielded more of such cases than structures with more stages. Overall, unlike previous findings, the results of the present study indicate that the MST test structure is less likely to be a factor impacting ability estimation when polytomously-scored items are used, based on GPCM. / text
2

A comparison of item selection procedures using different ability estimation methods in computerized adaptive testing based on the generalized partial credit model

Ho, Tsung-Han 17 September 2010 (has links)
Computerized adaptive testing (CAT) provides a highly efficient alternative to the paper-and-pencil test. By selecting items that match examinees’ ability levels, CAT not only can shorten test length and administration time but it can also increase measurement precision and reduce measurement error. In CAT, maximum information (MI) is the most widely used item selection procedure. However, the major challenge with MI is the attenuation paradox, which results because the MI algorithm may lead to the selection of items that are not well targeted at an examinee’s true ability level, resulting in more errors in subsequent ability estimates. The solution is to find an alternative item selection procedure or an appropriate ability estimation method. CAT studies have not investigated the association between these two components of a CAT system based on polytomous IRT models. The present study compared the performance of four item selection procedures (MI, MPWI, MEI, and MEPV) across four ability estimation methods (MLE, WLE, EAP-N, and EAP-PS) under the mixed-format CAT based on the generalized partial credit model (GPCM). The test-unit pool and generated responses were based on test-units calibrated from an operational national test that included both independent dichotomous items and testlets. Several test conditions were manipulated: the unconstrained CAT as well as the constrained CAT in which the CCAT was used as the content-balancing, and the progressive-restricted procedure with maximum exposure rate equal to 0.19 (PR19) served as the exposure control in this study. The performance of various CAT conditions was evaluated in terms of measurement precision, exposure control properties, and the extent of selected-test-unit overlap. Results suggested that all item selection procedures, regardless of ability estimation methods, performed equally well in all evaluation indices across two CAT conditions. The MEPV procedure, however, was favorable in terms of a slightly lower maximum exposure rate, better pool utilization, and reduced test and selected-test-unit overlap than with the other three item selection procedures when both CCAT and PR19 procedures were implemented. It is not necessary to implement the sophisticated and computing-intensive Bayesian item selection procedures across ability estimation methods under the GPCM-based CAT. In terms of the ability estimation methods, MLE, WLE, and two EAP methods, regardless of item selection procedures, did not produce practical differences in all evaluation indices across two CAT conditions. The WLE method, however, generated significantly fewer non-convergent cases than did the MLE method. It was concluded that the WLE method, instead of MLE, should be considered, because the non-convergent case is less of an issue. The EAP estimation method, on the other hand, should be used with caution unless an appropriate prior θ distribution is specified. / text

Page generated in 0.174 seconds