  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Investigating the impact of a mixed-format item pool on optimal test designs for multistage testing

Park, Ryoungsun 08 September 2015
Multistage testing (MST) has drawn increasing attention as a balanced format of adaptive testing that takes advantage of both fully adaptive computerized adaptive testing (CAT) and paper-and-pencil (P&P) tests. Most previous studies on MST have focused on purely dichotomous or purely polytomous item formats, although a mixture of the two item types (i.e., mixed-format) provides desirable psychometric properties by combining the strengths of both. Given the dearth of studies investigating the characteristics of mixed-format MST, the current study conducted a simulation to identify important design factors affecting the measurement precision of mixed-format MST. The study considered four factors, namely total points (40 and 60), MST structure (1-2-2 and 1-3-3), the proportion of polytomous items (10%, 30%, 50%, and 70%), and the routing module design (purely dichotomous, or a mixture of dichotomous and polytomous items), resulting in 32 total conditions. A total of 100 replications were performed, and 1,000 normally distributed examinees were generated in each replication. The performance of MST was evaluated in terms of the precision of ability estimation across a wide range of the scale. The study found that the longer test produced greater measurement precision and that the 1-3-3 structure performed better than the 1-2-2 structure. In addition, a larger proportion of polytomous items resulted in lower measurement precision through reduced test information during test construction. An interaction between a large proportion of polytomous items and the purely dichotomous routing module design was identified. Overall, the two factors of test length and MST structure affected the ability estimation, whereas the impact of the proportion of polytomous items and the routing module design mirrored the item pool characteristics.
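As a rough illustration of how the four crossed factors yield 32 conditions, the design can be enumerated as below. This is only a sketch based on the factor levels listed in the abstract; the variable names and level labels are hypothetical and are not taken from the thesis.

```python
from itertools import product

# Factor levels as listed in the abstract; the names are illustrative assumptions.
total_points = [40, 60]
mst_structures = ["1-2-2", "1-3-3"]
polytomous_proportions = [0.10, 0.30, 0.50, 0.70]
routing_designs = ["dichotomous", "mixed"]

# Fully crossed design: 2 x 2 x 4 x 2 conditions.
conditions = list(
    product(total_points, mst_structures, polytomous_proportions, routing_designs)
)

n_replications = 100   # replications per condition (from the abstract)
n_examinees = 1000     # simulated examinees per replication (from the abstract)
total_simulated = len(conditions) * n_replications * n_examinees

print(len(conditions))    # 32
print(total_simulated)    # 3200000
```

Each tuple in `conditions` would identify one cell of the simulation, for example `(40, "1-2-2", 0.1, "dichotomous")`.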
2

Mixed-format test score equating: effect of item-type multidimensionality, length and composition of common-item set, and group ability difference

Wang, Wei 01 December 2013
Mixed-format tests containing both multiple-choice (MC) items and constructed-response (CR) items are now widely used in many testing programs. Mixed-format tests are often considered superior to tests containing only MC items, although the use of multiple item formats leads to measurement challenges in the context of equating conducted under the common-item nonequivalent groups (CINEG) design. The purpose of this dissertation was to investigate how various test characteristics and examinee characteristics influence CINEG mixed-format test score equating results. Simulated data were used in this dissertation. Simulees' item responses were generated using items selected from one MC item pool and one CR item pool, which were constructed based on College Board Advanced Placement examinations from various subject areas. Five main factors were investigated: item-type dimensionality, group ability difference, within-group ability difference, length and composition of the common-item set, and format representativeness of the common-item set. In addition, the performance of two equating methods, the presmoothed frequency estimation method (PreSm_FE) and the presmoothed chained equipercentile equating method (PreSm_CE), was compared under various conditions. To evaluate equating results, both conditional statistics and overall summary statistics were considered: absolute bias, standard error of equating, and root mean squared error. The difference that matters (DTM) was also used as a criterion for evaluating whether adequate equating results were obtained. The main findings based on the simulation studies are as follows:
(1) In most situations, item-type multidimensionality did not have a substantial impact on random error, regardless of the common-item set. However, its influence on bias depended on the composition of the common-item set.
(2) Neither the group ability difference factor nor the within-group ability difference factor had a substantial influence on random error. When group ability differences were simulated, the common-item set with more items or more total score points had less equating error. When a within-group ability difference existed, conditions with a balance of item formats in the common-item set displayed more accurate equating results than did conditions with unbalanced common-item sets.
(3) The relative performance of common-item sets with various lengths and compositions depended on the levels of group ability difference, within-group ability difference, and test dimensionality.
(4) The common-item set containing only MC items performed similarly to the common-item set with both item formats when the test forms were unidimensional and no within-group ability difference existed, or when groups of examinees did not differ in proficiency.
(5) The PreSm_FE method was more sensitive to group ability difference than the PreSm_CE method. When the within-group ability difference was non-zero, the relative performance of the two methods depended on the length and composition of the common-item set. The two methods performed almost the same in terms of random error.
The studies conducted in this dissertation suggest that when equating multidimensional mixed-format test forms in practice, if groups of examinees differ substantially in overall proficiency, inclusion of both item formats should be considered for the common-item set. When within-group ability differences are likely to exist, balancing item formats in the common-item set appears to be even more important than using a larger number of common items for obtaining accurate equating results. Because only simulation studies were conducted in this dissertation, caution should be exercised when generalizing the conclusions to practical situations.
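The three evaluation statistics named in the abstract (absolute bias, standard error of equating, and root mean squared error) are commonly computed across replications at a given raw score. The sketch below shows one conventional way to do so; the function name, the example numbers, and the use of the population standard deviation for the standard error are assumptions made for illustration, not details taken from the dissertation.

```python
import math
from statistics import fmean, pstdev

def equating_error(estimates, criterion):
    """Summarize replicated equated scores at one raw-score point.

    estimates: equated scores from each replication (hypothetical input).
    criterion: the criterion ("true") equated score at that point.
    Returns (absolute bias, standard error, RMSE).
    """
    bias = fmean(estimates) - criterion
    # Population SD across replications, so that RMSE^2 = bias^2 + SE^2 holds exactly.
    se = pstdev(estimates)
    rmse = math.sqrt(fmean((e - criterion) ** 2 for e in estimates))
    return abs(bias), se, rmse

# Made-up equated scores from four replications, for illustration only.
abs_bias, se, rmse = equating_error([10.5, 9.5, 11.0, 10.0], criterion=10.0)
print(abs_bias)                          # 0.25
print(round(se, 4), round(rmse, 4))      # 0.559 0.6124
```

With these definitions the squared RMSE decomposes into squared bias plus squared standard error, which is why the abstract can discuss systematic error (bias) and random error separately.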
