1

Investigating the impact of a mixed-format item pool on optimal test designs for multistage testing

Park, Ryoungsun 08 September 2015
Multistage testing (MST) has drawn increasing attention as a balanced format of adaptive testing that takes advantage of both fully adaptive computerized adaptive testing (CAT) and paper-and-pencil (P&P) tests. Most previous studies on MST have focused on purely dichotomous or purely polytomous item formats, although a mixture of the two item types (i.e., mixed-format) provides desirable psychometric properties by combining the strengths of both. Given the dearth of studies investigating the characteristics of mixed-format MST, the current study conducted a simulation to identify important design factors affecting the measurement precision of mixed-format MST. The study considered several factors, namely total points (40 and 60), MST structures (1-2-2 and 1-3-3), the proportion of polytomous items (10%, 30%, 50%, and 70%), and the routing module design (purely dichotomous, or a mixture of dichotomous and polytomous items), resulting in 32 total conditions. A total of 100 replications were performed, and 1,000 normally distributed examinees were generated in each replication. The performance of MST was evaluated in terms of the precision of ability estimation across a wide range of the scale. The study found that the longer test produced greater measurement precision and that the 1-3-3 structure performed better than the 1-2-2 structure. In addition, a larger proportion of polytomous items resulted in lower measurement precision because of reduced test information during test construction. An interaction between a large proportion of polytomous items and the purely dichotomous routing module design was identified. Overall, the two factors of test length and MST structure affected ability estimation, whereas the impact of the proportion of polytomous items and the routing module design mirrored the characteristics of the item pool.
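
The fully crossed design described in this abstract can be enumerated in a few lines. The following is only a sketch of the condition grid; the variable names and the labels "dichotomous" and "mixed" are illustrative assumptions, not taken from the dissertation.

```python
from itertools import product

# Factor levels as listed in the abstract (names are hypothetical).
total_points = [40, 60]
mst_structures = ["1-2-2", "1-3-3"]
polytomous_proportions = [0.10, 0.30, 0.50, 0.70]
routing_designs = ["dichotomous", "mixed"]

# Fully crossed design: one tuple per simulation condition.
conditions = list(
    product(total_points, mst_structures, polytomous_proportions, routing_designs)
)

print(len(conditions))  # 2 * 2 * 4 * 2 = 32
```

Each tuple would then be run for 100 replications of 1,000 simulated examinees, as the abstract states.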
2

Operational characteristics of mixed-format multistage tests using the 3PL testlet response theory model

Hembry, Ian Fredrick 19 September 2014
Multistage tests (MSTs) have received renewed interest in recent years as an effective compromise between fixed-length linear tests and computerized adaptive tests. Most MST studies have scored the assessments using item response theory (IRT) methods. Many assessments are currently being developed as mixed-format assessments that administer both standalone items and testlets, which are clusters of items associated with a common stimulus. By the nature of a testlet, a dependency arises among the items within it that violates the local independence of items. Local independence is a fundamental assumption of IRT models. Using dichotomous IRT methods on a mixed-format testlet-based assessment therefore knowingly violates local independence. By combining the score points within a testlet, researchers have successfully applied polytomous IRT models. However, such models lose information because they do not use the unique response patterns provided by each item within a testlet. The three-parameter logistic testlet response theory (3PL-TRT) model is a measurement model developed to retain the uniqueness of each item's response pattern while accounting for the local dependency exhibited by a testlet, known as the testlet effect. Because few studies have examined mixed-format MST administration under the 3PL-TRT model, the dissertation performed a simulation to investigate the administration of a mixed-format testlet-based MST under the 3PL-TRT model. Simulee responses were generated from the 3PL-TRT calibrated item parameters of a real large-scale, passage-based standardized assessment. The manipulated testing conditions comprised four panel designs, two test lengths, three routing procedures, and three conditions of local item dependence. The study found functionally no bias across testing conditions. All conditions showed adequate measurement properties, but a few differences did occur between some of the testing conditions.
Measurement precision was affected by panel design, test length, and the magnitude of local item dependence. The three-stage MSTs consistently showed slightly lower measurement precision than the two-stage MSTs. As expected, the longer test length conditions had better measurement precision than the shorter test length conditions. Conditions with the largest magnitude of local item dependence showed the worst measurement precision. The routing procedure had little impact on measurement effectiveness.
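
To illustrate how a testlet effect enters the response model, here is a minimal sketch of a 3PL testlet response function in one common parameterization. It is an assumption that the dissertation uses this exact form (for example, it may include a scaling constant); the function and parameter names are hypothetical.

```python
import math


def p_correct_3pl_trt(theta, a, b, c, gamma=0.0):
    """Probability of a correct response under a 3PL testlet model.

    theta: examinee ability
    a, b, c: item discrimination, difficulty, and lower asymptote
    gamma: person-specific effect for the testlet containing the item;
           gamma = 0 reduces the model to the ordinary 3PL.
    """
    z = a * (theta - b - gamma)
    return c + (1.0 - c) / (1.0 + math.exp(-z))


# A standalone item (no testlet effect) versus the same item inside a
# testlet for which this examinee has a positive testlet effect.
print(p_correct_3pl_trt(theta=0.0, a=1.0, b=0.0, c=0.2))
print(p_correct_3pl_trt(theta=0.0, a=1.0, b=0.0, c=0.2, gamma=0.5))
```

In this form, a larger variance of gamma across examinees corresponds to stronger local item dependence within the testlet.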
3

An investigation of the optimal test design for multi-stage test using the generalized partial credit model

Chen, Ling-Yin 27 January 2011
Although the design of multistage testing (MST) has received increasing attention, previous studies have mostly focused on comparing the psychometric properties of MST with those of CAT and paper-and-pencil (P&P) tests. Few studies have systematically examined the number of items in the routing test, the number of subtests in a stage, or the number of stages needed in a test design to achieve accurate measurement in MST. Given that no study has identified an ideal MST test design using polytomously scored items, the current study conducted a simulation to investigate the optimal design for MST using the generalized partial credit model (GPCM). Eight different test designs were examined on ability estimation across two routing test lengths (short and long) and two total test lengths (short and long). The item pool and generated item responses were based on items calibrated from a national test consisting of 273 partial credit items. Across all test designs, the maximum information routing method was employed and maximum likelihood estimation was used for ability estimation. Ten samples of 1,000 simulees were used to assess each test design. The performance of each test design was evaluated in terms of the precision of ability estimates, item exposure rate, item pool utilization, and item overlap. The study found that all test designs produced very similar results. Although there were some variations among the eight test structures in the ability estimates, the results indicate that the overall performance of these eight test structures in achieving measurement precision did not deviate substantially from one another with regard to total test length and routing test length. However, the results suggest that routing test length does have a significant effect on the number of non-convergent cases in MST.
Short routing tests tended to result in more non-convergent cases, and structures with fewer stages yielded more such cases than structures with more stages. Overall, unlike previous findings, the results of the present study indicate that the MST test structure is less likely to be a factor affecting ability estimation when polytomously scored items are used under the GPCM.
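
For readers unfamiliar with the GPCM, the category probabilities for a single item can be sketched as follows. This uses the standard step-parameter form of the model; the function name, argument names, and example values are hypothetical and not drawn from the dissertation's item pool.

```python
import math


def gpcm_probs(theta, a, steps):
    """Category probabilities for one GPCM item.

    theta: examinee ability
    a: item discrimination
    steps: step parameters b_1..b_m; the item has m + 1 score categories
    """
    # Cumulative sums of a * (theta - b_v); category 0 has an empty sum of 0.
    cumulative = [0.0]
    for b in steps:
        cumulative.append(cumulative[-1] + a * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cumulative)
    weights = [math.exp(value - top) for value in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-category item evaluated at theta = 0.
print(gpcm_probs(theta=0.0, a=1.2, steps=[-0.5, 0.5]))
```

With a single step parameter the function reduces to a two-parameter logistic item, which is a convenient sanity check.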
4

An Evaluation of DIF Tests in Multistage Tests for Continuous Covariates

Debelak, Rudolf, Debeer, Dries 22 January 2024
Multistage tests are a widely used and efficient type of test presentation that aims to provide accurate ability estimates while keeping the test relatively short. Multistage tests typically rely on the psychometric framework of item response theory. Violations of item response models and other assumptions underlying a multistage test, such as differential item functioning, can lead to inaccurate ability estimates and unfair measurements. There is a practical need for methods to detect problematic model violations to avoid these issues. This study compares and evaluates three methods for the detection of differential item functioning with regard to continuous person covariates in data from multistage tests: a linear logistic regression test and two adaptations of a recently proposed score-based DIF test. While all tests show a satisfactory Type I error rate, the score-based tests show greater power against three types of DIF effects.
5

Utilizing response time for item selection in on-the-fly multistage adaptive testing for PISA assessment

Xiuxiu Tang (18430326) 25 April 2024
Multistage adaptive testing (MST) has become one of the most popular test designs for large-scale testing. However, it has some weaknesses, such as a larger estimation bias compared to computerized adaptive testing (CAT). On-the-fly multistage adaptive testing (OMST) can balance the advantages and limitations of CAT and MST. Several CAT item selection methods that incorporate response time have been proposed. However, incorporating response time into item selection for OMST has rarely been studied. The study plans to explore the possibility of applying OMST with response time to the Programme for International Student Assessment (PISA) to address the issue of large estimation bias and improve test efficiency.
