  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

The impact of equating method and format representation of common items on the adequacy of mixed-format test equating using nonequivalent groups

Hagge, Sarah Lynn 01 July 2010
Mixed-format tests containing both multiple-choice and constructed-response items are widely used on educational tests. Such tests combine the broad content coverage and efficient scoring of multiple-choice items with the assessment of higher-order thinking skills thought to be provided by constructed-response items. However, the combination of both item formats on a single test complicates the use of psychometric procedures. The purpose of this dissertation was to examine how characteristics of mixed-format tests and composition of the common-item set impact the accuracy of equating results in the common-item nonequivalent groups design. Operational examinee item responses for two classes of data were considered in this dissertation: (1) operational test forms and (2) pseudo-test forms that were assembled from portions of operational test forms. Analyses were conducted on three mixed-format tests from the Advanced Placement Examination program: English Language, Spanish Language, and Chemistry. For the operational test form analyses, two factors of investigation were considered as follows: (1) difference in proficiency between old and new form groups of examinees and (2) relative difficulty of multiple-choice and constructed-response items. For the pseudo-test form analyses, two additional factors of investigation were considered: (1) format representativeness of the common-item set and (2) statistical representativeness of the common-item set. For each study condition, two traditional equating methods, frequency estimation and chained equipercentile equating, and two item response theory (IRT) equating methods, IRT true score and IRT observed score methods, were considered. There were five main findings from the operational and pseudo-test form analyses. (1) As the difference in proficiency between old and new form groups of examinees increased, bias also tended to increase. 
(2) Relative to the criterion equating relationship for a given equating method, increases in bias were typically largest for frequency estimation and smallest for the IRT equating methods. However, it is important to note that the criterion equating relationship was different for each equating method. Additionally, only one smoothing value was analyzed for the traditional equating methods. (3) Standard errors of equating tended to be smallest for IRT observed score equating and largest for chained equipercentile equating. (4) Results for the operational and pseudo-test analyses were similar when the pseudo-tests were constructed to be similar to the operational test forms. (5) Results were mixed regarding which common-item set composition resulted in the least bias.
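As a rough illustration of the chained equipercentile method named in the abstract above, the following sketch equates a new-form score to the old-form scale through a common-item (anchor) score. It is a simplified, unsmoothed version written for this listing, not the dissertation's procedure: the function names `equipercentile` and `chained_equipercentile`, the mid-percentile rank definition, and the use of linear-interpolated quantiles are all assumptions.

```python
import numpy as np


def equipercentile(from_scores, to_scores, points):
    """Map `points` on the `from` scale to the `to` scale by matching percentile ranks.

    Assumption: percentile rank is the mid-percentile rank
    (proportion below plus half the proportion at the score).
    """
    from_sorted = np.sort(np.asarray(from_scores, dtype=float))
    points = np.asarray(points, dtype=float)
    below = np.searchsorted(from_sorted, points, side="left")
    at_or_below = np.searchsorted(from_sorted, points, side="right")
    ranks = (below + 0.5 * (at_or_below - below)) / len(from_sorted)
    # Inverse empirical distribution of the target scale at those ranks.
    return np.quantile(np.asarray(to_scores, dtype=float), ranks)


def chained_equipercentile(x_new, v_new, v_old, y_old, points):
    """Chain two equipercentile links: X -> V in the new-form group,
    then V -> Y in the old-form group (V is the common-item score)."""
    v_equivalents = equipercentile(x_new, v_new, points)
    return equipercentile(v_old, y_old, v_equivalents)
```

In this sketch each link is estimated within a single group of examinees, which is what lets the chain be formed when the two groups differ in proficiency; operational use would also involve smoothing, which the abstract notes was examined at only one value.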
2

A comparison of smoothing methods for the common item nonequivalent groups design

Kim, Han Yi 01 July 2014
The purpose of this study was to compare the relative performance of various smoothing methods under the common item nonequivalent groups (CINEG) design. In light of the previous literature on smoothing under the CINEG design, this study aimed to provide general guidelines and practical insights on the selection of smoothing procedures under specific testing conditions. To investigate the smoothing procedures, 100 replications were simulated under various testing conditions by using an item response theory (IRT) framework. A total of 192 conditions (3 sample size × 4 group ability difference × 2 common-item proportion × 2 form difficulty difference × 1 test length × 2 common-item type × 2 common-item difficulty spread) were investigated. Two smoothing methods including log-linear presmoothing and cubic spline postsmoothing were considered with four equating methods including frequency estimation (FE), modified frequency estimation (MFE), chained equipercentile equating (CE), and kernel equating (KE). Bias, standard error, and root mean square error were computed to evaluate the performance of the smoothing methods. Results showed that 1) there were always one or more smoothing methods that produced smaller total error than unsmoothed methods; 2) polynomial log-linear presmoothing tended to perform better than cubic spline postsmoothing in terms of systematic and total errors when FE or MFE were used; 3) cubic spline postsmoothing showed a strong tendency to produce the least amount of random error regardless of the equating method used; 4) KE produced more accurate equating relationships under a majority of testing conditions when paired with CE; and 5) log-linear presmoothing produced smaller total error under a majority of testing conditions than did cubic spline postsmoothing. Tables are provided to show the best-performing method for all combinations of testing conditions considered.
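The condition count and the three error indices mentioned in the abstract above can be sketched as follows. The factor names and level counts are taken from the abstract; the actual level values are not given there, so the sketch enumerates placeholder level indices only. The helper `error_summary` and its array layout (replications by score points) are assumptions made for illustration.

```python
from itertools import product

import numpy as np

# Number of levels per factor, as listed in the abstract.
FACTOR_LEVELS = {
    "sample_size": 3,
    "group_ability_difference": 4,
    "common_item_proportion": 2,
    "form_difficulty_difference": 2,
    "test_length": 1,
    "common_item_type": 2,
    "common_item_difficulty_spread": 2,
}

# Fully crossed design; each condition is a tuple of placeholder level indices.
CONDITIONS = list(product(*(range(n) for n in FACTOR_LEVELS.values())))


def error_summary(estimates, criterion):
    """Bias, standard error, and RMSE at each score point.

    estimates: array of shape (replications, score_points)
    criterion: array of shape (score_points,)
    """
    estimates = np.asarray(estimates, dtype=float)
    criterion = np.asarray(criterion, dtype=float)
    bias = estimates.mean(axis=0) - criterion            # systematic error
    se = estimates.std(axis=0)                           # random error (ddof=0)
    rmse = np.sqrt(((estimates - criterion) ** 2).mean(axis=0))  # total error
    return bias, se, rmse
```

With the standard deviation taken over replications using `ddof=0`, the squared RMSE equals squared bias plus squared standard error at every score point, which is the sense in which the abstract speaks of systematic, random, and total error.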
