The purpose of this study was to investigate the performance of the parametric bootstrap method and to compare the parametric and nonparametric bootstrap methods for estimating the standard error of equating (SEE) under the common-item nonequivalent groups (CINEG) design with the frequency estimation (FE) equipercentile method under a variety of simulated conditions.
To investigate the performance of the parametric bootstrap method, bivariate polynomial log-linear models were fit to the data. Combining different polynomial degrees with two different numbers of cross-product moments gave a total of eight parametric bootstrap models. Two real datasets were used as the basis for defining the population distributions and the "true" SEEs. The simulation study reflected three levels of group proficiency differences, three sample sizes, two test lengths, and two ratios of the number of common items to the total number of items. The bias, standard error, and root mean square error (RMSE) of the estimated SEE, along with their corresponding weighted indices, were calculated and used to evaluate and compare the simulation results.
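As an illustration of the evaluation criteria named above, the following is a minimal sketch, not the study's actual code, of how the bias, standard error, and RMSE of replicated SEE estimates at a single score point could be computed against a "true" SEE. The function names, the simulated values, and the use of score relative frequencies as weights for the weighted indices are all assumptions made for illustration.

```python
import math
import random


def evaluate_see_estimates(estimates, true_see):
    """Return (bias, standard error, RMSE) of replicated SEE estimates.

    Hypothetical helper: `estimates` holds one SEE estimate per simulation
    replication at a single score point; `true_see` is the criterion value.
    """
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_see
    # Spread of the estimates around their own mean (divisor n).
    se = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / n)
    # Spread of the estimates around the true value.
    rmse = math.sqrt(sum((e - true_see) ** 2 for e in estimates) / n)
    return bias, se, rmse


def weighted_average(values, weights):
    """Weighted index across score points (the weighting scheme is an assumption)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Made-up replications for illustration only; these are not results from the study.
random.seed(0)
true_see = 0.50
estimates = [random.gauss(0.52, 0.05) for _ in range(200)]
bias, se, rmse = evaluate_see_estimates(estimates, true_see)

# With divisor n, squared RMSE decomposes into squared bias plus squared SE.
assert math.isclose(rmse ** 2, bias ** 2 + se ** 2, rel_tol=1e-9)
```

The decomposition checked in the last line shows why a model can trade bias against standard error: a change that lowers one term improves the RMSE only if it does not raise the other by more.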
The main findings from this simulation study were as follows: (1) The parametric bootstrap models with larger polynomial degrees generally produced smaller bias but larger standard errors than those with lower polynomial degrees. (2) The parametric bootstrap models with a higher-order cross-product moment (CPM) of two generally yielded more accurate estimates of the SEE than the corresponding models with a CPM of one. (3) The nonparametric bootstrap method generally produced less accurate estimates of the SEE than the parametric bootstrap method. However, as the sample size increased, the differences between the two bootstrap methods became smaller. When the sample size was equal to or larger than 3,000, the differences between the nonparametric bootstrap method and the parametric bootstrap model that produced the smallest root mean square error (RMSE) were very small. (4) Of all the models considered in this study, parametric bootstrap models with a polynomial degree of four performed best under most simulation conditions. (5) Aside from method effects, sample size and test length had the greatest impact on estimating the SEE. Group proficiency differences and the ratio of the number of common items to the total number of items had little effect on a short test, but had a slight effect on a long test.
Identifier | oai:union.ndltd.org:uiowa.edu/oai:ir.uiowa.edu:etd-2572 |
Date | 01 July 2011 |
Creators | Wang, Chunxin |
Contributors | Ansley, Timothy Neri, Lee, Won-Chan |
Publisher | University of Iowa |
Source Sets | University of Iowa |
Language | English |
Detected Language | English |
Type | dissertation |
Format | application/pdf |
Source | Theses and Dissertations |
Rights | Copyright 2011 Chunxin Wang |