This study compares item and examinee properties, investigates the robustness of item response theory (IRT) models, and examines how conclusions about robustness differ when model-data-fit is used as the robustness criterion. Drawing on the current understanding of IRT models, the study conceptualizes robustness as a statistical relationship between violation of model assumptions and invariance properties. A series of regression and canonical analyses was conducted using real data from the British Columbia Science Assessments. Scatterplots were used to study possible non-linear relationships. The means and standard deviations of the "a" and "c" parameter estimates, obtained by applying the three-parameter model to a data sample, were used as indices of violation of the Rasch model's equal-discrimination and non-guessing assumptions. The assumption of local independence was taken as equivalent to the assumption of unidimensionality, and Humphreys' pattern index "p" was used to assess the degree to which the unidimensionality assumption was violated. Means and standard deviations of Yen's Q_i were used to assess the model-data-fit of items at the total test level. A second statistic, D_i, was developed and validated in this study to assess the model-data-fit of examinees. The mean and standard deviation of D_i were used to assess the model-data-fit of examinees at the total test level. Invariance of item and ability parameter estimates was assessed by the correlations between estimates obtained from a sample and estimates obtained from an assessment data file. Model-data-fit of items and model-data-fit of examinees were found to be two statistically independent total test properties of model-data-fit. The two therefore need to be distinguished in practice.
Item estimate invariance and ability estimate invariance were likewise found to be statistically independent total test properties of invariance, so item invariance and ability invariance also need to be distinguished in practice. When invariance is used as the criterion for robustness, the three-parameter model is robust for all combinations of sample size and test length. The Rasch model is not robust in terms of ability estimate invariance when a large sample size is combined with a moderate test length, or when a moderate sample size is combined with a long test length. Finally, no significant relationship between model-data-fit and invariance was found. Robustness results obtained with model-data-fit as the criterion may therefore be entirely different from, or even contradict, results obtained with invariance as the criterion. Because invariance is the fundamental premise of IRT models, invariance properties rather than model-data-fit should be used as criteria for robustness.
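The indices described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard three-parameter logistic form with the conventional scaling constant D = 1.7, the sample standard deviation, and the Pearson correlation as the invariance statistic; the abstract does not state which estimation software, standard deviation formula, or correlation coefficient the thesis used. All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def p_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the three-parameter logistic model.

    theta: examinee ability; a: discrimination; b: difficulty; c: guessing.
    With a = 1 and c = 0 this reduces to the Rasch form (up to the scaling constant D).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def violation_indices(a_estimates, c_estimates):
    """Means and standard deviations of the "a" and "c" estimates.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    Spread in "a" signals unequal discrimination; nonzero "c" signals guessing.
    """
    return {
        "a_mean": mean(a_estimates),
        "a_sd": stdev(a_estimates),
        "c_mean": mean(c_estimates),
        "c_sd": stdev(c_estimates),
    }


def invariance_index(sample_estimates, reference_estimates):
    """Pearson correlation between two sets of estimates for the same items or examinees.

    Assumption: Pearson's r is the correlation meant in the abstract.
    """
    mx, my = mean(sample_estimates), mean(reference_estimates)
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(sample_estimates, reference_estimates))
    sxx = sum((x - mx) ** 2 for x in sample_estimates)
    syy = sum((y - my) ** 2 for y in reference_estimates)
    return sxy / math.sqrt(sxx * syy)
```

A correlation near 1 between the sample-based and file-based estimates would indicate that the estimates are close to invariant across the two data sets, which is the sense of robustness the abstract argues for.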
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:BVAU.2429/1215 |
Date | 11 1900 |
Creators | Liu, Xiufeng |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Relation | UBC Retrospective Theses Digitization Project [http://www.library.ubc.ca/archives/retro_theses/] |