Multilevel Mixture IRT Modeling for the Analysis of Differential Item Functioning

A multilevel mixture IRT (MMixIRT) model for DIF analysis has been proposed as a way to gain greater insight into the sources of nuisance factors that reduce the reliability and validity of educational assessments. The purpose of this study was to investigate the efficacy of an MMix2PL model in detecting DIF across a broad set of conditions in hierarchically structured, dichotomous data. Monte Carlo simulation was performed to generate examinee response data with conditions common in the field of education. These include (a) two instrument lengths, (b) nine hierarchically structured sample sizes, (c) four latent class features, and (d) eight distinct DIF characteristics, allowing for an examination of 576 unique data conditions. DIF analysis was performed using an iterative IRT-based ordinal logistic regression technique, with the focal group identified through estimation of latent classes from a multilevel mixture model. For computational efficiency in analyzing 50 replications for each condition, model parameters were recovered using maximum likelihood estimation (MLE) with the expectation maximization algorithm. Performance of the MMix2PL model for DIF analysis was evaluated by (a) the accuracy in recovering the true class structure, (b) the accuracy of membership classification, and (c) the sensitivity in detecting DIF items and Type I error rates. Results from this study demonstrate that the model is predominantly influenced by instrument length and separation between the class mean abilities, referred to as impact. Enumeration accuracy improved by an average of 40% when analyzing the short 10-item instrument, but with 100 clusters enumeration accuracy was high regardless of the number of items. Classification accuracy was substantially influenced by the presence of impact. Under conditions with no impact, classification was unsuccessful, as the agreement between model-based class assignments and examinees' true classes averaged only 53.2%. At best, with impact of one standard deviation, classification accuracy averaged between 66.5% and 70.3%. These misclassification errors then propagated forward to degrade the performance of the DIF analysis. Detection power was poor, averaging only 0.34 across the analysis iterations that reached convergence. Additionally, the short 10-item instrument proved challenging for MLE, a condition in which a Bayesian estimation method appears necessary. Finally, this paper provides recommendations on data conditions that improve performance of the MMix2PL model for DIF analysis and summarizes several suggested improvements to the MMix2PL analysis process that could make the model more feasible for DIF analysis.
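To make the scale of the simulation design concrete, the sketch below counts the crossed conditions and generates one illustrative data set of dichotomous responses from a two-parameter logistic (2PL) model with two latent classes nested in clusters. It is not the author's code. The abstract gives only the number of levels per factor, so the levels are integer placeholders, and every generating value (the discrimination and difficulty ranges, the cluster effect standard deviation, the 50/50 class split, the direction of impact, and the size and number of DIF shifts) is an assumption chosen for illustration. The names `p_correct` and `simulate` are hypothetical.

```python
import itertools
import math
import random

# Placeholder levels: only the NUMBER of levels per factor comes from the abstract.
design = {
    "instrument_length": range(2),
    "sample_size": range(9),
    "latent_class_feature": range(4),
    "dif_characteristic": range(8),
}
conditions = list(itertools.product(*design.values()))
n_conditions = len(conditions)   # 2 * 9 * 4 * 8 = 576
n_datasets = n_conditions * 50   # 50 replications per condition = 28800


def p_correct(theta, a, b):
    """2PL probability of a correct response for ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def simulate(n_clusters, cluster_size, n_items, impact, dif_shift,
             n_dif_items, seed=0):
    """Generate (cluster, focal_flag, responses) rows for two latent classes.

    Assumed setup, not taken from the source: the reference class mean
    ability exceeds the focal class mean by `impact` SD, and the first
    `n_dif_items` items are harder by `dif_shift` for the focal class.
    """
    rng = random.Random(seed)
    a = [rng.uniform(0.8, 2.0) for _ in range(n_items)]  # assumed range
    b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]    # assumed distribution
    rows = []
    for cluster in range(n_clusters):
        cluster_effect = rng.gauss(0.0, 0.5)             # assumed between-cluster SD
        for _ in range(cluster_size):
            focal = rng.random() < 0.5                   # assumed equal class sizes
            theta = rng.gauss(0.0 if focal else impact, 1.0) + cluster_effect
            responses = []
            for j in range(n_items):
                shift = dif_shift if (focal and j < n_dif_items) else 0.0
                p = p_correct(theta, a[j], b[j] + shift)
                responses.append(int(rng.random() < p))
            rows.append((cluster, int(focal), responses))
    return rows


# 100 clusters and 10 items appear in the abstract; the other values are assumptions.
data = simulate(n_clusters=100, cluster_size=10, n_items=10,
                impact=1.0, dif_shift=0.5, n_dif_items=2)
print(n_conditions, n_datasets, len(data))
```

In a sketch like this, setting `impact=0.0` leaves the two classes distinguishable only through the shifted DIF items, which is consistent with the abstract's finding that classification was near chance when no impact was present.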

mixture model

item response theory

multilevel modeling

differential item functioning

maximum likelihood estimation

Monte Carlo simulation

Education

Identifier	oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-11099
Date	14 August 2023
Creators	Dras, Luke
Publisher	BYU ScholarsArchive
Source Sets	Brigham Young University
Detected Language	English
Type	text
Format	application/pdf
Source	Theses and Dissertations
Rights	https://lib.byu.edu/about/copyright/
