The Comparison of Standard Error Methods in the Marginal Maximum Likelihood Estimation of the Two-Parameter Logistic Item Response Model When the Distribution of the Latent Trait Is Nonnormal

A Monte Carlo simulation study was conducted to investigate the accuracy of several item parameter standard error (SE) estimation methods in item response theory (IRT) when the marginal maximum likelihood (MML) estimation method was used and the distribution of the underlying latent trait was nonnormal in the two-parameter logistic (2PL) model. The manipulated between-subject factors were sample size (N), test length (TL), and the shape of the latent trait distribution (Shape). The within-subject factor was the SE estimation method, which included the expected Fisher information method (FIS), the empirical cross-product method (XPD), the supplemented-EM method (SEM), the forward difference method (FDM), the Richardson extrapolation method (REM), and the sandwich-type covariance method (SW). The commercial IRT software flexMIRT was used for item parameter estimation and SE estimation. Results showed that when other factors were held equal, all of the SE methods studied were apt to produce less accurate SE estimates when the distribution of the underlying trait was positively skewed or positively skewed-bimodal than when it was normal. The degree of inaccuracy of each method for an individual item parameter depended on the magnitudes of the relevant a and b parameters and was affected more by the magnitude of the b parameter. At the test level, the overall average performance of the SE methods interacted with N, TL, and Shape. The FIS was not viable when TL=40 and was run only when TL=15. For such a short test, it remained the “gold standard,” as it estimated the SEs most accurately among all the methods, although it required a relatively longer time to run. The XPD method was the least time-consuming option and generally performed very well when Shape was normal. However, it tended to produce positively biased results when a short test was paired with a small sample.
Contrary to what theory suggests, the SW did not outperform the other SE methods when Shape was nonnormal. The FDM had somewhat larger variations when N=1500 and N=3000. The SEM and REM were the most accurate among the SE methods in this study and appeared to be a good choice for both normal and nonnormal cases. For each simulated condition, the average shape of the raw-score distribution was presented to help practitioners infer the shape of the underlying latent trait distribution when its true shape is unknown, thereby supporting more informed choices of SE method based on the results of this study. Implications, limitations, and future directions were discussed. / A Dissertation submitted to the Department of Educational Psychology and Learning Systems in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Spring Semester 2018. / April 9, 2018. / Includes bibliographical references. / Insu Paek, Professor Directing Dissertation; Fred Huffer, University Representative; Betsy Jane Becker, Committee Member; Yanyun Yang, Committee Member.

Identifier: oai:union.ndltd.org:fsu.edu/oai:fsu.digital.flvc.org:fsu_653458
Contributors: Lin, Zhongtian (author), Paek, Insu (professor directing dissertation), Huffer, Fred W. (university representative), Becker, Betsy Jane, 1956- (committee member), Yang, Yanyun (committee member), Florida State University (degree granting institution), College of Education (degree granting college), Department of Educational Psychology and Learning Systems (degree granting department)
Publisher: Florida State University
Source Sets: Florida State University
Language: English
Detected Language: English
Type: Text, doctoral thesis
Format: 1 online resource (117 pages), computer, application/pdf
