
Factors influencing the performance of the Mantel-Haenszel procedure in identifying differential item functioning

The Mantel-Haenszel (MH) procedure has emerged as one of the methods of choice for identifying differential item functioning (DIF) in test items. Although considerable research has examined its performance in this context, important gaps remain in the knowledge base for applying the procedure effectively. This investigation is an attempt to fill these gaps with the results of five simulation studies. The first study is an examination of the utility of the two-step procedure recommended by Holland and Thayer, in which the matching criterion used in the second step is refined by removing items identified in the first step. The results showed that using the two-step procedure is associated with a reduction in the Type II error rate. In the second study, the capability of the MH procedure to identify uniform DIF was examined. The statistic was used to identify simulated DIF in items with varying levels of difficulty and discrimination and with differing levels of difference in difficulty between groups. The results indicated that when the difference in difficulty was held constant, poorly discriminating items and very difficult items were less likely to be identified by the procedure. In the third study, the effects of sample size were considered. Although the MH procedure has been repeatedly recommended for use with small samples, the results of this study suggest that samples below 200 per group may be inadequate. Performance with larger samples was satisfactory and improved as sample size increased. The fourth study is an examination of the effects of score group width on the statistic. Holland and Thayer recommended that n + 1 score groups be used for matching (where n is the number of items). Since then, various authors have suggested that there may be utility in using fewer (wider) score groups. The results showed that this variation on the MH procedure could dramatically increase Type I error rates.
In the final study, a simple variation on the MH statistic that may allow it to identify non-uniform DIF was examined. The MH statistic's inability to identify certain types of non-uniform DIF items has been noted as a major shortcoming. Use of the variation resulted in identification of many of the simulated non-uniform DIF items with little or no increase in the Type I error rate.
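
For readers unfamiliar with the statistic discussed above, the following is a minimal Python sketch of the standard MH computation for a single studied item: the common odds ratio, the continuity-corrected MH chi-square, and the delta-scale index (-2.35 times the natural log of the odds ratio). It is not code from the dissertation. The function name mantel_haenszel, its argument names, and the "R"/"F" group labels are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def mantel_haenszel(item_scores, total_scores, groups):
    """Sketch of the MH DIF statistics for one studied item.

    item_scores  : 0/1 responses to the studied item (hypothetical input format)
    total_scores : matching criterion, e.g. the raw total test score
    groups       : "R" for reference group, "F" for focal group (assumed labels)
    Returns (common odds ratio, continuity-corrected chi-square, delta index).
    """
    # One 2x2 table per score level: [ref right, ref wrong, focal right, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for u, s, g in zip(item_scores, total_scores, groups):
        idx = (0 if u == 1 else 1) + (0 if g == "R" else 2)
        tables[s][idx] += 1

    num = den = 0.0                    # pieces of the common odds ratio
    sum_a = sum_e = sum_var = 0.0      # pieces of the chi-square statistic
    for a, b, c, d in tables.values():
        n = a + b + c + d
        if n < 2:
            continue                   # a single examinee carries no information
        n_ref, n_foc = a + b, c + d
        m_right, m_wrong = a + c, b + d
        num += a * d / n
        den += b * c / n
        sum_a += a
        sum_e += n_ref * m_right / n   # expected ref-group correct count under no DIF
        sum_var += n_ref * n_foc * m_right * m_wrong / (n * n * (n - 1))

    alpha = num / den if den > 0 else float("inf")
    chi2 = (abs(sum_a - sum_e) - 0.5) ** 2 / sum_var if sum_var > 0 else 0.0
    delta = -2.35 * math.log(alpha) if 0 < alpha < float("inf") else float("nan")
    return alpha, chi2, delta
```

Passing raw total scores as total_scores corresponds to matching on n + 1 score groups. Binning the totals before the call would correspond to the wider score groups examined in the fourth study.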

Identifier: oai:union.ndltd.org:UMASS/oai:scholarworks.umass.edu:dissertations-8594
Date: 01 January 1993
Creators: Clauser, Brian Errol
Publisher: ScholarWorks@UMass Amherst
Source Sets: University of Massachusetts, Amherst
Language: English
Detected Language: English
Type: text
Source: Doctoral Dissertations Available from Proquest
