Evaluation of IRT anchor test designs in test translation studies

Translating measurement instruments from one language to another is a common way of adapting them for use in populations other than those for which the instruments were designed. This technique is particularly useful in helping to (1) understand the similarities and differences that exist between populations and (2) provide unbiased testing opportunities across different segments of a single population. To help ensure that a translated instrument is valid for these purposes, it is essential that the equivalence of the original and translated instruments be established. One focus of this thesis was to provide a review of the history, problems and techniques associated with establishing the translation equivalence of measurement instruments. In addition, this review provided support for the use of item response theory (IRT) in translation equivalence studies. The second and main focus of this thesis was to investigate anchor test designs when using IRT in translation equivalence studies. Simulated data were used to determine the anchor test length required to provide adequate scaling results under conditions similar to those that are likely to be found in a translation equivalence study. These conditions included (1) relatively small samples and (2) examinee ability distribution overlaps that are more representative of vertical than of horizontal scaling situations. The effects of these two variables on the anchor test design required to provide adequate scaling results were also investigated.
The main conclusions from this research concerning the scaling of IRT ability and item parameters are: (1) larger examinee samples with larger ability overlaps should be used whenever possible, (2) under ideal scaling conditions of larger examinee samples with larger ability overlaps, relatively good scaling results can be obtained with anchor tests consisting of as few as 5 items (although the use of such short anchor tests is not recommended), and (3) anchor test lengths of at least 10 items should provide adequate scaling results, but longer anchor tests, consisting of well-translated items, should be used if possible. Finally, suggestions for further research on establishing translation equivalence were provided.
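As a rough illustration of what "scaling" through an anchor test involves, the sketch below applies mean/sigma linking to the difficulty estimates of the anchor items. The abstract does not say which linking procedure the dissertation used, so the choice of method, the function names, and the five difficulty values are all assumptions made for illustration only.

```python
from statistics import mean, pstdev


def mean_sigma_link(b_from, b_to):
    """Estimate (A, B) so that b_to is approximately A * b_from + B.

    b_from and b_to are difficulty estimates of the SAME anchor items,
    obtained from two separate calibrations (for example, the translated
    and the original form). Mean/sigma linking is assumed here; it is
    not necessarily the procedure used in the dissertation.
    """
    if len(b_from) != len(b_to) or len(b_from) < 2:
        raise ValueError("need at least two matched anchor items")
    sd_from = pstdev(b_from)
    if sd_from == 0:
        raise ValueError("anchor difficulties must vary")
    slope = pstdev(b_to) / sd_from
    intercept = mean(b_to) - slope * mean(b_from)
    return slope, intercept


def rescale_item(a, b, slope, intercept):
    """Place one item's discrimination a and difficulty b on the target scale."""
    return a / slope, slope * b + intercept


# Hypothetical 5-item anchor test (values invented for illustration).
b_translated = [-1.0, -0.5, 0.0, 0.5, 1.0]
b_original = [2.0 * b + 0.5 for b in b_translated]

A, B = mean_sigma_link(b_translated, b_original)
a_new, b_new = rescale_item(1.2, 0.5, A, B)
```

With only a handful of anchor items, the mean and standard deviation that drive `A` and `B` rest on very few values, which is one way to see why the abstract advises against 5-item anchor tests even when they perform acceptably under ideal conditions.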

Identifier: oai:union.ndltd.org:UMASS/oai:scholarworks.umass.edu:dissertations-8107
Date: 01 January 1991
Creators: Bollwark, John Anthony
Publisher: ScholarWorks@UMass Amherst
Source Sets: University of Massachusetts, Amherst
Language: English
Detected Language: English
Type: text
Source: Doctoral Dissertations Available from Proquest