One approach to bridging the gap between cognitively principled assessment, instruction, and learning is to provide the score user with meaningful details about the examinee's test performance. Several researchers have demonstrated the utility of modeling item characteristics, such as difficulty, in light of item features and the cognitive skills required to solve the item, as a way to link assessment and instructional feedback. The next generation of the Test of English as a Foreign Language (TOEFL) will be launched in 2005, with new task types that integrate the four modalities of language: listening, reading, writing, and speaking. Evidence-centered design (ECD) principles are being used to develop tasks for the new TOEFL assessment. ECD provides a framework within which to design tasks, to link information gathered from those tasks back to the target of inference through the statistical model, and to evaluate each facet of the assessment program in terms of its connection to the test purpose. One of the primary goals of the new exam is to provide users with a score report that describes the English language proficiencies of the examinee. The purpose of this study was to develop an item difficulty model as the first step in generating descriptive score reports for the new TOEFL assessment. Task model variables resulting from the ECD process were used as the independent variables, and item difficulty estimates were used as the dependent variable in the item difficulty model. Tree-based regression was used to estimate the nonlinear relationships among the item and stimulus features and item difficulty. The proposed descriptive score reports capitalized on the item features that accounted for the most variance in item difficulty.
The validity of the resulting proficiency statements was theoretically supported by the links among the task model variables and student model variables evidenced in the ECD task design shells, and empirically supported by the item difficulty model. Future research should focus on improving the predictors in the item difficulty model, determining the most appropriate proficiency estimate categories, and comparing item difficulty models across major native language groups.
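
The abstract's item difficulty model can be illustrated with a minimal sketch of tree-based regression, in which item features are the predictors and item difficulty is the response. Everything specific below is an assumption for illustration only: the two features (`requires_inference`, `passage_words`), the difficulty values, and the helper functions are hypothetical and are not taken from the dissertation, which does not state its software or variable coding in this abstract.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, targets):
    """Return the (feature_index, threshold) that most reduces SSE, or None."""
    best, best_score = None, sse(targets)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            if not left or not right:
                continue  # a split must put items on both sides
            score = sse(left) + sse(right)
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best

def fit_tree(rows, targets, depth=0, max_depth=2):
    """Grow a small regression tree by recursive binary splitting."""
    split = best_split(rows, targets) if depth < max_depth else None
    if split is None:
        return {"leaf": mean(targets)}  # predict the mean difficulty in this node
    j, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": fit_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": fit_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk the tree to a leaf and return its mean difficulty."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]

# Hypothetical items: [requires_inference (0/1), passage_words]
items = [[0, 300], [0, 320], [0, 650], [0, 700],
         [1, 310], [1, 330], [1, 640], [1, 720]]
# Hypothetical difficulty estimates (higher = harder); NOT data from the study
difficulty = [-1.2, -1.0, -0.4, -0.2, 0.3, 0.5, 1.1, 1.3]

tree = fit_tree(items, difficulty)
print(predict(tree, [1, 700]))  # predicted difficulty for a new hypothetical item
```

In this toy setup the first split falls on the feature that removes the most variance in difficulty, which mirrors the idea of building score report statements around the features that account for the most variance. A production analysis would more likely use an established library and cross-validated pruning.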
Identifier | oai:union.ndltd.org:UMASS/oai:scholarworks.umass.edu:dissertations-2209 |
Date | 01 January 2003 |
Creators | Huff, Kristen Leigh |
Publisher | ScholarWorks@UMass Amherst |
Source Sets | University of Massachusetts, Amherst |
Language | English |
Detected Language | English |
Type | text |
Source | Doctoral Dissertations Available from Proquest |