Global ETD Search

Return to search

A comparability study on differences between scores of handwritten and typed responses on a large-scale writing assessment

As the use of technology for personal, professional, and learning purposes increases, more and more assessments are transitioning from a traditional paper-based testing format to a computer-based one. During this transition, some assessments are being offered in both paper and computer formats in order to accommodate examinees and testing center capabilities. Scores on the paper-based test are often intended to be directly comparable to the computer-based scores, but such claims of comparability are often unsupported by research specific to that assessment. Not only should the scores be examined for differences, but the thought processes used by raters while scoring those assessments should also be studied to better understand why raters might score response modes differently. Previous comparability literature can be informative, but more contemporary, test-specific research is needed in order to completely support the direct comparability of scores.
The goal of this thesis was to form a more complete understanding of why analytic scores on a writing assessment might differ, if at all, between handwritten and typed responses. A representative sample of responses to the writing composition portion of a large-scale high school equivalency assessment were used. Six trained raters analytically scored approximately six-hundred examinee responses each. Half of those responses were typed, and the other half were the transcribed handwritten duplicates. Multiple methods were used to examine why differences between response modes might exist. A MANOVA framework was applied to examine score differences between response modes, and the systematic analyses of think-alouds and interviews were used to explore differences in rater cognition. The results of these analyses indicated that response mode was of no practical significance, meaning that domain scores were not notably dependent on whether or not a response was presented as typed or handwritten. Raters, on the other hand, had a more substantial effect on scores. Comments from the think-alouds and interviews suggest that, while the scores were not affected by response mode, raters tended to consider certain aspects of typed responses differently than handwritten responses. For example, raters treated typographical errors differently from other conventional errors when scoring typed responses, but not while scoring the handwritten duplicates. Raters also indicated that they preferred scoring typed responses over handwritten ones, but felt they could overcome their personal preferences to score both response modes similarly.
Empirical investigations on the comparability of scores, combined with the analysis of raters’ thought processes, helped to provide a more evidence-based answer to the question of why scores might differ between response modes. Such information could be useful for test developers when making decisions regarding what mode options to offer and how to best train raters to score such assessments. The design of this study itself could be useful for testing organizations and future research endeavors, as it could be used as a guide for exploring score differences and the human-based reasons behind them.

Educational Psychology

Identifer	oai:union.ndltd.org:uiowa.edu/oai:ir.uiowa.edu:etd-5951
Date	01 July 2015
Creators	Rankin, Angelica Desiree
Contributors	Dunbar, Stephen B., Welch, Catherine J.
Publisher	University of Iowa
Source Sets	University of Iowa
Language	English
Detected Language	English
Type	dissertation
Format	application/pdf
Source	Theses and Dissertations
Rights	Copyright 2015 Angelica Desiree Rankin

Page generated in 0.0027 seconds

A comparability study on differences between scores of handwritten and typed responses on a large-scale writing assessment

Description

Links & Downloads

Tags

Additional Fields