Global ETD Search

1	Examining Rater Bias in Elicited Imitation Scoring: Influence of Rater's L1 and L2 Background to the Ratings Son, Min Hye 16 July 2010 (has links) (PDF) Elicited Imitation (EI), which is a way of assessing language learners' speaking, has been used for years. Furthermore, there have been many studies done showing rater bias (variance in test ratings associated with a specific rater and attributable to the attributes of a test taker) in language assessment. In this project, I evaluated possible rater bias, focusing mostly on bias attributable to raters' and test takers' language backgrounds, as seen in EI ratings. I reviewed literature on test rater bias, participated in a study of language background and rater bias, and produced recommendations for reducing bias in EI administration. Also, based on possible rater bias effects discussed in the literature I reviewed and on results of the research study I participated in, I created a registration tool to collect raters' background information that might be helpful in evaluating and reducing rater bias in future EI testing. My project also involved producing a co-authored research paper. In that paper we found no bias effect based on rater first or second language background. Elicited Imitation Rater bias Language Assessment Rating Linguistics
2	Investigating the effects of Rater's Second Language Learning Background and Familiarity with Test-Taker's First Language on Speaking Test Scores Zhao, Ksenia 01 March 2017 (has links) Prior studies suggest that raters' familiarity with test-takers' first language (L1) can be a potential source of bias in rating speaking tests. However, there is still no consensus between researchers on how and to what extent that familiarity affects the scores. This study investigates raters' performance and focuses on not only how raters' second language (L2) proficiency level interacts with examinees' L1, but also if raters' teaching experience has any effect on the scores. Speaking samples of 58 ESL learners with L1s of Spanish (n = 30) and three Asian languages (Korean, n = 12; Chinese, n = 8; and Japanese, n = 8) of different levels of proficiency were rated by 16 trained raters with varying levels of Spanish proficiency (Novice to Advanced) and different degrees of teaching experience (between one and over 10 semesters). The ratings were analyzed using Many-Facet Rasch Measurement (MFRM). The results suggest that extensive rater training can be quite effective: there was no significant effect of either raters' familiarity with examinees' L1, or raters' teaching experience on the scores. However, even after training, the raters still exhibited different degrees of leniency/severity. Therefore, the main conclusion of this study is that even trained raters may consistently rate differently. The recommendation is to (a) have further rater training and calibration; and/or (b) use MFRM with fair average to compensate for the variance. language testing rater bias speaking tests oral proficiency language learning background accented speech Linguistics
3	EFFECTS OF ITEM-LEVEL FEEDBACK ON THE RATINGS PROVIDED BY JUDGES IN A MODIFIED-ANGOFF STANDARD SETTING STUDY Peabody, Michael R 01 January 2014 (has links) Setting performance standards is a judgmental process involving human opinions and values as well as technical and empirical considerations, and although all cut score decisions are by nature arbitrary, they should not be capricious. Establishing a minimum passing standard is the technical expression of a policy decision, and the information gained through standard setting studies informs these policy decisions. To this end, it is necessary to conduct robust examinations of methods and techniques commonly applied to standard setting studies in order to better understand issues that may influence policy decisions. The modified-Angoff method remains one of the most popular methods for setting performance standards in testing and assessment. With this method, it is common practice to provide content experts with feedback regarding the item difficulties; however, it is unclear how this feedback affects the ratings and recommendations of content experts. Recent research seems to indicate mixed results, noting that the feedback given to raters may or may not alter their judgments depending on the type of data provided, when the data was provided, and how raters collaborated within groups and between groups. This research seeks to examine issues related to the effects of item-level feedback on the judgment of raters. The results suggest that the most important factor related to item-level feedback is whether or not a Subject Matter Expert (SME) was able to correctly answer a question. If so, then the SMEs tended to rely on their own inherent sense of item difficulty rather than the data provided, in spite of empirical evidence to the contrary. The results of this research may hold implications for how standard setting studies are conducted with regard to the difficulty and ordering of items, the ability level of content experts invited to participate in these studies, and the types of feedback provided. Standard Setting Angoff method Rasch model Rater bias Form difficulty
4	The validation of a performance-based assessment battery Wilson, Irene Rose 01 January 2002 (has links) Legislative pressures are being brought to bear on South African employers to demonstrate that occupational assessment is scientifically valid and culture-fair. The development of valid and reliable performance-based assessment tools will enable employers to meet these requirements. The general aim of this research was to validate a performance-based assessment battery for the placement of sales representatives. A literature survey examined alternative assessment measures and methods of performance measurement, leading to the conclusion that the combination of the work sample as a predictor measure and the managerial rating of performance as a criterion measure offer a practical and cost-effective assessment process to the sales manager. The empirical study involved 54 sales persons working for the Commercial division of an oil marketing company, selling products and services to the commercial and industrial market. By means of the empirical study, a significant correlation was found between performance of sales representatives in terms of the performance-based assessment battery for the entry level of the career ladder and their behaviour in the field as measured by the managerial performance rating instrument. The limitations of the sample, however, prevent the results from being generalised to other organisations. Assessment Performance measurement Competency Signs and samples Behavioural consistency Work sample In-basket Role-play Criterion Rater bias
5	The validation of a performance-based assessment battery Wilson, Irene Rose 11 1900 (has links) Legislative pressures are being brought to bear on South African employers to demonstrate that occupational assessment is scientifically valid and culture-fair. The development of valid and reliable performance-based assessment tools will enable employers to meet these requirements. The general aim of this research was to validate a performance-based assessment battery for the placement of sales representatives. A literature survey examined alternative assessment measures and methods of performance measurement, leading to the conclusion that the combination of the work sample as a predictor measure and the managerial rating of performance as a criterion measure offer a practical and cost-effective assessment process to the sales manager. The empirical study involved 54 sales persons working for the Commercial division of an oil marketing company, selling products and services to the commercial and industrial market. By means of the empirical study, a significant correlation was found between performance of sales representatives in terms of the performance-based assessment battery for the entry level of the career ladder and their behaviour in the field as measured by the managerial performance rating instrument. The limitations of the sample, however, prevent the results from being generalised to other organisations. / Industrial & Organisational Psychology / M.A. (Industrial Psychology) Assessment Performance measurement Competency Signs and samples Behavioural consistency Work sample In-basket Role-play Criterion Rater bias 658.3125 Sales personnel -- Rating of Sales personnel -- Psychological testing Behavioral assessment Performance -- Measurement
6	The validation of a performance-based assessment battery Wilson, Irene Rose 11 1900 (has links) Legislative pressures are being brought to bear on South African employers to demonstrate that occupational assessment is scientifically valid and culture-fair. The development of valid and reliable performance-based assessment tools will enable employers to meet these requirements. The general aim of this research was to validate a performance-based assessment battery for the placement of sales representatives. A literature survey examined alternative assessment measures and methods of performance measurement, leading to the conclusion that the combination of the work sample as a predictor measure and the managerial rating of performance as a criterion measure offer a practical and cost-effective assessment process to the sales manager. The empirical study involved 54 sales persons working for the Commercial division of an oil marketing company, selling products and services to the commercial and industrial market. By means of the empirical study, a significant correlation was found between performance of sales representatives in terms of the performance-based assessment battery for the entry level of the career ladder and their behaviour in the field as measured by the managerial performance rating instrument. The limitations of the sample, however, prevent the results from being generalised to other organisations. / Industrial and Organisational Psychology / M.A. (Industrial Psychology) Assessment Performance measurement Competency Signs and samples Behavioural consistency Work sample In-basket Role-play Criterion Rater bias 658.3125 Sales personnel -- Rating of Sales personnel -- Psychological testing Behavioral assessment Performance -- Measurement
