Assessing Agreement Among Raters And Identifying Atypical Raters Using A Log-Linear Modeling Approach

When an outcome is rated by several raters, ensuring consistency across raters increases the reliability of the measurement. Tanner and Young (1985) proposed a general class of log-linear models to assess agreement among K raters using a rating scale with C nominal categories. Their methodology can be used to assess pair-wise agreement among three or more raters. Rogel et al. (1996, 1998) extended this work by assessing various patterns of agreement among rater sub-groups of size K-1. These models can be used to test the assumption of rater exchangeability. Although parameters from these models can be used to identify atypical raters, no formal inferential procedures are available. I propose a formal inferential approach that can be used to test the assumption of rater exchangeability and to identify an atypical rater. The global and heterogeneous partial agreement model is fit to the data, and pair-wise comparisons of the K partial agreement parameters are made, with the p-values adjusted for multiple comparisons. The heterogeneous partial agreement parameter that is consistently involved in the statistically significant pair-wise comparisons is then singled out. The premise is that, if there is an atypical rater, at least one heterogeneous partial agreement parameter will differ from at least one of the remaining K-1 partial agreement parameters. The approach is illustrated using published data from an intestinal biopsy rating study with six raters (Rogel et al., 1998). The overall Type I error and the power of the inferential approach to correctly identify atypical raters are assessed via simulation with rater sub-groups of size 5. The Bonferroni and Sidak procedures, and the Holm step-down procedure using the Bonferroni and Sidak adjustments, are used to control the overall Type I error.
Being able to correctly identify an atypical rater, if present, and improving the consistency of ratings directly influence the reliability of the measurement and the power of the study for a given sample size. Consequently, more informative studies can be conducted of interventions (e.g., behavioral, medicinal) that may have a significant positive impact on the public's health.
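
The multiple-comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the dictionary of raw pair-wise p-values, and the rule of flagging the single parameter index common to every significant pair are assumptions made for illustration. The p-values would come from pair-wise tests on the fitted heterogeneous partial agreement parameters, which are not reproduced here.

```python
from itertools import combinations


def bonferroni(pvals):
    """Single-step Bonferroni adjustment: multiply each p-value by m, cap at 1."""
    m = len(pvals)
    return [min(1.0, m * p) for p in pvals]


def sidak(pvals):
    """Single-step Sidak adjustment: 1 - (1 - p)^m."""
    m = len(pvals)
    return [1.0 - (1.0 - p) ** m for p in pvals]


def holm(pvals, base="bonferroni"):
    """Holm step-down adjustment using a Bonferroni or Sidak base adjustment."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        remaining = m - rank  # hypotheses not yet stepped through
        if base == "bonferroni":
            adj = min(1.0, remaining * pvals[i])
        else:
            adj = 1.0 - (1.0 - pvals[i]) ** remaining
        running_max = max(running_max, adj)  # keep adjusted p-values monotone
        adjusted[i] = running_max
    return adjusted


def flag_atypical(raw_pvals, K, adjust=holm, alpha=0.05):
    """Return (flagged parameter index or None, list of significant pairs).

    raw_pvals maps each pair (i, j), 1 <= i < j <= K, to a raw p-value.
    Assumption: a parameter is flagged only if it is the one index shared
    by every significant pair.
    """
    pairs = list(combinations(range(1, K + 1), 2))
    adjusted = adjust([raw_pvals[p] for p in pairs])
    significant = [p for p, a in zip(pairs, adjusted) if a < alpha]
    if not significant:
        return None, significant
    common = set(significant[0])
    for p in significant[1:]:
        common &= set(p)
    flagged = common.pop() if len(common) == 1 else None
    return flagged, significant


# Hypothetical raw p-values for K = 4 parameters (not data from the thesis).
example = {(1, 2): 0.5, (1, 3): 0.001, (1, 4): 0.6,
           (2, 3): 0.002, (2, 4): 0.7, (3, 4): 0.0005}
flagged, significant = flag_atypical(example, K=4)
```

With six raters, as in the biopsy study, there would be 15 pair-wise comparisons. How a flagged parameter index maps back to a specific rater depends on how the sub-groups of size K-1 are indexed in the fitted model.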

Identifier: oai:union.ndltd.org:PITT/oai:PITTETD:etd-03302006-125650
Date: 06 June 2006
Creators: Kastango, Kari B.
Contributors: Sati Mazumdar, PhD, Benoit H Mulsant, MD, Roslyn A Stone, PhD, Mary Amanda Dew, PhD, Howard E. Rockette, PhD
Publisher: University of Pittsburgh
Source Sets: University of Pittsburgh
Language: English
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.library.pitt.edu/ETD/available/etd-03302006-125650/
Rights: unrestricted, I hereby certify that, if appropriate, I have obtained and attached hereto a written permission statement from the owner(s) of each third party copyrighted matter to be included in my thesis, dissertation, or project report, allowing distribution as specified below. I certify that the version I submitted is the same as that approved by my advisory committee. I hereby grant to University of Pittsburgh or its agents the non-exclusive license to archive and make accessible, under the conditions specified below, my thesis, dissertation, or project report in whole or in part in all forms of media, now or hereafter known. I retain all other ownership rights to the copyright of the thesis, dissertation or project report. I also retain the right to use in future works (such as articles or books) all or part of this thesis, dissertation, or project report.