Recent research with the Position Analysis Questionnaire (PAQ) that investigated the ability of job-naive raters to make PAQ ratings based on limited job information found average interrater reliabilities for item ratings in the .40 range (Jones, Main, Butler, & Johnson, 1982). While admitting that this represented generally low agreement among raters, Jones et al. deemed these data adequate because the corresponding dimension score reliabilities averaged in the "acceptable" range of .60. Their argument proposes that translating item ratings to dimension scores negates the necessity of obtaining reliable job analysis data at the item level. This study took issue with this position and empirically investigated the relationship between PAQ item and dimension reliability. Random data were generated to simulate PAQ item ratings for 1000 pairs of raters in each of four conditions of data generation. In each condition, a true profile was generated and the items evidencing agreement on the rating of Does Not Apply (DNA) were identified. A pair of simulated ratings was then generated that held the DNA items constant and varied the reliability of the remaining responses. Each condition generated data that reflected a different level of reliability on the non-DNA items. Interrater reliability coefficients (Pearson r's) were calculated for these simulated item data and the corresponding dimension scores. Results indicated that, even with random data, average reliability coefficients for item-level data could be found in the .40 range; in addition, an average dimension score reliability in the .60 range was found when the true reliability for the items that were actually rated was .30. It was also found that as the number of DNA agreements increased, so did the item reliability, but the dimension reliability was essentially unaffected. Furthermore, as the number of DNA agreements increased and the item reliability increased, the reliability of the items not exhibiting DNA agreement was unchanged. Thus, reliability estimates that included a large number of DNA agreements tended to overestimate the reliability of the non-DNA ratings. It was concluded that reliability estimates, for both items and dimensions, of the magnitude reported by Jones et al. are inadequate, especially when the influence of the DNA rating is taken into consideration.
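The inflation effect described above can be illustrated with a minimal sketch. It is not the thesis's simulation procedure: the item count (150), the 1-to-5 rating scale with DNA coded as 0, and the function names `pearson_r` and `simulate` are all assumptions chosen for illustration, and dimension scores are not modeled. Each simulated pair of raters agrees on a fixed set of DNA items and rates the remaining items completely at random, so the true reliability of the rated items is zero.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length rating lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate(n_items, n_dna, n_pairs, rng):
    """Return (mean r over all items, mean r over non-DNA items only).

    Hypothetical setup: both raters code the same n_dna items as 0 (DNA);
    the remaining items get independent random ratings from 1 to 5.
    """
    n_rated = n_items - n_dna
    all_rs, rated_rs = [], []
    for _ in range(n_pairs):
        a = [rng.randint(1, 5) for _ in range(n_rated)]
        b = [rng.randint(1, 5) for _ in range(n_rated)]
        # Reliability of the items that were actually rated.
        rated_rs.append(pearson_r(a, b))
        # Reliability when the shared DNA zeros are included.
        all_rs.append(pearson_r(a + [0] * n_dna, b + [0] * n_dna))
    return sum(all_rs) / n_pairs, sum(rated_rs) / n_pairs


if __name__ == "__main__":
    rng = random.Random(0)
    for n_dna in (0, 15, 30, 45, 60):
        r_all, r_rated = simulate(150, n_dna, 1000, rng)
        print(f"DNA agreements={n_dna:3d}  "
              f"r(all items)={r_all:.2f}  r(non-DNA only)={r_rated:.2f}")
```

Under these assumptions, the all-items coefficient should climb as the number of DNA agreements grows, while the coefficient for the rated items stays near zero, which mirrors the pattern the abstract reports.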
Identifier | oai:union.ndltd.org:RICE/oai:scholarship.rice.edu:1911/15955 |
Date | January 1986 |
Creators | BLUNT, JANET H. |
Source Sets | Rice University |
Language | English |
Detected Language | English |
Type | Thesis, Text |
Format | application/pdf |