The generalizability of systematic direct observations across items: Exploring the psychometric properties of behavioral observation

Direct behavioral observation of children in a classroom setting has become a required assessment procedure for school psychologists determining possible out-of-class placement, services, and/or interventions. However, the reliability of direct behavioral observations has come under criticism. The purpose of this study was to determine the generalizability of systematic direct observation across items. A partial interval and momentary time sampling observational system was used in which 102 second-grade children were observed during math for on/off task behaviors in 15-second intervals for 15 minutes. Data were collected from two elementary schools in western Massachusetts and two elementary schools on Cape Cod. Generalizability theory was employed to determine how many 15-second intervals are needed for reliability. Repeated measures analysis of variance (ANOVA) was conducted to obtain the variance components for the three sources of variability (i.e., persons, items, and the items x persons interaction confounded with residual error). Data were analyzed in two ways. The first analysis used a simple definition of on and off task behavior with a momentary time sampling procedure. The majority of variance was attributed to error (88%). Person variance, or the differences among individuals, contributed little to the variance (12%). Items did not contribute to the variance. The absolute and relative G coefficients were both .88, indicating high dependability. A second analysis explored a more multidimensional definition of on and off task behavior using both a momentary time sampling and a partial interval procedure. Similarly, the majority of variance was attributed to error (82%). Person variance accounted for only a small portion of the variance (18%), and items did not contribute to the variance. The absolute and relative G coefficients both increased to .93. Because of the large amount of error in both analyses, D studies could not be conducted to determine the number of items necessary to obtain a dependable sample of behavior. Further interpretation of results from a behavioral assessment perspective, implications for practice, and future directions are discussed.
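
As a rough illustration of how the reported variance percentages relate to the G coefficients, the sketch below applies the standard one-facet (persons x items) generalizability formulas. It assumes the coefficients were computed over all 60 intervals (15 minutes of 15-second intervals), which the abstract does not state; the function name `g_coefficients` and the use of rounded percentages as variance components are illustrative choices, not the author's procedure.

```python
def g_coefficients(var_person, var_item, var_residual, n_items):
    """Relative and absolute G coefficients for a one-facet persons x items design."""
    # Relative error uses only the residual (interaction) component.
    relative_error = var_residual / n_items
    # Absolute error also includes the item main effect.
    absolute_error = (var_item + var_residual) / n_items
    g_relative = var_person / (var_person + relative_error)
    g_absolute = var_person / (var_person + absolute_error)
    return g_relative, g_absolute


# Assumption: 15 minutes of 15-second intervals gives 60 items per child.
n_intervals = (15 * 60) // 15

# Rounded variance proportions from the abstract (person, item, residual).
first = g_coefficients(0.12, 0.0, 0.88, n_intervals)
second = g_coefficients(0.18, 0.0, 0.82, n_intervals)

print([round(g, 2) for g in first])
print([round(g, 2) for g in second])
```

Under these assumptions the rounded inputs give roughly .89 and .93, close to the reported .88 and .93. Because item variance is zero, the relative and absolute coefficients coincide, which matches the abstract.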

Identifier: oai:union.ndltd.org:UMASS/oai:scholarworks.umass.edu:dissertations-5127
Date: 01 January 2008
Creators: Clark, Tara M
Publisher: ScholarWorks@UMass Amherst
Source Sets: University of Massachusetts, Amherst
Language: English
Detected Language: English
Type: text
Source: Doctoral Dissertations Available from Proquest
