1 |
A Comparison of the Effects of Different Sizes of Ceiling Rules on the Estimates of Reliability of a Mathematics Achievement Test
Somboon Suriyawongse, 05 1900
This study compared estimates of reliability obtained using one, two, three, four, five, and unlimited consecutive failures as ceiling rules in scoring a mathematics achievement test that is part of the Iowa Tests of Basic Skills (ITBS), Form 8. Seven hundred students were randomly selected from a population (N = 2640) of students enrolled in the eighth grade in a large urban school district in the southwestern United States. These 700 students were randomly divided into seven subgroups of 100 students each. The responses of all students to three subtests of the mathematics achievement battery, comprising mathematical concepts (44 items), problem solving (32 items), and computation (45 items), were analyzed to obtain item difficulties and a total score for each student. The items in each subtest were then rearranged by item difficulty, from the highest to the lowest value. In each subgroup, scoring methods using one, two, three, four, five, and unlimited consecutive failures as the ceiling rule were applied to the individual responses. The total score for each individual was the sum of the correct responses prior to the point defined by the ceiling rule; correct responses after that point did not count toward the total score. The estimate of reliability for each method was computed as the alpha coefficient using SPSS-X. The results indicated that the estimates of reliability using two, three, four, and five consecutive failures as the ceiling rule were an improvement over the methods using one and unlimited consecutive failures.
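The scoring procedure described above can be sketched in code. This is a minimal illustration, not the study's actual SPSS-X analysis: `ceiling_rule_score` applies a k-consecutive-failures ceiling rule to a response vector ordered from easiest to hardest item, and `cronbach_alpha` computes the alpha coefficient from an item-score matrix. The function names and toy data are assumptions for illustration only.

```python
from statistics import pvariance

def ceiling_rule_score(responses, k=None):
    """Sum correct responses (1 = correct, 0 = incorrect) up to the ceiling.

    responses: item scores ordered by item difficulty, easiest item first.
    k: the ceiling rule; scoring stops at the first run of k consecutive
       incorrect responses. k=None means unlimited (no ceiling applied).
    """
    if k is None:
        return sum(responses)
    score, run = 0, 0
    for r in responses:
        if r:
            score += 1
            run = 0
        else:
            run += 1
            if run == k:  # ceiling reached: later correct answers don't count
                break
    return score

def cronbach_alpha(score_matrix):
    """Alpha coefficient; rows are examinees, columns are items."""
    n_items = len(score_matrix[0])
    item_vars = [pvariance([row[j] for row in score_matrix])
                 for j in range(n_items)]
    total_var = pvariance([sum(row) for row in score_matrix])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)
```

With a strict one-failure rule, `ceiling_rule_score([1, 1, 0, 1], k=1)` stops at the first incorrect response and returns 2, while the unlimited rule counts all correct responses and returns 3.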
|
2 |
Diagnostic Measurement from a Standardized Math Achievement Test Using Multidimensional Latent Trait Models
Hea Won Jun, 22 May 2014
The present study compares applications of continuous multidimensional item response theory (MIRT) models for their diagnostic potential. Typically, MIRT models have not been used for diagnosing students' possession of skills or attributes, but several researchers have suggested that they can potentially serve this purpose (e.g., Stout, 2007; Wainer, Vevea, Camacho, Reeve, Rosa, Nelson, Swygert, & Thissen, 2001). This study applies MIRT models to a standardized eighth-grade mathematics achievement test constructed from a hierarchically structured blueprint consisting of standards, benchmarks, and indicators. Only the highest level, consisting of four standards, was used to define the dimensions. The confirmatory models were defined using the standards that had been scored for involvement in each item. For the current study, the exploratory MIRT (EMIRT) model was interpreted with respect to these dimensions. Then the compensatory and confirmatory MIRT (CMIRT) models and the full-information bifactor model were fitted. The interpretation of dimensions, the empirical reliabilities of person estimates, and test- and item-fit were examined. Dimension and pattern probabilities were also obtained to assess their diagnostic potential. Finally, a noncompensatory MIRT model (MLTM-D; Embretson & Yang, 2011) and the DINA model (Haertel, 1989; Junker & Sijtsma, 2001), both used as diagnostic models, were analyzed to compare pattern probabilities with the compensatory CMIRT model.
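As a point of reference for the comparison above, the DINA model's item response function is compact enough to sketch directly. This is a generic illustration of the published model (Junker & Sijtsma, 2001), not the study's estimation code; the slip and guess values used below are illustrative assumptions.

```python
def dina_prob(alpha, q, slip, guess):
    """DINA item response function.

    alpha: examinee attribute-mastery vector (1 = mastered).
    q: the item's row of the Q-matrix (1 = attribute required).
    An examinee who has mastered every required attribute answers
    correctly with probability 1 - slip; anyone else succeeds only
    with the guessing probability.
    """
    # eta = 1 iff every attribute the item requires has been mastered
    eta = all(a == 1 for a, req in zip(alpha, q) if req == 1)
    return 1 - slip if eta else guess
```

For example, `dina_prob([1, 1, 0], [1, 1, 0], slip=0.1, guess=0.2)` gives a master of both required attributes a 0.9 probability of a correct response, while a non-master receives only the guessing probability, 0.2.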
|