1

Random or fixed testlet effects : a comparison of two multilevel testlet models

Chen, Tzu-An, 1978- 10 December 2010
This simulation study compared the performance of two multilevel measurement testlet (MMMT) models: Beretvas and Walker’s (2008) two-level MMMT model and Jiao, Wang, and Kamata’s (2005) three-level model. Several conditions were manipulated (including testlet length, sample size, and the pattern of the testlet effects) to assess the impact on the estimation of fixed and random effect parameters. While testlets, in which items share the same stimulus, are common in educational tests, testlet item scores violate the assumption of local item independence (LID) underlying item response theory (IRT). Modeling LID has been widely discussed in previous studies (for example, Bradlow, Wainer, and Wang, 1999; Wang, Bradlow, and Wainer, 2002; Wang, Cheng, and Wilson, 2005). More recently, Jiao et al. (2005) proposed a three-level MMMT (MMMT-3r) in which items are modeled as nested within testlets (level two) and testlets are in turn nested within persons (level three). Testlet effects are typically modeled as random in previous studies involving LID. However, item effects (difficulties) are commonly modeled as fixed under IRT models: that is, persons with the same ability level are assumed to have the same probability of answering an item correctly. Therefore, it is also important that a testlet effects model permit modeling of item effects as fixed. Moreover, modeling testlet effects as random implies that testlets are being sampled from a larger population of testlets. However, as with item effects, researchers are typically more interested in the particular set of items or testlets being used in an assessment. Given the interest of the researcher or psychometrician using a testlet response model, it seems more useful to use a model that permits modeling testlet effects as fixed. An alternative MMMT that permits modeling testlet effects as fixed and/or randomly varying has been proposed (Beretvas and Walker, 2008).
The MMMT-2f and MMMT-2r models treat testlet effects as item-set-specific but not person-specific. However, no simulation has been conducted to assess how this proposed model performs. The current study compared the performance of the MMMT-2f and MMMT-2r with that of the MMMT-3r. Results of the present simulation study showed that the MMMT-2r yielded the lowest parameter bias in the estimation of fixed item effects, fixed testlet effects, and random testlet effects for conditions with a nonzero, equal pattern of random testlet effect variances, even when the MMMT-2r was not the generating model. However, random effects estimation did not perform well when unequal random testlet effect variances were generated. Fit indices did not perform well either, as other studies have found. It should also be emphasized that model differences were of very little practical significance. From a modeling perspective, the MMMT-2r does allow the greatest flexibility in terms of modeling testlet effects as fixed, random, or both.
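To make the idea of a testlet effect concrete, the following is a minimal Python sketch of generating item responses under a Rasch-type model with an added testlet term. It is an illustration only: the additive form `theta - b + gamma`, its sign convention, the person-specific normal testlet effects, and the names `p_correct` and `simulate_person` are all assumptions made here, not the specification of the MMMT-2f, MMMT-2r, or MMMT-3r models compared in the thesis.

```python
import math
import random


def p_correct(theta, b, gamma=0.0):
    """Probability of a correct response under an assumed Rasch-type form.

    theta: person ability, b: item difficulty, gamma: testlet term.
    The additive placement and sign of gamma are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b + gamma)))


def simulate_person(theta, difficulties, testlet_of, testlet_sd, rng):
    """Simulate one person's 0/1 responses.

    testlet_of[i] is the testlet index of item i.
    testlet_sd[t] is the SD of this person's random effect for testlet t;
    an SD of 0 means the items in that testlet stay locally independent.
    """
    # One random draw per testlet, shared by every item in that testlet.
    gammas = [rng.gauss(0.0, sd) for sd in testlet_sd]
    responses = []
    for i, b in enumerate(difficulties):
        p = p_correct(theta, b, gammas[testlet_of[i]])
        responses.append(1 if rng.random() < p else 0)
    return responses
```

A hypothetical call such as `simulate_person(0.0, [0.0, 0.5, -0.5, 1.0], [0, 0, 1, 1], [1.0, 0.0], random.Random(0))` generates four items in two testlets, where only the first testlet induces dependence among its items. Unequal entries in `testlet_sd` would correspond loosely to an unequal pattern of testlet effect variances.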
2

An evaluation of item difficulty and person ability estimation using the multilevel measurement model with short tests and small sample sizes

Brune, Kelly Diane 08 June 2011
Recently, researchers have reformulated Item Response Theory (IRT) models into multilevel models to evaluate clustered data appropriately. Using a multilevel model to obtain item difficulty and person ability parameter estimates that correspond directly with IRT models’ parameters is often referred to as multilevel measurement modeling. Unlike conventional IRT models, multilevel measurement models (MMMs) can accommodate predictor variables, model clustered data appropriately, and be estimated using non-specialized computer software, including SAS. For example, a three-level model can model the repeated measures (level one) of individuals (level two) who are clustered within schools (level three). Limitations in terms of the minimum sample size and number of test items that permit reasonable estimation of one-parameter logistic (1-PL) IRT model parameters have not been examined for either the two- or three-level MMM. Researchers (Wright and Stone, 1979; Lord, 1983; Hambleton and Cook, 1983) have found that sample sizes under 200 and fewer than 20 items per test result in poor model fit and poor parameter recovery for dichotomous 1-PL IRT models with data that meet model assumptions. This simulation study tested the performance of the two-level and three-level MMMs under various conditions that included three sample sizes (100, 200, and 400), three test lengths (5, 10, and 20), three level-3 cluster sizes (10, 20, and 50), and two generated intraclass correlations (.05 and .15). The study demonstrated that use of the two- and three-level MMMs leads to somewhat divergent results for item difficulty and person-level ability estimates. The mean relative item difficulty bias was lower for the three-level model than for the two-level model. The opposite was true for the person-level ability estimates, with a smaller mean relative parameter bias for the two-level model than for the three-level model.
There was no difference between the two- and three-level MMMs in the school-level ability estimates. Modeling clustered data appropriately; having a minimum total sample size of 100 to accurately estimate level-2 residuals and a minimum total sample size of 400 to accurately estimate level-3 residuals; and having at least 20 items will help ensure valid statistical test results.
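As a rough illustration of two quantities named in this abstract, the Python sketch below generates clustered abilities for a target intraclass correlation and computes a mean relative bias. Both parts rest on assumptions made here rather than on the thesis: the ICC is taken to be the share of a unit total ability variance that lies between schools, relative bias is taken to be (estimate - true) / true, and the names `simulate_abilities` and `mean_relative_bias` are hypothetical.

```python
import math
import random


def simulate_abilities(n_schools, per_school, icc, rng):
    """Draw person abilities clustered within schools.

    Assumes total ability variance of 1, split so that the between-school
    share equals icc (an assumed definition of the intraclass correlation).
    """
    school_sd = math.sqrt(icc)
    person_sd = math.sqrt(1.0 - icc)
    abilities = []
    for _ in range(n_schools):
        school_effect = rng.gauss(0.0, school_sd)
        abilities.append(
            [school_effect + rng.gauss(0.0, person_sd) for _ in range(per_school)]
        )
    return abilities


def mean_relative_bias(estimates, true_values):
    """Average of (estimate - true) / true, skipping true values of zero."""
    terms = [(e - t) / t for e, t in zip(estimates, true_values) if t != 0]
    if not terms:
        raise ValueError("no nonzero true values to compare against")
    return sum(terms) / len(terms)
```

For example, `simulate_abilities(10, 10, 0.15, random.Random(1))` yields 100 abilities in 10 hypothetical schools. In a recovery study, `mean_relative_bias` would be applied to estimated versus generating item difficulties or abilities, with the estimates coming from whatever software fits the multilevel model.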
