1.
An evaluation of item difficulty and person ability estimation using the multilevel measurement model with short tests and small sample sizes
Brune, Kelly Diane (08 June 2011)
Recently, researchers have reformulated Item Response Theory (IRT) models as multilevel models in order to evaluate clustered data appropriately. Using a multilevel model to obtain item difficulty and person ability estimates that correspond directly to IRT model parameters is often referred to as multilevel measurement modeling. Unlike conventional IRT models, multilevel measurement models (MMMs) can accommodate predictor variables, model clustered data appropriately, and be estimated with non-specialized computer software, including SAS. For example, a three-level model can represent repeated measures (level one) of individuals (level two) who are clustered within schools (level three).
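As an illustration, one widely used two-level reformulation of the 1-PL (a Kamata-style model; the notation below is a sketch, not necessarily the exact parameterization used in this dissertation) expresses the log-odds of a correct response by person j on item i as:

```latex
% Combined level-1 (item) / level-2 (person) model:
% p_{ij} is the probability that person j answers item i correctly,
% b_i is the item-i difficulty, \theta_j the person-j ability.
\log\!\left(\frac{p_{ij}}{1 - p_{ij}}\right) = \theta_j - b_i,
\qquad \theta_j \sim N\!\left(0, \sigma^2_{\text{person}}\right)
% A three-level extension decomposes ability into school and person
% parts: \theta_{jk} = u_k + r_{jk}, with u_k the school-k random effect.
```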
Limits on the minimum sample size and number of test items that still permit reasonable recovery of one-parameter logistic (1-PL) IRT model parameters have not been examined for either the two- or three-level MMM. Researchers (Wright and Stone, 1979; Lord, 1983; Hambleton and Cook, 1983) have found that sample sizes under 200 and fewer than 20 items per test result in poor model fit and poor parameter recovery for dichotomous 1-PL IRT models, even with data that meet model assumptions.
This simulation study tested the performance of the two- and three-level MMMs under conditions that crossed three sample sizes (100, 200, and 400), three test lengths (5, 10, and 20), three level-3 cluster sizes (10, 20, and 50), and two generated intraclass correlations (.05 and .15).
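A minimal sketch of generating one cell of such a design (the function name and the variance split are illustrative assumptions, not the study's actual generating code): school-level and person-level normal effects sum to a total ability variance of 1.0, with the school share set to the target intraclass correlation.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_1pl_clustered(n_schools, persons_per_school, n_items,
                           icc, difficulties, rng):
    """Simulate dichotomous 1-PL (Rasch) responses for persons nested in
    schools. Total ability variance is fixed at 1.0 and split into a
    school component (icc) and a person component (1 - icc)."""
    school_effects = rng.normal(0.0, np.sqrt(icc), size=n_schools)
    # Person ability = school effect + person-level deviation.
    theta = (np.repeat(school_effects, persons_per_school)
             + rng.normal(0.0, np.sqrt(1.0 - icc),
                          size=n_schools * persons_per_school))
    # 1-PL: P(correct) = logistic(theta - b), one column per item.
    logits = theta[:, None] - difficulties[None, :]
    p = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(p.shape) < p).astype(int)

# One hypothetical cell: 20 schools of 20 persons, 20 items, ICC = .05.
b = np.linspace(-2, 2, 20)            # evenly spaced item difficulties
responses = simulate_1pl_clustered(20, 20, 20, 0.05, b, rng)
print(responses.shape)                # 400 persons x 20 items
```

In a full replication, each simulated data set would then be fit with the two- and three-level MMMs and bias computed against the generating parameters.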
The study demonstrated that the two- and three-level MMMs lead to somewhat divergent results for item difficulty and person-level ability estimates. Mean relative item difficulty bias was lower for the three-level model than for the two-level model. The opposite was true for person-level ability estimates, with smaller mean relative parameter bias for the two-level model than for the three-level model. There was no difference between the two- and three-level MMMs in the school-level ability estimates. Modeling clustered data appropriately, using a minimum total sample size of 100 to estimate level-2 residuals accurately and a minimum total sample size of 400 to estimate level-3 residuals accurately, and administering at least 20 items will help ensure valid statistical test results.
2.
Using Hierarchical Generalized Linear Modeling for Detection of Differential Item Functioning in a Polytomous Item Response Theory Framework: An Evaluation and Comparison with Generalized Mantel-Haenszel
Ryan, Cari Helena (16 May 2008)
In the field of education, decisions are influenced by the results of various high-stakes measures. Investigating the presence of differential item functioning (DIF) in a set of items helps ensure that results from these measures are valid. For example, if an item measuring math self-efficacy is identified as having DIF, this indicates that some characteristic other than the latent trait of interest (e.g., gender) may be affecting an examinee's score on that particular item. Hierarchical generalized linear modeling (HGLM) enables the modeling of items nested within examinees, with person-level predictors added at level 2 for DIF detection. Unlike traditional DIF detection methods that require a reference and a focal group, HGLM allows the modeling of a continuous person-level predictor: instead of dichotomizing a continuous variable associated with DIF into focal and reference groups, the continuous variable can be added at level 2. Further benefits of HGLM are discussed in this study. This study extends work by Williams and Beretvas (2006), which illustrated the use of HGLM with polytomous items (PHGLM) for DIF detection. In the Williams and Beretvas study, the PHGLM was compared with the generalized Mantel-Haenszel (GMH) for DIF detection, and the two were found to perform similarly. A Monte Carlo simulation study was conducted to evaluate HGLM's power to detect DIF and its associated Type I error rates, using the constrained form of Muraki's Rating Scale Model (Muraki, 1990) as the generating model. The two methods were compared when DIF was associated with a continuous variable that was dichotomized for the GMH and used as a continuous person-level predictor with the PHGLM. Of additional interest was the comparison of HGLM's performance with that of the GMH under a variety of DIF and sample size conditions.
Results showed that sample size, sample size ratio and DIF magnitude substantially influenced the power performance for both GMH and HGLM. Furthermore, the power performance associated with the GMH was comparable to HGLM for conditions with large sample sizes. The mean performance for both DIF detection methods showed good Type I error control.
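The key idea, entering the continuous variable directly as a predictor rather than dichotomizing it, can be sketched with a simplified logistic-regression analogue (this is not the GMH or the PHGLM itself; the variable names, the generated DIF effect of 0.6, and the Newton-Raphson fitter are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)

def fit_logistic(X, y, n_iter=50):
    """Plain Newton-Raphson maximum-likelihood logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)                    # IRLS weights
        grad = X.T @ (y - p)
        hess = X.T @ (X * W[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta

# Simulated example: ability theta, continuous covariate z, and one
# studied item whose log-odds shift with z (uniform DIF of size 0.6).
n = 2000
theta = rng.normal(size=n)
z = rng.normal(size=n)
y = (rng.random(n)
     < 1.0 / (1.0 + np.exp(-(theta + 0.6 * z)))).astype(int)

# Predictors: intercept, ability proxy (theta here; in practice a
# matching or rest score), and the continuous covariate whose
# coefficient flags uniform DIF if it differs from zero.
X = np.column_stack([np.ones(n), theta, z])
beta = fit_logistic(X, y)
print(beta)   # beta[2] should sit near the generating DIF effect
```

A dichotomized analysis would instead split examinees at a cut point on z into focal and reference groups, discarding the within-group variation that the continuous predictor retains.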
3.
Curriculum Track and Its Influences on Predicting High School Dropout Likelihood
Mohd Kamalludeen, Rosemaliza (08 August 2012)
Dropping out of school is a major concern because high school graduation credentials serve as an important measure of post-secondary success. Numerous researchers have presented a multitude of factors that predict dropout at the individual and school levels. Curriculum track choice, or high school course-taking sequence, defines students' schooling careers and ultimately the post-secondary paths they choose (Plank, DeLuca, & Estacion, 2008). Scholars have debated various dropout-related outcomes associated with different curriculum choices, namely academic, career and technical education (CTE), dual enrollment, and general curricula. Several have argued that students following academic tracks are more likely to graduate; others claim that CTE benefits at-risk students and suppresses dropout likelihood (Rumberger & Sun, 2008). New vocationalism, or dual enrollment, has proven successful at reducing dropout rates.
This study investigated the influence of curriculum track and CTE program area on dropout likelihood while controlling for possible individual differences. Analysis was conducted via hierarchical generalized linear modeling (HGLM) because of the nested data structure of the Education Longitudinal Study of 2002 (ELS). Variables included academic background, academic and career aspirations, school-sponsored activity participation, school minority composition, school average student socio-economic status (SES), school type (private or public), school urbanicity, CTE courses offered at the school, and demographic indicators (gender, race, and SES). Findings reflect a higher dropout likelihood among general-curriculum participants than among academic and occupational concentrators after controlling for these individual differences. Dual concentrators had a 0% dropout rate, so comparison with other curriculum tracks was not possible via HGLM analysis. Results suggest the substantial importance of academic background, post-secondary education plans, and school-sponsored activity participation in predicting dropout likelihood.
Comparing CTE program areas, participants in Family and Consumer Sciences, Human Services, Public Services, Health, and Education (the Human Services area) were more likely to drop out than participants in other program areas, while Technology Education participants were less likely to drop out than Human Services participants and those concentrating in two or more CTE program areas. Results suggest ninth-grade overall GPA and school-sponsored activity participation as substantial predictors of dropout likelihood among occupational concentrators. Variability across schools was not significant.
4.
Identification and Estimation of Location and Dispersion Effects in Unreplicated 2^(k-p) Designs Using Generalized Linear Models
Sabangan, Rainier Monteclaro (14 July 2010)
No description available.