1. Assessing first- and second-order equity for the common-item nonequivalent groups design using multidimensional IRT
Andrews, Benjamin James, 01 July 2011
The equity properties can be used to assess the quality of an equating. The degree to which expected scores conditional on ability are similar between test forms is referred to as first-order equity. Second-order equity is the degree to which conditional standard errors of measurement are similar between test forms after equating. The purpose of this dissertation was to investigate the use of a multidimensional IRT framework for assessing first- and second-order equity of mixed format tests.
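For reference, the two properties can be stated formally. The following is a standard formulation, with eq_Y(X) denoting the form-X score after equating to the form-Y scale; the notation is illustrative and not taken from the dissertation itself:

```latex
\underbrace{E\bigl[\mathrm{eq}_Y(X)\mid\theta\bigr]
    = E\bigl[Y\mid\theta\bigr]}_{\text{first-order equity}}
\qquad
\underbrace{\mathrm{SEM}\bigl[\mathrm{eq}_Y(X)\mid\theta\bigr]
    = \mathrm{SEM}\bigl[Y\mid\theta\bigr]}_{\text{second-order equity}}
\qquad \text{for all } \theta
```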
Both real and simulated data were used to assess the equity properties of mixed-format tests. Using real data from three Advanced Placement (AP) exams, five equating methods were compared on how well they preserved first- and second-order equity: frequency estimation, chained equipercentile, unidimensional IRT true score, unidimensional IRT observed score, and multidimensional IRT observed score equating. Both a unidimensional and a multidimensional IRT framework were used to assess the equity properties. Two simulation studies were also conducted. The first investigated the accuracy of expected scores and conditional standard errors of measurement under each framework as tests became increasingly multidimensional. In the second, the five equating methods were compared on their ability to preserve first- and second-order equity as tests became more multidimensional and as group ability differences increased.
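To make the traditional methods concrete, the sketch below implements the basic equipercentile mapping that underlies the chained equipercentile method: each form-X score is sent to the form-Y score with the same percentile rank. It is a toy illustration on invented data, not the dissertation's implementation; presmoothing and the chaining through the common-item (anchor) scores are omitted.

```python
import numpy as np

def percentile_ranks(scores, max_score):
    # Relative frequency of each integer score 0..max_score, then the
    # usual midpoint convention: PR(x) = 100 * (F(x) - f(x)/2).
    f = np.bincount(scores, minlength=max_score + 1) / len(scores)
    F = np.cumsum(f)
    return 100 * (F - f / 2)

def equipercentile(x_scores, y_scores, max_score):
    # Form-Y equivalent of each form-X integer score, by matching
    # percentile ranks (linear interpolation between integer Y scores).
    pr_x = percentile_ranks(x_scores, max_score)
    pr_y = percentile_ranks(y_scores, max_score)
    return np.interp(pr_x, pr_y, np.arange(max_score + 1))

rng = np.random.default_rng(0)
x = rng.binomial(40, 0.55, size=2000)  # toy form-X raw scores
y = rng.binomial(40, 0.60, size=2000)  # toy form-Y raw scores (easier form)
print(np.round(equipercentile(x, y, 40), 2))
```

In practice the score distributions would be presmoothed (e.g., with log-linear presmoothing) before the mapping is applied.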
Results from the real data analyses indicated that how well the equating methods preserved first- and second-order equity depended on which framework was used to assess equity and on which test was examined: some tests showed similar preservation of equity under both frameworks, while others differed greatly. In the first simulation study, when the correlation between abilities was high, estimates of expected scores had lower mean squared error under the unidimensional framework than under the multidimensional framework; for conditional standard errors of measurement, the multidimensional framework had lower mean squared error whenever the correlation between abilities was below .95. In the second simulation study, chained equating preserved first-order equity better than frequency estimation, whereas frequency estimation preserved second-order equity better than the chained method. As tests became more multidimensional or as group differences increased, the multidimensional IRT observed score equating method tended to outperform the other methods.
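The quantities being compared under a unidimensional framework can be computed directly from the item response functions. A minimal sketch for dichotomous 2PL items follows (the dissertation's mixed-format forms would also sum category expectations for polytomous items); all parameter values are hypothetical:

```python
import numpy as np

def p_2pl(theta, a, b):
    # 2PL probability of a correct response at ability theta
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def expected_score_and_csem(theta, a, b):
    # Conditional mean and SEM of the raw score given theta,
    # assuming local independence of the items.
    p = p_2pl(theta, a, b)
    return p.sum(), np.sqrt((p * (1.0 - p)).sum())

a = np.array([1.0, 1.4, 0.8, 1.1])   # hypothetical discriminations
b = np.array([-0.5, 0.0, 0.3, 1.0])  # hypothetical difficulties
e, s = expected_score_and_csem(0.0, a, b)
print(f"E[X | theta=0] = {e:.3f}, CSEM = {s:.3f}")
```

First-order equity compares E[X | theta] across forms after equating; second-order equity makes the same comparison for the CSEM.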

2. A New Item Response Theory Model for Estimating Person Ability and Item Parameters for Multidimensional Rank Order Responses
Seybert, Jacob, 01 January 2013
The assessment of noncognitive constructs poses a number of challenges that set it apart from traditional cognitive ability measurement. Of particular concern are response biases and response styles that can compromise the accuracy of scale scores. One strategy to address these concerns is to use alternative item presentation formats, such as multidimensional forced-choice (MFC) pairs, triads, and tetrads, that may provide resistance to such biases. A variety of strategies for constructing and scoring these forced-choice measures have been proposed, though they often require large sample sizes, are limited in how statements can vary in location, and (in some cases) require a separate precalibration phase prior to the scoring of forced-choice responses. This dissertation introduces new item response theory models for estimating item and person parameters from rank-order responses indicating preferences among two or more alternatives representing, for example, different personality dimensions. Parameters for this new model, called the Hyperbolic Cosine Model for Rank order responses (HCM-RANK), can be estimated using Markov chain Monte Carlo (MCMC) methods that allow for the simultaneous estimation of item parameters and person scores. The efficacy of the MCMC estimation procedures for these new models was examined in three studies. Study 1 was a Monte Carlo simulation examining parameter recovery across levels of sample size, dimensionality, and approaches to item calibration and scoring. Estimation accuracy improved with sample size, and trait scores and location parameters could be estimated reasonably well even in small samples. Study 2 was a simulation examining the robustness of trait estimation to error introduced by substituting subject matter expert (SME) estimates of statement location for MCMC item parameter estimates and true item parameters. Only small decreases in accuracy relative to the true parameters were observed, suggesting that using SME ratings of statement location for scoring might be a viable short-term way of expediting MFC test deployment in field settings. Study 3 was included primarily to illustrate the use of the newly developed IRT models and estimation methods with real data. An empirical investigation comparing validities of personality measures using different item formats yielded mixed results and raised questions about multidimensional test construction practices that will be explored in future research. The presentation concludes with a discussion of MFC methods and potential applications in educational and workforce contexts.
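The MCMC machinery the abstract refers to can be illustrated with a minimal random-walk Metropolis sampler for a single person's trait. The pairwise normal-ogive term below is a stand-in likelihood chosen for brevity; it is not the HCM-RANK likelihood, which the abstract does not spell out, and all response data are invented:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

def log_lik(theta, responses):
    # Each response is (pair location difference d, outcome y); a
    # normal-ogive term stands in for the HCM-RANK likelihood.
    ll = 0.0
    for d, y in responses:
        p = norm.cdf(theta - d)  # prob. first statement is preferred
        ll += np.log(p) if y == 1 else np.log(1.0 - p)
    return ll

def metropolis_theta(responses, n_iter=2000, step=0.5):
    # Random-walk Metropolis for one trait score, standard-normal prior.
    theta = 0.0
    lp = log_lik(theta, responses) - 0.5 * theta**2
    draws = np.empty(n_iter)
    for t in range(n_iter):
        prop = theta + step * rng.standard_normal()
        lp_prop = log_lik(prop, responses) - 0.5 * prop**2
        if np.log(rng.uniform()) < lp_prop - lp:  # accept/reject
            theta, lp = prop, lp_prop
        draws[t] = theta
    return draws

responses = [(-0.5, 1), (0.2, 1), (0.8, 0), (-1.0, 1)]  # toy outcomes
draws = metropolis_theta(responses)
print("posterior mean of theta:", draws[500:].mean())   # drop burn-in
```

A full implementation would sample item locations jointly with the person traits, which is what allows the simultaneous calibration and scoring described above.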

3. Modeling Computational Thinking Using Multidimensional Item Response Theory: Investigation into Model Fit and Measurement Invariance
Brown, Emily A., 05 1900
Previous research on the measurement of computational thinking has been limited, particularly with respect to learning progressions in K-12. This study applies a multidimensional item response theory (IRT) model to a newly developed measure of computational thinking that uses both selected-response and open-ended polytomous items, with three aims: to establish the factorial structure of the construct, to apply the recently introduced composite and structured constructs models, and to investigate the measurement invariance of the assessment between males and females using the means and covariance structures (MACS) approach.
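The abstract does not specify the parameterization, but the compensatory multidimensional 2PL commonly used in such analyses takes the following form for the selected-response items (a multidimensional graded-response extension would handle the open-ended polytomous items):

```latex
P\bigl(x_{ij}=1 \mid \boldsymbol{\theta}_j\bigr)
  = \frac{1}{1 + \exp\bigl[-\bigl(\mathbf{a}_i^{\top}\boldsymbol{\theta}_j + d_i\bigr)\bigr]}
```

where \theta_j is examinee j's vector of computational thinking dimensions, a_i the item's discrimination vector, and d_i its intercept.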

4. Latent Variable Models of Categorical Responses in the Bayesian and Frequentist Frameworks
Farouni, Tarek, January 2014
No description available.

5. How Item Response Theory can solve problems of ipsative data
Brown, Anna, 25 October 2010
Multidimensional forced-choice questionnaires can reduce the impact of numerous response biases typically associated with Likert scales. However, if scored with traditional methodology these instruments produce ipsative data, which suffer from psychometric problems such as a constrained total test score and negative average scale intercorrelations. Ipsative scores distort scale relationships and reliability estimates, and make interpretation of scores problematic. This research demonstrates how Item Response Theory (IRT) modeling may be applied to overcome these problems. A multidimensional IRT model for forced-choice questionnaires is introduced, which is suitable for use with any forced-choice instrument composed of items fitting the dominance response model, with any number of measured traits and any block size (i.e., pairs, triplets, quads, etc.). The proposed model is based on Thurstone's framework for comparative data. Thurstonian IRT models are normal ogive models with structured factor loadings, structured uniquenesses, and structured local dependencies. These models can be estimated straightforwardly using the structural equation modeling (SEM) software Mplus. Simulation studies show how well the latent traits are recovered from the comparative binary data under different conditions. The Thurstonian IRT model is also tested with real participants in both research and occupational assessment settings. It is concluded that, when the recommended design guidelines are met, scores estimated from forced-choice questionnaires with the proposed methodology reproduce the latent traits well.
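A minimal data-generating sketch of the Thurstonian setup may help: each statement has a latent utility t = mu + lambda * eta + epsilon, and a ranking of a block is coded as the binary outcomes of its pairwise comparisons. All parameter values below are hypothetical, and the actual estimation is done in Mplus as described above:

```python
import numpy as np

rng = np.random.default_rng(7)

n_persons, n_traits = 1000, 3
Sigma = np.array([[1.0, 0.4, 0.2],
                  [0.4, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])      # hypothetical trait correlations
eta = rng.multivariate_normal(np.zeros(n_traits), Sigma, size=n_persons)

# One forced-choice triplet; statement s loads on trait trait_of[s].
mu       = np.array([0.2, -0.1, 0.0])   # statement means (hypothetical)
lam      = np.array([0.9,  0.8, 1.0])   # factor loadings (hypothetical)
psi      = np.array([0.6,  0.7, 0.5])   # uniqueness SDs (hypothetical)
trait_of = np.array([0, 1, 2])

# Latent utilities t = mu + lambda * eta + epsilon (Thurstone's model)
t = mu + lam * eta[:, trait_of] + psi * rng.standard_normal((n_persons, 3))

# The ranking of the triplet, coded as its three binary pair outcomes
pairs = [(0, 1), (0, 2), (1, 2)]
y = np.stack([(t[:, i] > t[:, k]).astype(int) for i, k in pairs], axis=1)
print("P(first statement in pair preferred):", y.mean(axis=0).round(3))
```

Fitting the Thurstonian IRT model then amounts to recovering the loadings, uniquenesses, and trait correlations from these binary outcomes.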