1 |
IRT Software: Überblick und Anwendungen [IRT Software: Overview and Applications]
Maier, Marco J., Hatzinger, Reinhold 10 1900
This publication was produced as part of the seminar Psychometric Methods, a course held every semester at the Institute for Statistics and Mathematics of the Wirtschaftsuniversität Wien with changing thematic focuses. In the winter semester 2009/2010 the focus was on the application of item response software. A large number of programs are available for applying psychometric methods, each offering different procedures and models. The seminar was essentially about gaining an overview of the available software and working out the strengths and weaknesses of the individual programs. In addition, participants were to be enabled to apply various psychometric models in practice to different problems. During the seminar, each group of participants presented one particular program. On the one hand, the theoretical background and models were worked through; on the other, the programs were demonstrated in live presentations of data analyses. This gave everyone involved an insight into which models are implemented in the different software packages, how they can be applied and interpreted, and how to work with them in practice. To make the experience gained available to others, we have collected and published the group contributions. Each chapter is intended to bridge the theoretical-technical and the application-oriented, practical aspects of the individual programs. The selection of the software packages presented was also important to us; it ranges from established and widely used programs (e.g. BILOG or MULTILOG) to rather rarely used ones (e.g. GGUM or ScoRight).
Without claiming completeness, we hope that this book gives an insight into the most important software packages; we placed great value on an understandable explanation of the theoretical background and on application examples that are as interesting as possible. Our aim was to provide interested users with a small 'map' through the jungle of available IRT software that encourages further study.
Our thanks go to the participants of the seminar, who wrote and revised their contributions with great commitment and perseverance (for quite a few, this article was their first encounter with LaTeX), so that this work could come into being. (author's abstract) / Series: Research Report Series / Department of Statistics and Mathematics
|
2 |
Marginal Bayesian parameter estimation in the multidimensional generalized graded unfolding model
Thompson, Vanessa Marie 08 June 2015
The Multidimensional Generalized Graded Unfolding Model (MGGUM) is a proximity-based, noncompensatory item response theory (IRT) model with applications in attitude, personality, and preference measurement. Model development used fully Bayesian Markov Chain Monte Carlo (MCMC) parameter estimation (Roberts, Jun, Thompson, & Shim, 2009a; Roberts & Shim, 2010). Challenges can arise when estimating MGGUM parameters with MCMC: the meaning of dimensions may switch during the estimation process, and difficulties in obtaining informative starting values may lead to increased identification of local maxima. Furthermore, researchers must contend with lengthy computer processing time. It has been shown that alternative estimation methods perform just as well as, if not better than, MCMC in the unidimensional Generalized Graded Unfolding Model (GGUM; Roberts & Thompson, 2011), with marginal maximum a posteriori (MMAP) item parameter estimation paired with expected a posteriori (EAP) person parameter estimation being a viable alternative. This work implements MMAP/EAP parameter estimation in the multidimensional model. Additionally, item location initial values are derived from detrended correspondence analysis (DCA), based on a previous implementation of correspondence analysis in the GGUM (Polak, 2011). A parameter recovery study demonstrates the accuracy of two-dimensional MGGUM MMAP/EAP parameter estimates, and a comparative analysis of MMAP/EAP and MCMC demonstrates equal accuracy, yet much improved efficiency, of the former method. An analysis of real attitude measurement data also provides an illustrative application of the model.
|
4 |
Tackling measurement issues in health predictors and outcomes using item response theory
Jackson, Jeanette January 2008
The Functional Limitation Profile (FLP), the Hospital Anxiety and Depression Scale (HADS) and the Recovery Locus of Control scale (RLOC) are three well-established and useful measures used in Health Psychology. However, the reliable and valid measurement of these health predictors and outcomes has associated problems. The present thesis tackles measurement issues in all three instruments using item response theory (IRT). The Scientific Advisory Committee of the Medical Outcomes Trust has suggested that the methodological and theoretical rationale for the conceptual and measurement model of available measurement instruments should be reported. The introduction chapter provides theoretical background in order to understand activity limitations and participation restrictions as behaviours affected by a certain health condition, as well as by thoughts and feelings. Within this theoretical framework, the present thesis investigates the measurement of mood using the HADS and functional limitations using the FLP in three different health conditions: (1) stroke patients, (2) patients with myocardial infarction, and (3) patients who underwent joint replacement surgery. The measurement of perceived personal control beliefs using the RLOC scale, and the relationship between control cognitions, mood and functional limitations, were examined in stroke patients, since all three measures were available for secondary analysis in this sample. The main findings are that (1) highly sensitive FLP items precisely measure different levels of disability and handicap, (2) removing 2 HADS items results in precise measurements of different levels of anxiety and depression, and (3) internal but not external perceived personal control beliefs sensitively measured different levels of the underlying construct.
|
5 |
Nonparametric estimation of item response functions using the EM algorithm
Rossi, Natasha T. January 2001
Bock and Aitkin (1981) developed an EM algorithm for the maximum marginal likelihood estimation of parametric item response curves, such that these estimates could be obtained in the absence of the estimation of examinee parameters. Using functional data analytic techniques described by Ramsay and Silverman (1997), this algorithm is extended to achieve nonparametric estimates of item response functions. Unlike their parametric counterparts, nonparametric functions have the freedom to adopt any possible shape, making the current approach an attractive alternative to the popular three-parameter logistic model. A basis function expansion is described for the item response functions, as is a roughness penalty which mediates a compromise between the fit of the data and the smoothness of the estimate. The algorithm is developed and applied to both actual and simulated data to illustrate its performance, and how the nonparametric estimates compare to results obtained through more classical methods.
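For orientation, the three-parameter logistic model named above as the parametric point of comparison can be sketched as follows. This is a minimal illustration and not code from the thesis; the function name `three_pl` and the parameter names `a` (discrimination), `b` (difficulty) and `c` (lower asymptote) are conventional choices assumed here, and the optional scaling constant some authors include is omitted.

```python
import math


def three_pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model.

    theta: examinee ability
    a: item discrimination (assumed name)
    b: item difficulty (assumed name)
    c: lower asymptote, often read as a guessing parameter (assumed name)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # The curve rises monotonically from c towards 1 as ability increases.
    for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(theta, round(three_pl(theta, a=1.0, b=0.0, c=0.2), 4))
```

The parametric curve is restricted to this S-shape, which is the constraint a nonparametric estimate is free to relax.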
|
6 |
The use of item response theory to assess adults' postdiction accuracy
Cummings, Andrea M. January 2006
Thesis (Ph. D.)--Georgia State University, 2006. / Karen M. Zabrucky, committee chair; Laura D. Fredrick, John H. Neel, Dennis N. Thompson, committee members. Electronic text (142 p.) : digital, PDF file. Description based on contents viewed July 16, 2007. Includes bibliographical references (p. 129-135).
|
7 |
Robustness redressed : an exploratory study on the relationships among overall assumption violation, model-data-fit, and invariance properties for item response theory models
Liu, Xiufeng 11 1900
This study compares item and examinee properties, studies the robustness of IRT models, and examines the difference in robustness when using model-data-fit as a robustness criterion. A conceptualization of robustness as a statistical relationship between model assumption violation and invariance properties has been created in this study, based on the current understanding of IRT models. Using real data from British Columbia Science Assessments, a series of regression and canonical analyses were conducted. Scatterplots were used to study possible non-linear relationships. The means and standard deviations of "a" and "c" parameter estimates obtained by applying the three-parameter model to a data sample were used as indices of equal discrimination and non-guessing assumption violation for the Rasch model. The assumption of local independence was taken as being equivalent to the assumption of unidimensionality, and Humphreys' pattern index "p" was used to assess the degree of unidimensionality assumption violation. Means and standard deviations of Yen's Qᵢ were used to assess the model-data-fit of items at the total test level. Another statistic to assess the model-data-fit of examinees (Dᵢ) was created and validated in this study. The mean and standard deviation of Dᵢ were used to assess model-data-fit of examinees at the total test level. The statistics used in this study for assessing item and ability parameter estimate invariance properties were correlations between estimates obtained from a sample and the estimates obtained from an assessment data file. It was found that model-data-fit of items and model-data-fit of examinees are two statistically independent total test properties of model-data-fit. Therefore, there is a necessity in practice to differentiate model-data-fit of items and model-data-fit of examinees.
It was also found that item estimate invariance and ability estimate invariance are statistically independent total test properties of invariance. Therefore, there is also a necessity in practice to differentiate item invariance and ability invariance. When invariance is used as a criterion for robustness, the three-parameter model is robust for all the combinations of sample size and test length. The Rasch model is not robust in terms of ability estimate invariance when a large sample size is combined with a moderate test length, or when a moderate sample size is combined with a long test length. Finally, no significant relationship between model-data-fit and invariance was found. Therefore, results of robustness studies obtained when model-data-fit is used as a criterion and the results when invariance is used as a criterion may be totally different, or even contradictory. Because invariance is the fundamental premise of IRT models, invariance properties rather than model-data-fit should be used as criteria for robustness. / Education, Faculty of / Curriculum and Pedagogy (EDCP), Department of / Graduate
|
8 |
Examinee control of item order effects on latent trait model and classical model test statistics
Scales, Michael J. January 1990
The purpose of this study was to determine what effect changes in the item order had on classical and on latent trait test statistics. As well, comparisons were made between students who were allowed to answer the questions in any order, and students who were required to answer the questions in the order presented in the test booklet. The results were then analyzed using the student's ability level as an additional independent factor.
Four different formats of a forty item mathematics test were used with 590 students in grade eight. Half of the booklets had the items sequenced from easiest to hardest. The other booklets were sequenced from hardest to easiest. In addition, half of the tests of each sequence had special directions which prevented students from altering the given item difficulty sequence. The classroom teachers provided a rating of each student's ability in mathematics.
The order of the items was found to have a significant effect. Tests which were sequenced from hard to easy had a lower mean score. Although students with test booklets with restrictive directions had lower scores on average, it was not a statistically significant difference. There were no significant interactions found. Classical and latent trait item difficulty statistics showed a high degree of correlation.
It was concluded that under certain circumstances, the order of the items could affect both classical and latent trait statistics. It was also recommended that care should be taken when assumptions are made about parallel forms or local independence. / Education, Faculty of / Educational and Counselling Psychology, and Special Education (ECPS), Department of / Graduate
|
10 |
Assessment of dimensionality in dichotomously-scored data using multidimensional scaling
Jones, Patricia Ann Blodgett. January 1987
The effectiveness of multidimensional scaling (MDS) techniques in recovering the underlying dimensionality of dichotomously-scored data was examined for unidimensional and multidimensional data. Thirty-three data sets of varying numbers of dimensions with differing patterns of item discrimination were generated using a multidimensional latent trait model in a Monte Carlo simulation study. Margin-sensitive measures (agreement, phi, and kappa) and margin-free measures (Φ/ Φ(max), Yule's Q, and the tetrachoric correlation) were used as measures of similarity and the resulting matrices were scaled in one through five dimensions. Values of the stress coefficient, S₁, S₁ by dimensionality plots, and plot configurations were examined to determine the dimensionality of the item set. Principal components analyses (PCAs) of phi and tetrachoric matrices were carried out as a basis for comparison. In addition, MDS and PCA were used to examine a data set comprised of items obtained from the routing tests of the Head Start Measures Battery. Two effects of item discrimination on MDS results were especially noteworthy. First, factors tended to be located equally distant from each other in the MDS space. Items were located closest to the factor for which the primary factor loading occurred. Second, as item discrimination decreased, items tended to be more widely dispersed from their appropriate locations in space. Extra dimensions in the MDS representational space were required for margin-sensitive coefficients to accommodate difficulty effects. Margin-free coefficients generally eliminated difficulty-related dimensions, although occasional problems were noted with the tetrachoric correlation. Analysis of the HSMB revealed that the data were primarily unidimensional, although specific effects due to each subtest were clearly present in the analysis. MDS was found to be a useful technique and its use in conjunction with PCA or factor analysis is recommended.
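To make the similarity measures concrete, the sketch below computes one margin-sensitive pair (agreement and phi) and one margin-free measure (Yule's Q) for two dichotomously scored items from their 2×2 table. It is an illustration under stated assumptions rather than the procedure used in the thesis: the function name `similarity_measures` and the example response vectors are hypothetical, and the code assumes a non-degenerate table (no zero denominators).

```python
from math import sqrt


def similarity_measures(x, y):
    """Agreement, phi and Yule's Q for two equally long 0/1 item-score vectors.

    Assumes every margin of the 2x2 table is non-zero and that a*d + b*c > 0;
    degenerate tables would raise ZeroDivisionError.
    """
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)  # both correct
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)  # only x correct
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)  # only y correct
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)  # both incorrect
    n = a + b + c + d
    agreement = (a + d) / n
    phi = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
    yules_q = (a * d - b * c) / (a * d + b * c)
    return {"agreement": agreement, "phi": phi, "yules_q": yules_q}


if __name__ == "__main__":
    # Hypothetical responses of eight examinees to two items.
    item_1 = [1, 1, 1, 0, 0, 0, 1, 0]
    item_2 = [1, 1, 0, 0, 0, 1, 1, 0]
    print(similarity_measures(item_1, item_2))
```

A matrix of such coefficients over all item pairs is the kind of similarity input that is then scaled in one or more dimensions.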
|