181. Modelling Conditional Dependence Between Response Time and Accuracy in Cognitive Diagnostic Models. Bezirhan, Ummugul. January 2021.
With novel data collection tools and diverse item types, computer-based assessments make it easy to obtain additional information about an examinee's response process, such as response time (RT) data. This information has been used to increase measurement precision for the latent ability in response accuracy models. Van der Linden's (2007) hierarchical speed-accuracy model has been widely used as a framework for jointly modelling RT and response accuracy. This joint modelling framework commonly imposes the strict assumption of conditional independence between responses and RTs given latent ability and speed. Recently, multiple studies (e.g., Bolsinova & Maris, 2016; Bolsinova, De Boeck, & Tijmstra, 2017a; Meng, Tao, & Chang, 2015) have found violations of the conditional independence assumption and proposed models that accommodate this violation by modelling the conditional dependence of responses and RTs within the framework of Item Response Theory (IRT). Despite the widespread use of Cognitive Diagnostic Models (CDMs) as formative assessment tools, conditional joint modelling of responses and RTs has not yet been explored in this framework. This research therefore proposes a conditional joint response and RT model for CDMs, with an extended reparametrized higher-order deterministic input, noisy 'and' gate (DINA) model for response accuracy. The conditional dependence is modelled by incorporating item-specific effects of residual RT (Bolsinova et al., 2017a) on the slope and intercept of the accuracy model. The effects of ignoring the conditional dependence on parameter recovery are explored in a simulation study, and an empirical data analysis demonstrates the application of the proposed model. Overall, modelling the conditional dependence, when applicable, increased the correct attribute classification rates and resulted in more accurate item response parameter estimates.
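The abstract does not give the model's exact parameterization, but a minimal sketch of how item-specific residual-RT effects can enter the intercept and slope of an accuracy model, in the spirit of Bolsinova et al. (2017a), is shown below; the notation and the logistic link are illustrative assumptions rather than the dissertation's reparametrized higher-order DINA specification.

```latex
% Illustrative only: residual RT shifting the intercept and slope of an item
% response function; not the dissertation's exact higher-order DINA model.
\[
  P(X_{ij} = 1 \mid \eta_{ij}, z_{ij})
  = \operatorname{logit}^{-1}\!\bigl(
      (\beta_{0j} + \delta_{0j} z_{ij})
      + (\beta_{1j} + \delta_{1j} z_{ij})\,\eta_{ij}
    \bigr),
\]
where $\eta_{ij}$ indicates whether examinee $i$ has mastered the attributes
that item $j$ requires (via the Q-matrix), $z_{ij}$ is the examinee's residual
response time on item $j$, and $\delta_{0j}, \delta_{1j}$ are the item-specific
conditional-dependence effects on the intercept and slope.
```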
182. Application of Item Response Tree (IRTree) Models on Testing Data: Comparing Its Performance with Binary and Polytomous Item Response Models. Wang, Yixi. 1 October 2020.
No description available.
183. Comparing Dichotomous and Polytomous Items Using Item Response Trees. Jenkins, Daniel. 2 September 2020.
No description available.
184. A Tool to Obtain and Analyse ENEM Data (Uma ferramenta para a obtenção e análise de dados do ENEM). Frias, Jorge Luiz Dias de. 28 October 2015.
This work has two objectives. The first is to make the data of students who took the 2012 Exame Nacional do Ensino Médio (ENEM), released by INEP through the ENEM 2012 microdata, accessible to school principals, coordinators, and teachers by means of a tool that allows them to explore and analyse the data, make decisions, and create strategies to improve the school's pedagogical plan. The second objective is to train people to obtain these data and develop their own tools using Microsoft Excel.
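As a rough illustration of the kind of filtering such a tool performs on the public microdata, here is a minimal Python/pandas sketch; the thesis itself builds the tool in Excel, and the file name, separator, encoding, and column names (e.g., CO_ESCOLA, NU_NOTA_MT) are assumptions based on typical ENEM microdata releases rather than the exact 2012 layout.

```python
# Sketch only: the thesis builds its tool in Excel; this pandas version just
# illustrates the same workflow. File and column names are assumed and may
# differ in the 2012 microdata release.
import pandas as pd

SCHOOL_CODE = 12345678          # hypothetical INEP school code
SCORE_COLS = ["NU_NOTA_CN", "NU_NOTA_CH", "NU_NOTA_LC",
              "NU_NOTA_MT", "NU_NOTA_REDACAO"]

# The microdata file is large; read only the needed columns, in chunks.
chunks = pd.read_csv("MICRODADOS_ENEM_2012.csv", sep=";", encoding="latin-1",
                     usecols=["CO_ESCOLA"] + SCORE_COLS, chunksize=200_000)
school = pd.concat(chunk[chunk["CO_ESCOLA"] == SCHOOL_CODE] for chunk in chunks)

# Summary statistics a coordinator might inspect for each exam area.
print(school[SCORE_COLS].describe().round(1))
```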
185. The Strength of Multidimensional Item Response Theory in Exploring Construct Space that is Multidimensional and Correlated. Spencer, Steven Gerry. 8 December 2004.
This dissertation compares the parameter estimates obtained from two item response theory (IRT) models: the 1-PL IRT model and the MC1-PL IRT model. Several scenarios were explored in which both unidimensional and multidimensional item-level and person-level data were used to generate the item responses. The Monte Carlo simulations mirrored the real-life application of two correlated dimensions, Necessary Operations and Calculations, in the basic mathematics domain. In all scenarios, the MC1-PL IRT model recovered the true underlying item difficulty values and person theta values more precisely, both along each primary dimension and along a second-order general factor. The fit statistics generally applied to the 1-PL IRT model were not sensitive to the multidimensional item-level structure, reinforcing the requisite assumption of unidimensionality when applying the 1-PL IRT model.
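A minimal sketch of the data-generating step behind such a Monte Carlo comparison is shown below; the correlation between dimensions, the simple-structure loadings, and the difficulty values are illustrative assumptions, not the dissertation's actual simulation design.

```python
# Sketch of generating dichotomous responses from a compensatory
# two-dimensional Rasch-type (MC1-PL-style) model with correlated abilities.
# All design values are illustrative, not those used in the dissertation.
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 1000, 30

# Correlated abilities for the two dimensions (e.g., Necessary Operations
# and Calculations); a correlation of 0.6 is assumed here.
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
theta = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n_persons)

# Simple structure: half the items load on dimension 1, half on dimension 2.
loadings = np.zeros((n_items, 2))
loadings[: n_items // 2, 0] = 1.0
loadings[n_items // 2 :, 1] = 1.0
difficulty = rng.uniform(-2.0, 2.0, size=n_items)

# Compensatory 1-PL response probabilities and simulated 0/1 responses.
logits = theta @ loadings.T - difficulty
prob = 1.0 / (1.0 + np.exp(-logits))
responses = rng.binomial(1, prob)   # n_persons x n_items matrix
```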
186. Evaluation of RELATE Using Rasch Analysis. Yoshida, Keitaro. 30 November 2010.
The importance of valid and reliable couple assessment has been increasing with growth in research on couple and family relationships as well as in therapeutic and educational interventions for couples and families. However, self-report instruments, the most popular type of couple assessment, have been criticized at least partly because of limitations in Classical Test Theory (CTT), which for decades has been the sole framework used to develop and evaluate couple assessments. To address the limitations of relying solely on CTT in developing self-report couple assessments, the present study applied a modern test theory, Item Response Theory (IRT), to evaluate the properties of the subscales of the RELATionship Evaluation (RELATE) using existing data from 4,784 participants. Using the Rasch rating scale or partial credit model (members of the IRT family), the author demonstrated that some RELATE subscales contained items and response categories that functioned suboptimally or in unexpected ways. The results suggested that some items misfit the model or overlapped with other items, that many scales did not cover the entire range of the measured construct, and that the response categories for many items malfunctioned. The author made recommendations on possible remedies that could be adopted to improve the functioning of individual scales and items.
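For reference, the partial credit model used in this kind of Rasch analysis gives the probability that person $n$ responds in category $k$ of item $i$ (with ordered categories $0, \dots, m_i$) as shown below; the rating scale model is the special case in which the item thresholds decompose into an item location plus category thresholds shared across items.

```latex
\[
  P(X_{ni} = k \mid \theta_n)
  = \frac{\exp\!\Bigl(\sum_{h=1}^{k} (\theta_n - \delta_{ih})\Bigr)}
         {\sum_{j=0}^{m_i} \exp\!\Bigl(\sum_{h=1}^{j} (\theta_n - \delta_{ih})\Bigr)},
  \qquad \sum_{h=1}^{0}(\cdot) \equiv 0,
\]
where $\theta_n$ is the person measure and $\delta_{ih}$ is the $h$-th threshold
of item $i$. Misfitting or disordered threshold estimates are what signal the
malfunctioning response categories described above.
```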
187. Multidimensional Item Response Theory in Clinical Measurement: A Bifactor Graded-Response Model Analysis of the Outcome Questionnaire-45.2. Berkeljon, Arjan. 22 May 2012.
Bifactor Item Response Theory (IRT) models are presented as a plausible structure for psychological measures with a primary scale and two or more subscales. A bifactor graded response model, appropriate for polytomous categorical data, was fit to two university counseling center datasets (N = 4,679 and N = 4,500) of Outcome Questionnaire-45.2 (OQ) psychotherapy intake data. The bifactor model showed superior fit compared with a unidimensional IRT model. Item parameters derived from the bifactor model show that items discriminate well on the primary scale, and items on the OQ's subscales retain some discriminating power over and above the primary scale. However, reliability estimates for the subscales, controlling for the primary scale, suggest that clinical use of the subscales should proceed with caution. Item difficulty (severity) parameters reflected item content well: items tapping severe symptomatology showed increased probability of endorsement only at high levels of distress, whereas items tapping milder symptomatology were endorsed at lower levels of distress. An analysis of measurement invariance showed that item parameters hold equally across gender for most OQ items, although a subset of items had parameters that were non-invariant across gender. Implications for research and practice are discussed, and directions for future work are given.
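In slope-intercept form, a bifactor graded response model for item $j$ can be written as below; the loading structure (one general distress factor plus a subscale-specific factor per item) follows the abstract, while the notation itself is illustrative.

```latex
\[
  P(X_{ij} \ge k \mid \theta_G, \theta_{s(j)})
  = \frac{1}{1 + \exp\!\bigl[-\bigl(a_{jG}\,\theta_G
        + a_{js}\,\theta_{s(j)} + d_{jk}\bigr)\bigr]},
  \qquad
  P(X_{ij} = k) = P(X_{ij} \ge k) - P(X_{ij} \ge k + 1),
\]
where $\theta_G$ is the general (primary) distress factor, $\theta_{s(j)}$ is the
subscale-specific factor for item $j$, $a_{jG}$ and $a_{js}$ are the corresponding
discriminations, and the $d_{jk}$ are ordered category intercepts (severity
parameters).
```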
188. The Use of Item Response Theory in Developing a Phonics Diagnostic Inventory. Pirani-McGurl, Cynthia A. 1 May 2009.
This study investigated the reliability of the Phonics Diagnostic Inventory (PDI), a curriculum-based, specific-skill mastery measurement tool for diagnosing and informing the treatment of decoding weaknesses. First, a modified one-parameter item response theory model was employed to identify the properties of potential items for each subtest, which then informed the construction of subtests from the most reliable items. Second, the properties of each subtest were estimated and examined, and the test information functions and test characteristic curves (TCCs) for the newly developed forms are reported. Finally, the accuracy and sensitivity of the PDI cut scores for each subtest were examined; specifically, based on the established cut scores, the study investigated how accurately students would be identified as in need of support or not in need of support. The PDI generated from this research was found to diagnose specific decoding deficits in mid-year second-grade students more reliably than the initially constructed forms did. The results also indicate that further examination of the cut scores is warranted to maximize decision consistency. Implications for future studies are also discussed.
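The abstract's modified one-parameter model is not specified, but the following sketch shows how a test characteristic curve and test information function are computed under a standard one-parameter (Rasch) model; the item difficulties and the cut score are made up for illustration.

```python
# Sketch: test characteristic curve (TCC) and test information under a
# standard 1-PL (Rasch) model; the dissertation's modified model may differ.
import numpy as np

item_difficulty = np.array([-1.5, -0.8, -0.2, 0.4, 1.1])  # illustrative values
theta = np.linspace(-4, 4, 161)                            # ability grid

# Item response probabilities: persons (rows) by items (columns).
prob = 1.0 / (1.0 + np.exp(-(theta[:, None] - item_difficulty[None, :])))

tcc = prob.sum(axis=1)                        # expected raw score at each theta
test_info = (prob * (1 - prob)).sum(axis=1)   # 1-PL item information, summed

# A raw cut score can be mapped to the theta scale via the TCC, e.g. the
# smallest theta whose expected score reaches a cut of 3 correct:
cut_theta = theta[np.searchsorted(tcc, 3.0)]
print(round(float(cut_theta), 2))
```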
189. An Assessment of the Nonparametric Approach for Evaluating the Fit of Item Response Models. Liang, Tie. 1 February 2010.
As item response theory (IRT) has developed and become widely applied, investigating the fit of a parametric model has become an important part of the measurement process when implementing IRT. The usefulness and success of IRT applications rely heavily on the extent to which the model reflects the data, so it is necessary to evaluate model-data fit by gathering sufficient evidence before any model application. Promising solutions for detecting model misfit in IRT are lacking, and commonly used fit statistics are unsatisfactory in that they often lack desirable statistical properties and offer no means of examining the magnitude of misfit (e.g., via graphical inspection). In this dissertation, a newly proposed nonparametric approach, RISE, was thoroughly and comprehensively studied. Specifically, the purposes of this study were to (a) examine the promising fit procedure RISE, (b) compare the statistical properties of RISE with those of commonly used goodness-of-fit procedures, and (c) investigate how RISE may be used to examine the consequences of model misfit. To reach these goals, both a simulation study and an empirical study were conducted. In the simulation study, four factors that may influence the performance of a fit statistic were varied: ability distribution, sample size, test length, and model. The results demonstrated that RISE outperformed G2 and S-X2 in that it controlled Type I error rates and provided adequate power under all conditions. In the empirical study, the three fit statistics were applied to one empirical dataset and the misfitting items were flagged. RISE and S-X2 detected reasonable numbers of misfitting items, while G2 flagged almost all items when the sample size was large. To further demonstrate an advantage of RISE, the residual plot for each misfitting item was examined; compared with G2 and S-X2, RISE gave a much clearer picture of the location and magnitude of misfit for each misfitting item. Beyond statistical properties and graphical displays, the score distribution and test characteristic curve (TCC) were investigated as consequences of model misfit. The results indicated that, for the given data, there was no practical consequence for classification before and after replacement of the misfitting items detected by the three fit statistics.
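The abstract does not reproduce RISE's definition, but its core idea, contrasting a nonparametric (kernel-smoothed) item characteristic curve with the parametric one and summarizing the squared discrepancy, can be sketched as follows; the 2-PL curve, bandwidth, grid, and unweighted averaging are simplifying assumptions, not the statistic's actual definition.

```python
# Simplified illustration of a RISE-style nonparametric fit check for one item:
# contrast a kernel-smoothed empirical ICC with the model-implied ICC.
# This is NOT the exact RISE statistic; smoothing and weighting are assumptions.
import numpy as np

def parametric_icc(theta, a, b):
    """2-PL item characteristic curve."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def smoothed_icc(theta_hat, responses, grid, bandwidth=0.3):
    """Nadaraya-Watson kernel regression of item responses on ability."""
    w = np.exp(-0.5 * ((grid[:, None] - theta_hat[None, :]) / bandwidth) ** 2)
    return (w * responses[None, :]).sum(axis=1) / w.sum(axis=1)

def rise_like_index(theta_hat, responses, a, b, grid=np.linspace(-3, 3, 61)):
    """Root mean squared difference between the two curves over a theta grid."""
    diff = smoothed_icc(theta_hat, responses, grid) - parametric_icc(grid, a, b)
    return float(np.sqrt(np.mean(diff ** 2)))

# Tiny demo with simulated data for a single well-fitting item.
rng = np.random.default_rng(1)
theta_hat = rng.normal(size=2000)
resp = rng.binomial(1, parametric_icc(theta_hat, a=1.2, b=0.0))
print(round(rise_like_index(theta_hat, resp, a=1.2, b=0.0), 3))
```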
190. An Evaluation of DIF Tests in Multistage Tests for Continuous Covariates. Debelak, Rudolf; Debeer, Dries. 22 January 2024.
Multistage tests are a widely used and efficient type of test presentation that aims to provide accurate ability estimates while keeping the test relatively short. They typically rely on the psychometric framework of item response theory. Violations of item response models and of other assumptions underlying a multistage test, such as differential item functioning (DIF), can lead to inaccurate ability estimates and unfair measurements, so there is a practical need for methods that detect such model violations. This study compares and evaluates three methods for detecting DIF with regard to continuous person covariates in data from multistage tests: a linear logistic regression test and two adaptations of a recently proposed score-based DIF test. While all tests show a satisfactory Type I error rate, the score-based tests show greater power against three types of DIF effects.
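A compact sketch of a logistic regression DIF test for a single item and a continuous covariate, outside the multistage routing context, might look like the following; the simulated data, the use of an ability proxy as the matching variable, and the two-degree-of-freedom likelihood ratio test are assumptions for illustration rather than the study's exact procedure.

```python
# Sketch of a logistic regression DIF test for one item with a continuous
# person covariate: compare a model with covariate main effect and
# covariate-by-ability interaction against a baseline with ability only.
# Illustrative only; the multistage-test setting adds routing issues ignored here.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
n = 2000
ability_proxy = rng.normal(size=n)     # e.g., rest score or ability estimate
covariate = rng.normal(size=n)         # continuous covariate such as age

# Simulate an item with uniform DIF: the intercept drifts with the covariate.
logit = 1.0 * ability_proxy - 0.2 + 0.4 * covariate
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X0 = sm.add_constant(np.column_stack([ability_proxy]))
X1 = sm.add_constant(np.column_stack([ability_proxy, covariate,
                                      ability_proxy * covariate]))
fit0 = sm.Logit(y, X0).fit(disp=False)
fit1 = sm.Logit(y, X1).fit(disp=False)

# Likelihood ratio test with 2 df (uniform + non-uniform DIF terms).
lr = 2 * (fit1.llf - fit0.llf)
p_value = stats.chi2.sf(lr, df=2)
print(f"LR = {lr:.2f}, p = {p_value:.4f}")
```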