1 |
An empirical comparison of item response theory and classical test theory item/person statistics. Courville, Troy Gerard, 15 November 2004 (has links)
In the theory of measurement, there are two competing measurement frameworks: classical test theory and item response theory. The present study empirically examined, using large-scale norm-referenced data, how item and person statistics behaved under the two frameworks. The study focused on two central themes: (1) How comparable are the item and person statistics derived from the item response theory and classical test theory frameworks? (2) How invariant are the item statistics from each measurement framework across examinee samples? The findings indicate that, under a variety of conditions, the two measurement frameworks produce similar item and person statistics. Furthermore, although proponents of item response theory have centered their arguments for its use on the property of invariance, the classical test theory statistics, for this sample, were just as invariant.
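A minimal sketch of the kind of comparison this abstract describes, assuming a 2PL model and simulated data (the sample sizes and parameter ranges below are illustrative assumptions, not the study's):

    import numpy as np

    rng = np.random.default_rng(0)
    n_persons, n_items = 5000, 40
    theta = rng.normal(0, 1, n_persons)        # latent abilities
    a = rng.uniform(0.8, 2.0, n_items)         # 2PL discrimination parameters
    b = rng.normal(0, 1, n_items)              # 2PL difficulty parameters

    # Simulate 0/1 responses from the 2PL model
    p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
    x = (rng.random((n_persons, n_items)) < p).astype(int)

    # Classical test theory item statistics
    p_value = x.mean(axis=0)                   # CTT difficulty (proportion correct)
    total = x.sum(axis=1)
    r_pbis = np.array([np.corrcoef(x[:, j], total - x[:, j])[0, 1]
                       for j in range(n_items)])  # corrected item-total correlation

    # The empirical question: how closely do the two frameworks' statistics agree?
    print("corr(CTT p-value, IRT b):", np.corrcoef(p_value, b)[0, 1])   # strongly negative
    print("corr(CTT r_pbis,  IRT a):", np.corrcoef(r_pbis, a)[0, 1])    # positive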
|
2 |
A Comparison of Two Criterion-Referenced Item-Selection Techniques Utilizing Simulated Data with Item Pools that Vary in Degrees of Item Difficulty. Davis, Robbie G., 05 1900 (has links)
The problem of this study was to examine the equivalency of two different criterion-referenced item-selection techniques on simulated data as item pools varied in degree of item difficulty. A pretest-posttest design was employed in which pass-fail scores were randomly generated for item pools of twenty-five items. From each pool, the two techniques determined which items would make up a twelve-item criterion-referenced test. The twenty-five items were also rank-ordered according to the discrimination power assigned by the two techniques.
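The abstract does not name the two techniques, so as a hedged illustration only, the sketch below implements one classic criterion-referenced index of this kind, the pretest-posttest (Cox-Vargas) difference index, and uses it to pick twelve items from a twenty-five-item pool; all values are simulated assumptions:

    import numpy as np

    rng = np.random.default_rng(1)
    n_students, pool_size, test_length = 200, 25, 12

    # Simulated pass(1)/fail(0) scores before and after instruction
    pre = (rng.random((n_students, pool_size)) < 0.35).astype(int)
    post = (rng.random((n_students, pool_size)) < 0.70).astype(int)

    # Difference index: proportion passing each item on the posttest
    # minus the proportion passing it on the pretest
    diff_index = post.mean(axis=0) - pre.mean(axis=0)

    # Rank-order the pool by discrimination power and keep the top twelve
    ranked = np.argsort(diff_index)[::-1]
    print("selected items:", np.sort(ranked[:test_length]))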
|
3 |
Stratified item selection and exposure control in unidimensional adaptive testing in the presence of two-dimensional data. Kalinowski, Kevin E., 08 1900 (has links)
It is not uncommon to use unidimensional item response theory (IRT) models to estimate ability from multidimensional data. It is therefore important to understand the implications of summarizing multiple dimensions of ability into a single parameter estimate, especially if effects are confounded when applied to computerized adaptive testing (CAT). Previous studies have investigated the effects of different IRT models and ability estimators by manipulating the relationships between item and person parameters. In all cases, however, the maximum information criterion was used as the item selection method. Because maximum information is heavily influenced by the item discrimination parameter, investigating a-stratified item selection methods is tenable. The current Monte Carlo study compared maximum information, a-stratification, and a-stratification with b-blocking item selection methods, alone as well as in combination with the Sympson-Hetter exposure control strategy. The six testing conditions were crossed with three levels of interdimensional item difficulty correlations and four levels of interdimensional examinee ability correlations. Measures of fidelity, estimation bias, error, and item usage were used to evaluate the effectiveness of the methods. Results showed that either stratified item selection strategy is warranted if the goal is to obtain precise estimates of ability when using unidimensional CAT in the presence of two-dimensional data. If the goal also includes limiting bias of the estimate, Sympson-Hetter exposure control should be included. Results also confirmed that Sympson-Hetter is effective in optimizing item pool usage. Given these results, existing unidimensional CAT implementations might consider employing a stratified item selection routine plus Sympson-Hetter exposure control rather than recalibrating the item pool under a multidimensional model.
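A minimal sketch of a-stratified item selection with b-blocking plus Sympson-Hetter exposure control, assuming a 2PL pool; the pool size, stratum count, test length, and exposure parameters are illustrative assumptions, not the study's values:

    import numpy as np

    rng = np.random.default_rng(2)
    pool, n_strata, test_len = 300, 4, 20
    a = rng.uniform(0.5, 2.5, pool)        # discrimination parameters
    b = rng.normal(0, 1, pool)             # difficulty parameters
    k_sh = np.full(pool, 0.8)              # Sympson-Hetter exposure parameters (assumed)

    # b-blocking: sort items into blocks by difficulty, then deal each block's
    # items across strata in order of discrimination, so every stratum spans
    # the full difficulty range while discrimination rises across strata.
    strata = [[] for _ in range(n_strata)]
    for block in np.array_split(np.argsort(b), pool // n_strata):
        for s, j in enumerate(sorted(block, key=lambda j: a[j])):
            strata[s].append(j)

    def next_item(theta_hat, stage, administered):
        # Early test stages draw from the low-a stratum, later stages from high-a ones.
        stratum = strata[min(stage * n_strata // test_len, n_strata - 1)]
        candidates = sorted((j for j in stratum if j not in administered),
                            key=lambda j: abs(b[j] - theta_hat))
        for j in candidates:               # within-stratum rule: b closest to theta
            if rng.random() < k_sh[j]:     # Sympson-Hetter probabilistic filter
                return j
        return candidates[0]               # fall back if every candidate is filtered

    print("first item at theta = 0:", next_item(0.0, stage=0, administered=set()))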
|
4 |
Learning what to learn: The effects of task experience on strategy shifts in the allocation of study time. Ariel, Robert, 17 July 2012 (has links)
No description available.
|
5 |
Utilizing response time for item selection in on-the-fly multistage adaptive testing for PISA assessment. Xiuxiu Tang (18430326), 25 April 2024 (has links)
Multistage adaptive testing (MST) has become one of the most popular test designs for large-scale testing. However, it has some weaknesses, such as larger estimation bias compared to computerized adaptive testing (CAT). On-the-fly multistage adaptive testing (OMST) can balance the advantages and limitations of CAT and MST. Several CAT item selection methods that incorporate response time have been proposed, but incorporating response time into item selection for OMST has rarely been studied. This study explores the possibility of applying OMST with response time to the Programme for International Student Assessment (PISA) to address the issue of large estimation bias and improve test efficiency.
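The abstract does not specify how response time enters the selection rule, so the sketch below shows one hedged possibility drawn from the CAT literature: maximizing Fisher information per expected second under a lognormal response-time model. All parameters and values are illustrative assumptions:

    import numpy as np

    def fisher_info_2pl(theta, a, b):
        p = 1 / (1 + np.exp(-a * (theta - b)))
        return a**2 * p * (1 - p)

    def expected_rt(tau, alpha, beta):
        # Lognormal response-time model: log T ~ N(beta - tau, 1/alpha^2),
        # so E[T] = exp(beta - tau + 1 / (2 * alpha^2)) for examinee speed
        # tau, item time intensity beta, and time discrimination alpha.
        return np.exp(beta - tau + 1 / (2 * alpha**2))

    def pick_item(theta_hat, tau_hat, a, b, alpha, beta, available):
        # Select the available item with the most information per expected second
        ratio = fisher_info_2pl(theta_hat, a, b) / expected_rt(tau_hat, alpha, beta)
        ratio = np.where(available, ratio, -np.inf)
        return int(np.argmax(ratio))

    # Tiny usage example on an assumed 100-item pool
    rng = np.random.default_rng(3)
    a, b = rng.uniform(0.8, 2.0, 100), rng.normal(0, 1, 100)
    alpha, beta = rng.uniform(1.5, 2.5, 100), rng.normal(0, 0.3, 100)
    print(pick_item(0.0, 0.0, a, b, alpha, beta, np.ones(100, dtype=bool)))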
|
6 |
Uma abordagem personalizada no processo de seleção de itens em Testes Adaptativos Computadorizados / A personalized approach to the item selection process in Computerized Adaptive Testing. Victor Miranda Gonçalves Jatobá, 08 October 2018 (has links)
Computerized Adaptive Testing (CAT) based on Item Response Theory allows more accurate assessments with fewer questions than the classic paper-based test. Nonetheless, building a CAT involves some key decisions that, when made properly, can further improve the accuracy and efficiency of estimating examinees' abilities. One of the main decisions is the choice of the Item Selection Rule (ISR). The classic CAT makes exclusive use of a single ISR, yet these rules have different strengths depending on the examinee's ability level and on the stage of the test. Thus, the objective of this work is to reduce the length of dichotomous tests (which consider only whether the answer was correct or incorrect) administered through a classic single-ISR CAT, without significant loss of accuracy in the ability estimates. To this end, we create an approach called ALICAT that personalizes the item selection process in a CAT by considering the use of more than one ISR. To apply this approach, we first analyze the performance of different ISRs. A case study on the Mathematics and its Technologies test of the 2012 ENEM indicates that the Kullback-Leibler information with a posterior distribution (KLP) rule estimates examinees' abilities better than the Fisher information (F), Kullback-Leibler information (KL), Maximum Likelihood Weighted Information (MLWI), and Maximum Posterior Weighted Information (MPWI) rules. Previous results in the literature show that a CAT using KLP was able to reduce this test by 46.6% relative to its full length of 45 items, with no significant loss of accuracy in the ability estimates. In this work, we observe that the F and MLWI rules performed better in the early CAT stages for estimating examinees with extreme negative and extreme positive ability levels, respectively. Using these selection rules together, the ALICAT approach reduced the same test by 53.3%.
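As a hedged illustration of two of the rules compared here, the sketch below implements Fisher information (F) and the posterior-weighted Kullback-Leibler index (KLP) for 3PL items on a numerical grid; it is a simplified reading of those rules with assumed parameters, not the ALICAT code:

    import numpy as np

    def p3pl(theta, a, b, c):
        return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

    def fisher_info(theta, a, b, c):
        # F rule: information of a 3PL item at the current ability estimate
        p = p3pl(theta, a, b, c)
        return a**2 * ((p - c) / (1 - c))**2 * (1 - p) / p

    def klp_index(theta_hat, a, b, c, grid, posterior):
        # KLP rule: KL divergence between the response distributions at the
        # current estimate and at each grid point, weighted by the posterior
        # over ability and integrated numerically on a uniform grid.
        p0, p = p3pl(theta_hat, a, b, c), p3pl(grid, a, b, c)
        kl = p0 * np.log(p0 / p) + (1 - p0) * np.log((1 - p0) / (1 - p))
        return np.sum(kl * posterior) * (grid[1] - grid[0])

    # Tiny usage example with an (assumed) normal posterior on a grid
    grid = np.linspace(-4, 4, 161)
    posterior = np.exp(-grid**2 / 2)
    posterior /= posterior.sum() * (grid[1] - grid[0])   # integrates to 1
    print(fisher_info(0.0, a=1.5, b=0.2, c=0.2))
    print(klp_index(0.0, a=1.5, b=0.2, c=0.2, grid=grid, posterior=posterior))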
|
7 |
A comparison of item selection procedures using different ability estimation methods in computerized adaptive testing based on the generalized partial credit model. Ho, Tsung-Han, 17 September 2010 (has links)
Computerized adaptive testing (CAT) provides a highly efficient alternative to the paper-and-pencil test. By selecting items that match examinees' ability levels, CAT not only shortens test length and administration time but also increases measurement precision and reduces measurement error.
In CAT, maximum information (MI) is the most widely used item selection procedure. The major challenge with MI, however, is the attenuation paradox: the MI algorithm may select items that are not well targeted at an examinee's true ability level, producing more error in subsequent ability estimates. The solution is to find an alternative item selection procedure or an appropriate ability estimation method, yet CAT studies have not investigated the association between these two components of a CAT system under polytomous IRT models.
The present study compared the performance of four item selection procedures (MI, MPWI, MEI, and MEPV) across four ability estimation methods (MLE, WLE, EAP-N, and EAP-PS) in a mixed-format CAT based on the generalized partial credit model (GPCM). The test-unit pool and generated responses were based on test units calibrated from an operational national test that included both independent dichotomous items and testlets. Several test conditions were manipulated: an unconstrained CAT as well as a constrained CAT in which the CCAT procedure provided content balancing and the progressive-restricted procedure with a maximum exposure rate of 0.19 (PR19) served as exposure control. The various CAT conditions were evaluated in terms of measurement precision, exposure control properties, and the extent of selected-test-unit overlap.
Results suggested that all item selection procedures, regardless of ability estimation method, performed equally well on all evaluation indices across the two CAT conditions. The MEPV procedure, however, was favorable in terms of a slightly lower maximum exposure rate, better pool utilization, and reduced test and selected-test-unit overlap compared with the other three item selection procedures when both the CCAT and PR19 procedures were implemented. It is therefore not necessary to implement the sophisticated, computationally intensive Bayesian item selection procedures, whatever the ability estimation method, in GPCM-based CAT.
In terms of the ability estimation methods, MLE, WLE, and the two EAP methods, regardless of item selection procedure, produced no practical differences on any evaluation index across the two CAT conditions. The WLE method, however, generated significantly fewer non-convergent cases than the MLE method. It was concluded that WLE should be considered instead of MLE, because non-convergence is less of an issue. The EAP estimation method, on the other hand, should be used with caution unless an appropriate prior θ distribution is specified.
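For reference, a minimal sketch of the GPCM category probabilities that underlie this CAT design, with illustrative parameters rather than the operational test's:

    import numpy as np

    def gpcm_probs(theta, a, deltas):
        # deltas[v] is the step (threshold) parameter for step v = 1..m.
        # P(X = k | theta) is proportional to exp(sum_{v<=k} a*(theta - delta_v)),
        # with an empty sum for k = 0.
        steps = np.concatenate(([0.0], a * (theta - np.asarray(deltas))))
        logits = np.cumsum(steps)
        expl = np.exp(logits - logits.max())      # numerically stabilized softmax
        return expl / expl.sum()

    # Example: one 4-category item (3 steps) evaluated at theta = 0.5
    print(gpcm_probs(0.5, a=1.2, deltas=[-1.0, 0.0, 1.0]))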
|
8 |
Teoria e a prática de um teste adaptativo informatizado / Theory and practice of computerized adaptive testing. Sassi, Gilberto Pereira, 10 April 2012 (has links)
The aim of this work is to present the concepts related to Computerized Adaptive Testing, or briefly CAT, for the unidimensional three-parameter logistic model of Item Response Theory. We use a Bayesian approach to estimate the parameter of interest, called the latent trait or ability. We present the main item selection algorithms used in CAT and perform simulation studies to compare their performance in terms of numerical approximations of the mean squared error and bias of the trait estimates, the average time for item selection, and the average test length. Furthermore, we show how to install and use the CAT implementation developed in this project, called TAI2U, which was built in Microsoft Excel (VBA) using an interface with the statistical package R.
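A minimal sketch of the Bayesian (EAP) ability estimate for the 3PL model described here, assuming a standard normal prior and a fixed quadrature grid; the item parameters are illustrative assumptions, and this is not the TAI2U code:

    import numpy as np

    def p3pl(theta, a, b, c):
        return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

    def eap(responses, a, b, c, grid=np.linspace(-4, 4, 81)):
        # Likelihood of the observed 0/1 responses at each grid point
        P = p3pl(grid[:, None], a, b, c)                    # shape (grid, items)
        like = np.prod(np.where(responses, P, 1 - P), axis=1)
        prior = np.exp(-grid**2 / 2)                        # N(0,1) up to a constant
        post = like * prior
        post /= post.sum()
        theta_hat = np.sum(grid * post)                     # posterior mean (EAP)
        se = np.sqrt(np.sum((grid - theta_hat)**2 * post))  # posterior SD
        return theta_hat, se

    # Example with three assumed items answered (1, 1, 0)
    a = np.array([1.0, 1.5, 0.8])
    b = np.array([-0.5, 0.2, 1.0])
    c = np.array([0.2, 0.2, 0.2])
    print(eap(np.array([1, 1, 0]), a, b, c))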
|