Global ETD Search

61	A comparison of item selection procedures using different ability estimation methods in computerized adaptive testing based on the generalized partial credit model. Ho, Tsung-Han. 17 September 2010. Computerized adaptive testing (CAT) provides a highly efficient alternative to the paper-and-pencil test. By selecting items that match examinees’ ability levels, CAT can not only shorten test length and administration time but also increase measurement precision and reduce measurement error. In CAT, maximum information (MI) is the most widely used item selection procedure. However, the major challenge with MI is the attenuation paradox, which arises because the MI algorithm may select items that are not well targeted at an examinee’s true ability level, resulting in more error in subsequent ability estimates. The solution is to find an alternative item selection procedure or an appropriate ability estimation method. CAT studies have not investigated the association between these two components of a CAT system based on polytomous IRT models. The present study compared the performance of four item selection procedures (MI, MPWI, MEI, and MEPV) across four ability estimation methods (MLE, WLE, EAP-N, and EAP-PS) under a mixed-format CAT based on the generalized partial credit model (GPCM). The test-unit pool and generated responses were based on test units calibrated from an operational national test that included both independent dichotomous items and testlets. Several test conditions were manipulated: an unconstrained CAT and a constrained CAT in which the CCAT served as the content-balancing procedure and the progressive-restricted procedure with a maximum exposure rate of 0.19 (PR19) served as the exposure control. The performance of the CAT conditions was evaluated in terms of measurement precision, exposure control properties, and the extent of selected-test-unit overlap.
Results suggested that all item selection procedures, regardless of ability estimation method, performed equally well on all evaluation indices across the two CAT conditions. The MEPV procedure, however, was favorable in terms of a slightly lower maximum exposure rate, better pool utilization, and reduced test and selected-test-unit overlap compared with the other three item selection procedures when both the CCAT and PR19 procedures were implemented. It is not necessary to implement the sophisticated and computing-intensive Bayesian item selection procedures across ability estimation methods under the GPCM-based CAT. In terms of the ability estimation methods, MLE, WLE, and the two EAP methods, regardless of item selection procedure, did not produce practical differences in any evaluation index across the two CAT conditions. The WLE method, however, generated significantly fewer non-convergent cases than did the MLE method. It was concluded that the WLE method, instead of MLE, should be considered, because non-convergence is less of an issue. The EAP estimation method, on the other hand, should be used with caution unless an appropriate prior θ distribution is specified. Keywords: Computerized adaptive testing; Item selection procedure; Ability estimation method; Generalized partial credit model
62	Počítačové adaptivní testování a možnosti jeho využití v psychodiagnostice / Computerized adaptive testing and its use in psychodiagnostic. Dlouhá, Jana. January 2014. The theoretical part of the paper focuses on computerized adaptive testing (CAT) and item response theory (IRT). It also includes a chapter comparing IRT with the commonly used classical test theory (CTT), and briefly covers computerized and online testing, since these types of administration differ in many respects from conventional paper-and-pencil tests. The goal of the paper was to evaluate the individual modes of eEPI test administration and to compare them with eEPQ tests and self-evaluation. In the practical part, the items of the extraversion scale of the Eysenck Personality Inventory (eEPI) were calibrated using a group of 124 respondents. The acquired data were then used to simulate computerized adaptive testing, which clearly demonstrated the benefits of this type of testing over the classical test form. These results were compared with the results of real CAT administration using the original sample and a new group of respondents (Np=69, Nn=68). The results were highly correlated with those of the simulated test. Moreover, to verify the validity of the computerized adaptive version of eEOD, the respondents' results in this test were compared with their results in the eEPQ test and on a short self-assessment scale. Finally,...
63	Teste adaptativo informatizado da Provinha Brasil: a construção de um instrumento de apoio para professores(as) e gestores(as) de escolas / Computerized adaptive test of Provinha Brasil: the construction of a supportive instrument for teachers and school administrators. Catalani, Érica Maria Toledo. 29 March 2019. This thesis results from a project to build a Computerized Adaptive Test (CAT) for the paper-and-pencil version of Provinha Brasil (PB), focused on the assessment of reading proficiency. The PB Reading test, despite having the technical and conceptual elements of an educational assessment and being widely used by teachers in the initial years of elementary school, had limitations that could be overcome by tests adapted to students' learning profiles and yielding more reliable results to support the pedagogical decisions of teachers and school administrators. The study therefore sought to answer the question: "Is it possible to build a CAT for the printed version of the PB Reading test that serves as a point of support for teachers in assessing students in the initial years of elementary education?" Building this CAT tool required coordinating software engineers, test developers, researchers, and education professionals from 15 public schools in the city of São Paulo. So that they could take part in building the tool and validating the results, teachers and school administrators received training on educational measurement, reading, and assessment.
After verifying that the psychometric properties of the printed-version items could be retained in the computerized version, the PB Reading CAT was administered, and the results indicated that it allowed tests tailored to students' mastery that were faster and shorter, without loss of precision. Because its results are reported on a scale with an important pedagogical interpretation, the PB Reading CAT proved able to support the assessment practice of teachers and administrators and the pedagogical work in literacy instruction and early literacy. This support was strengthened by adding a rule to the CAT stopping criterion, one used in tests that aim to classify respondents into outcome levels. The study also identified the need for further research on: teacher training in measurement and assessment; the expansion of the item bank, for the purpose of controlling exposure rates and balancing content; and the production of pedagogical reports. Keywords: Computerized Adaptive Testing (CAT); Educational assessment; Literacy; Provinha Brasil; Reading diagnostics
64	Teoria e a prática de um teste adaptativo informatizado / Theory and practice of computerized adaptive testing. Sassi, Gilberto Pereira. 10 April 2012. The main goal of this work is to introduce the concepts related to Computerized Adaptive Testing, or CAT for short, for the unidimensional three-parameter logistic model of Item Response Theory. We use a Bayesian approach to estimate the parameter of interest, called the latent trait or ability. We present the main item selection algorithms in CAT and carry out simulation studies to compare their performance. The comparisons are made in terms of numerical approximations of the mean squared error and the bias of the trait estimates, the average time for item selection, and the average test length. Furthermore, we show how to install and use the CAT implementation developed in this project, called TAI2U, which was built in Microsoft Excel VBA with an interface to the statistical package R. Keywords: Computerized adaptive testing; Item response theory; Item selection algorithms; Unidimensional logistic model
65	Multistage adaptive testing based on logistic positive exponent model / Teste adaptativo multiestágio baseado no modelo logístico de expoente positivo. Ricarte, Thales Akira Matsumoto. 08 December 2016. The Logistic Positive Exponent (LPE) model from Item Response Theory (IRT) and Multistage Adaptive Testing (MST) using this model are the focus of this dissertation. For the LPE, the efficiency of item parameter estimation was studied, and latent trait estimation was analyzed for different response patterns to examine how the estimates are affected by guessing and accidental mistakes. The LPE was compared with the Rasch, two-parameter, and three-parameter logistic models to assess its performance. Item parameter estimation was implemented using Markov chain Monte Carlo under the Bayesian approach and Marginal Maximum Likelihood. The latent traits were estimated by the Expected a Posteriori method. Goodness of fit was analyzed using the Posterior Predictive model-check method and information criteria. In the MST context, the LPE was compared with the Rasch and two-parameter logistic models. Different tests were constructed using methods that apply optimization functions to select items from a bank. Three functions were chosen for this task: Fisher information, Kullback-Leibler information, and the Continuous Entropy Method. Results were obtained with simulated and real data; the real data came from a general science knowledge test, the General Science test, and were provided by the Educational Testing Service. Results showed that the LPE might help individuals who made mistakes in the early stages of the test, especially on easy items. However, the LPE requires a large sample of individuals and considerable time to estimate the item parameters, making it an expensive model. MST based on the LPE can reduce the impact of accidental mistakes made by high-performing test takers, depending on the item pool available and the way the test is constructed. The performance of the optimization functions varied depending on the situation. Keywords: Continuous entropy method; Fisher information; Kullback-Leibler information; Logistic positive exponent; Multistage adaptive testing
67	Test case generation using symbolic grammars and quasirandom sequences. Felix Reyes, Alejandro. 06 1900. This work presents a new test case generation methodology that offers a high degree of automation (reducing cost) while providing increased defect detection power (increasing benefit). Our solution is a variation of model-based testing that takes advantage of symbolic grammars (context-free grammars in which terminals are replaced by regular expressions that represent their solution space) and quasi-random sequences to generate test cases. Previous test case generation techniques are enhanced with adaptive random testing to maximize input space coverage, and with selective and directed sentence generation techniques to optimize sentence generation. Our solution was tested by generating 200 firewall policies containing up to 20 000 rules from a generic firewall grammar. Our results show that our system generates test cases with superior coverage of the input space, increasing the probability of defect detection while considerably reducing the number of test cases needed compared with previously used approaches. / Software Engineering and Intelligent Systems. Keywords: symbolic grammars; model-based testing; testing; firewall testing; firewall policies; quasi-random sequences; adaptive testing; firewalls; grammar-based testing; grammars; data generation; test data
68	Comparison Of Linear And Adaptive Versions Of The Turkish Pupil Monitoring System (PMS) Mathematics Assessment. Gokce, Semirhan. 01 July 2012. Before the advances in computer technology, linear test administration within the classical test theory framework was the norm in testing practice. These tests contain a predefined set of items spanning a wide range of difficulty in order to collect information from students at various ability levels. However, placing very easy and very difficult items in the same test not only wastes time and effort but also introduces possible extraneous variables into the measurement process, such as guessing and careless errors induced by boredom or frustration. An alternative to a linear test is the computerized adaptive test, which adapts the difficulty of the test to the ability level of each examinee. Computerized adaptive tests use item response theory as the measurement framework and rely on algorithms for item selection, ability estimation, the starting rule, and test termination. The present study aims to determine the applicability of computerized adaptive testing (CAT) to the Turkish Pupil Monitoring System's (PMS) mathematics assessments. To this end, a live CAT study using only multiple-choice items was designed to investigate whether comparable ability estimates could be obtained. Afterwards, a Monte Carlo simulation study and a post-hoc simulation study were designed to determine the optimum CAT algorithm for the Turkish PMS mathematics assessments. In the simulation studies, both multiple-choice and open-ended items were used, and different scenarios were tested with various starting rules, termination criteria, ability estimation methods, and the presence or absence of exposure and content controls.
The results of the study indicate that the Weighted Maximum Likelihood (WML) ability estimation method, an easy initial item as the starting rule, and a fixed test reliability termination criterion (a standard error of 0.30) give the optimum CAT algorithm for the Turkish PMS mathematics assessment. Additionally, item exposure and content control strategies have a positive impact on providing comparable ability estimates.
69	Test case generation using symbolic grammars and quasirandom sequences. Felix Reyes, Alejandro. Unknown Date. No description available. Keywords: symbolic grammars; model-based testing; testing; firewall testing; firewall policies; quasi-random sequences; adaptive testing; firewalls; grammar-based testing; grammars; data generation; test data
70	Počítačové adaptivní testování v kinantropologii: Monte Carlo simulace s využitím physical self description questionnaire / Computerized Adaptive Testing In Kinanthropology: Monte Carlo Simulations Using The Physical Self Description Questionnaire. Komarc, Martin. January 2017. This thesis introduces computerized adaptive testing (CAT), a novel and increasingly used method of test administration, as applied to the field of kinanthropology. By adapting a test to an individual respondent's latent trait level, CAT offers numerous theoretical and methodological improvements that can significantly advance testing procedures. The first part of the thesis presents the theoretical and conceptual basis of CAT, together with a brief overview of its historical origins and basic general principles. The discussion necessarily includes some description of Item Response Theory (IRT), since IRT is almost exclusively used as the mathematical model in today's CAT applications. The practical application of CAT is then evaluated using Monte Carlo simulations involving adaptive administration of the Physical Self-Description Questionnaire (PSDQ) (Marsh, Richards, Johnson, Roche, & Tremayne, 1994), an instrument widely used to assess physical self-concept in sport and exercise psychology. The Monte Carlo simulation of the PSDQ adaptive administration used a real item pool (N = 70) calibrated with a Graded Response Model (GRM, see Samejima, 1969, 1997). The responses to test items were generated based on item...
