  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

The impact of collateral information on ability estimation in an adaptive test battery

Xie, Qing 01 May 2019 (has links)
The advantages of administering an adaptive test battery, a collection of multiple adaptive subtests that are specifically tailored to examinees’ abilities, include shortening the subtest length and maintaining the accuracy of individual subtest scores. The test battery can incorporate a range of subjects, though this study focused primarily on Math and Reading. This study compared different ways of incorporating collateral information (CI), supplementary information beyond examinees’ current test performance, under two frameworks (Unidimensional and Multidimensional computerized adaptive testing). It also investigated the impact of subtest intercorrelations (the relationship between an examinee’s test scores), as well as the sequences of subtest administration on ability estimation in a variable-length adaptive battery. Practical issues including content constraints and item exposure control were also considered. Findings showed that the CI methods improved measurement efficiency with an acceptable level of measurement precision. The CI was more beneficial when associated with higher intercorrelations among the subtests. Also, the CI was found to be advantageous during the early stages of the subtests which were not taken first. Therefore, the CI may improve the examinee experience by administering items more aligned with their abilities. In addition, the CI should reduce costs for testing organizations by requiring fewer items and possibly saving seat time, while still providing reliable scores. The results should help practitioners decide whether the use of the CI is worthwhile under their particular testing situation.
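One simple way to picture how collateral information can inform a later subtest is to treat the earlier ability estimate as the basis for a prior. The sketch below is illustrative only and is not the study's method: it assumes the two abilities are bivariate normal with a common mean and standard deviation, and the function name `collateral_prior` and all numbers are hypothetical. Under that assumption, a higher intercorrelation gives a narrower prior, which is in line with the reported finding that the CI helped more when subtests were more highly correlated.

```python
import math

def collateral_prior(theta_prev, rho, mean=0.0, sd=1.0):
    """Conditional normal prior for the next subtest given an earlier estimate.

    Assumes (hypothetically) that both abilities are bivariate normal with
    the same mean and sd and with correlation rho.
    """
    prior_mean = mean + rho * (theta_prev - mean)
    prior_sd = sd * math.sqrt(1.0 - rho ** 2)
    return prior_mean, prior_sd

# Earlier subtest (say, Math) estimate of 1.0; prior for the next subtest.
for rho in (0.3, 0.6, 0.9):
    m, s = collateral_prior(theta_prev=1.0, rho=rho)
    print(f"rho={rho:.1f}: prior mean={m:.2f}, prior sd={s:.2f}")
```

The prior standard deviation shrinks as the correlation grows, so the first items of the later subtest can start closer to the examinee's likely ability.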
2

Can a computer adaptive assessment system determine, better than traditional methods, whether students know mathematics skills?

Whorton, Skyler 19 April 2013 (has links)
Schools use commercial systems specifically for mathematics benchmarking and longitudinal assessment. However, these systems are expensive, and their results often fail to indicate a clear path for teachers to differentiate instruction based on students’ individual strengths and weaknesses in specific skills. ASSISTments is a web-based Intelligent Tutoring System used by educators to drive real-time, formative assessment in their classrooms. The software is used primarily by mathematics teachers to deliver homework, classwork and exams to their students. We have developed a computer adaptive test called PLACEments as an extension of ASSISTments to allow teachers to perform individual student assessment and, by extension, school-wide benchmarking. PLACEments uses a form of graph-based knowledge representation by which the exam results identify the specific mathematics skills that each student lacks. The system additionally provides differentiated practice determined by the students’ performance on the adaptive test. In this project, we describe the design and implementation of PLACEments as a skill assessment method and evaluate it in comparison with a fixed-item benchmark.
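The abstract does not specify how the skill graph is traversed, so the following is only a sketch of one way a graph-based adaptive test might work: a prerequisite graph is walked, and the prerequisites of a skill are tested only after the student fails that skill. The graph contents, the skill names and the `find_missing_skills` function are hypothetical and are not taken from PLACEments.

```python
from collections import deque

# Hypothetical prerequisite graph: skill -> skills it depends on.
PREREQUISITES = {
    "solve_linear_equations": ["combine_like_terms", "integer_arithmetic"],
    "combine_like_terms": ["integer_arithmetic"],
    "integer_arithmetic": [],
}

def find_missing_skills(start_skills, answers_correctly):
    """Return the skills the student failed, in the order they were tested.

    Prerequisites of a skill are queued only when that skill is failed,
    so a student who passes the starting skills sees a short test.
    """
    missing, seen = [], set()
    queue = deque(start_skills)
    while queue:
        skill = queue.popleft()
        if skill in seen:
            continue
        seen.add(skill)
        if not answers_correctly(skill):
            missing.append(skill)
            queue.extend(PREREQUISITES.get(skill, []))
    return missing

# Simulated student who has mastered only integer arithmetic.
known = {"integer_arithmetic"}
print(find_missing_skills(["solve_linear_equations"], lambda s: s in known))
```

The returned list could then drive differentiated practice, one assignment per missing skill.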
3

Efficient Test Strategies for Analog/RF Circuits

January 2012 (has links)
abstract: Test cost has become a significant portion of device cost and a bottleneck in high-volume manufacturing. Increasing integration density and shrinking feature sizes increase test time and cost and reduce observability. Test engineers have to put in tremendous effort to keep test cost within an acceptable budget. Unfortunately, there is no single straightforward solution to the problem. Products that are tested have several application domains and distinct customer profiles. Some products are required to operate for long periods of time, while others are required to be low cost. The multitude of constraints and goals makes it impossible to find a single solution that works for all cases. Hence, test development/optimization is typically design/circuit dependent and even process specific. Therefore, test optimization cannot be performed using a single test approach, but necessitates a diversity of approaches. This work aims at addressing test cost minimization and test quality improvement at various levels. In the first chapter of the work, we investigate pre-silicon strategies, such as design for test and pre-silicon statistical simulation optimization. In the second chapter, we investigate efficient post-silicon test strategies, such as adaptive test, adaptive multi-site test, outlier analysis, and process shift detection/tracking. / Dissertation/Thesis / Ph.D. Electrical Engineering 2012
4

Ability parameter recovery of a computerized adaptive test based on rasch testlet models

Pak, Seohong 15 December 2017 (has links)
The purpose of this study was to investigate the effects of various testlet characteristics on ability parameter recovery in a computerized adaptive test (CAT). Given the popularity of CATs and the frequency with which testlets appear in exams, whether in a mixed format or not, it was important to evaluate the various conditions in a testlet-based CAT fitted with testlet response theory models. The manipulated factors of this study were testlet size, testlet effect size, testlet composition, and exam format. The performance of each condition was compared with the true thetas, which were 81 equally spaced points from -3.0 to +3.0. For each condition, 1,000 replications were conducted and evaluated with respect to overall bias, overall standard error, overall RMSE, conditional bias, conditional standard error, conditional RMSE, and conditional passing rate. The conditional results were presented in pre-specified intervals. Several significant conclusions were made. Overall, the mean theta estimates over 1,000 replications were close to the true thetas regardless of the manipulated conditions. In terms of aggregated overall RMSE, predictable relationships were found for the four study factors: a larger amount of error was associated with a longer testlet, a bigger effect size, a random composition, and a testlet-only exam format. However, when the aggregated overall bias was considered, only two effects were observed: a large difference among the three testlet length conditions, and almost no difference between the two testlet composition conditions. As expected, conditional SEMs for all conditions showed a U-shape across the theta scale. The noticeable discrepancy occurred only within the testlet length condition: more error was associated with the longest testlet length than with the short and medium lengths. Conditional passing rate showed little discrepancy among conditions within each factor, so no particular association was found. In general, a short testlet length, a small testlet effect size, a homogeneous difficulty composition, and a mixed format are better in terms of the smaller amount of error found in this study. Beyond these expected findings, some interaction effects were also observed. When a medium or large (i.e., greater than .50) testlet effect was suspected, it was better to have a short testlet. It was also found that using a mixed-format exam increased the accuracy of the random difficulty composition. However, this study was limited by several other factors that were held constant across the conditions: a fixed-length exam, no content balancing, and uniform testlet effects. Consequently, plans for improvements in terms of generalization were also discussed.
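The recovery statistics named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the function `recovery_stats` is hypothetical, and the "estimates" are placeholders generated as the true theta plus normal noise, whereas a real study would obtain them by running the testlet-based CAT. Only the 81-point grid from -3.0 to +3.0 and the 1,000 replications come from the abstract.

```python
import math
import random

def recovery_stats(true_theta, estimates):
    """Bias, standard error and RMSE of ability estimates at one true theta."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_theta
    # Spread of the estimates around their own mean (population form).
    se = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / n)
    # Spread of the estimates around the true value.
    rmse = math.sqrt(sum((e - true_theta) ** 2 for e in estimates) / n)
    return bias, se, rmse

# 81 equally spaced true thetas from -3.0 to +3.0 (step 0.075).
theta_grid = [-3.0 + 0.075 * i for i in range(81)]

rng = random.Random(0)
for theta in (theta_grid[0], theta_grid[40], theta_grid[80]):
    # Placeholder estimates; the noise sd of 0.3 is an arbitrary assumption.
    estimates = [theta + rng.gauss(0.0, 0.3) for _ in range(1000)]
    bias, se, rmse = recovery_stats(theta, estimates)
    print(f"theta={theta:+.3f} bias={bias:+.3f} se={se:.3f} rmse={rmse:.3f}")
```

With these definitions, RMSE squared equals bias squared plus SE squared, which is why a condition can show little bias and still have a large RMSE.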
5

The relationship between learning potential, English language proficiency and work-related training test results

Schoeman, Adele 11 1900 (has links)
Continuous change and competition in the working environment necessitate increased efficiency and productivity which require different and enhanced skills and abilities. It is therefore important that the right people with the right skills are selected and employees are developed to enable them to meet the organisational and national demands of the future. This dissertation investigates the relationship between learning potential, English language proficiency and work-related training test results to establish why some production employees perform better on work-related training test results than others. The results indicate that there is no significant relationship between the work-related training test results and either learning potential or English language proficiency. There is, however, a significant correlation between learning potential and English language proficiency. It might be worthwhile exploring the availability and adequacy of assessors as well as the motivational level of the production employees as factors that influence the progress made with work-related training test results. / Industrial and Organisational Psychology / MCOM (Industrial Psychology)
6

Desafios e perspectivas da implementação computacional de testes adaptativos multidimensionais para avaliações educacionais / Challenges and perspectives of implementation of multidimensional adaptive test for educational assessment

Piton Gonçalves, Jean 17 December 2012 (has links)
Educational tests provide measures and indicators that enable evaluations and guide the definition of goals for teaching and learning, besides supporting selection processes and the formulation of public policies. The evaluation of examinees' performance may consider one or multiple skills and abilities. As an alternative to paper-and-pencil tests, the Computer Based Test (CBT) can compose, administer and score tests and automatically produce individual and/or group statistics about examinees' performance. Considering that the examinee has several abilities, the Computer Adaptive Test based on Multidimensional Item Response Theory (MCAT) keeps the same accuracy as a traditional test, building on the knowledge about the examinee inferred from the record of responses to previous items. Item selection through Kullback-Leibler between Subsequent Posteriors (K^P) avoids selecting a difficult item for a low-ability examinee, suggesting that K^P is a criterion applicable to educational tests. The literature review revealed: (i) the insufficiency of studies about the K^P criterion; (ii) the insufficiency of studies on operational MCATs in educational contexts for real users; (iii) the shortage of studies and proposals for initial and stop criteria for MCATs when the number of administered items is variable; and (iv) the lack of Brazilian studies in the area of MCATs. To bridge these gaps, this doctoral thesis addresses the following research question: What approach makes it feasible to use the K^P criterion in operational MCATs for educational contexts, while ensuring that the implemented system meets the functionality, reliability, efficiency, maintainability and portability criteria of ISO-9126 (which is the basis for evaluating computer-based tests)? The specific objectives of this research were to: (i) implement and validate the K^P selection criterion, comparing it to the usual Bayesian criterion; (ii) propose improvements and calculate the computational time for item selection through K^P; (iii) propose initial criteria consistent with the reality and needs of educational evaluation; (iv) validate the novel stop criterion KPIC, aiming at MCATs that administer a variable number of items to the examinees; (v) develop an architecture that enables the application of MCATs via the Web to real users; (vi) discuss theoretical and methodological issues related to the new CBMAT approach via proof of concept, implementing the MADEPT system, which evaluates examinees from the perspective of diagnostic evaluation; and (vii) evaluate MADEPT according to the international software product standard ISO-9126 and point out the feasibility, viability, difficulties, advantages and limitations of CBMAT development for the Web environment. The methodology used to answer the research question was to: (i) organize and select the theories, methods, models and results inherent to MCATs; (ii) expand the equation of K^P; (iii) implement the MCAT considering the K^P selection criterion and the Bayesian methodology for estimation and item selection; (iv) validate K^P and KPIC statistically; (v) implement CBMAT, considering MCAT as a subsystem; and (vi) evaluate CBMAT according to ISO-9126. This research has several results: (i) it presents a broad literature review of the theories, methods and criteria needed for the computational implementation of MCATs; (ii) it reformulates the equation that expresses selection through K^P for implementation in a scientific programming language; (iii) it shows, through MCAT simulations using K^P for item selection and KPIC as the stop criterion, that K^P is an adequate and recommended criterion when the goal is a test with a small and variable number of administered items, while maintaining acceptable bias and high accuracy in ability estimation; (iv) it develops novel algorithms for the initial criteria; (v) it validates a new architecture that enables the application of MCATs via the Web to real users; and (vi) it implements and evaluates the Web computational system MADEPT according to ISO-9126. We conclude that it is possible to develop an architecture that enables the application of MCATs via the Web to real users, using the K^P selection criterion and initial criteria consistent with educational evaluation. If the aim is to apply MCATs in real scenarios, item selection through K^P combined with the stop criterion KPIC provides a shorter and more accurate test than those using the usual Bayesian methodology. Moreover, its computational processing time is in line with the characteristics of the multidimensional approach.
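To make the idea of item selection by Kullback-Leibler divergence between subsequent posteriors more concrete, here is a deliberately simplified sketch. It is not the thesis's implementation: it is unidimensional rather than multidimensional, it assumes a two-parameter logistic model on a discrete ability grid, and it takes the criterion to be the expected divergence from the current posterior to the posterior after the next response, which is one plausible reading of the name. All function names and item parameters are hypothetical.

```python
import math

GRID = [-4.0 + 0.1 * i for i in range(81)]  # discrete ability grid

def p_correct(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def expected_posterior_kl(posterior, item):
    """Expected KL between the current posterior and the next posterior."""
    a, b = item
    probs = [p_correct(t, a, b) for t in GRID]
    score = 0.0
    # Loop over the two possible responses: correct, then incorrect.
    for likelihood in (probs, [1.0 - p for p in probs]):
        predictive = sum(w * l for w, l in zip(posterior, likelihood))
        updated = normalize([w * l for w, l in zip(posterior, likelihood)])
        score += predictive * kl(posterior, updated)
    return score

def select_item(posterior, items):
    """Index of the candidate item with the largest expected KL."""
    return max(range(len(items)),
               key=lambda j: expected_posterior_kl(posterior, items[j]))

prior = normalize([math.exp(-0.5 * t * t) for t in GRID])
items = [(1.0, -2.5), (1.2, 0.0), (1.0, 2.5)]  # hypothetical (a, b) pairs
print(select_item(prior, items))
```

In this toy setting, an item whose difficulty sits near the bulk of the current posterior changes that posterior the most in expectation, so very easy or very hard items score low for an average examinee.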
9

Avaliação da proficiência em inglês acadêmico através de um teste adaptativo informatizado / Assessment of proficiency in academic English through an adaptive computerized test

Silva, Vanessa Rufino da 09 April 2015 (has links)
This work describes the steps for converting a linear paper-and-pencil English proficiency test for academic purposes, composed of multiple-choice items administered following the admissible probability measurement procedure (Shuford Jr et al., 1966) and adopted by the graduate program of the Institute of Mathematical Sciences and Computing of the University of São Paulo (ICMC-USP), Brazil, into a computerized adaptive test (TAI-PI) based on an item response theory (IRT) model. Although the Institute recognizes reliable international English-language exams for academic purposes and non-native speakers, such as TOEFL (Test of English as a Foreign Language), IELTS (International English Language Testing System) and CPE (Cambridge English: Proficiency), it is inconsistent for public universities in Brazil to require them as certification, because they cost approximately US$ 200.00 to US$ 300.00 per exam. The TAI-PI software (Teste Adaptativo Informatizado para Proficiência em Inglês, a computerized adaptive test for English proficiency) was implemented in Java with SQLite as the database engine, and it will be offered free of charge for assessing the English proficiency of graduate students from October 2013. The statistical methodology employed in the construction of TAI-PI was defined considering the history and aims of the exam, and adopted Samejima's unidimensional graded response model (Samejima, 1969), the Kullback-Leibler information criterion for item selection, expected a posteriori (EAP) Bayesian estimation of the latent trait (Baker, 2001) and the shadow test approach (Van der Linden and Pashley, 2010) for imposing test constraints (for example, content and test length).
This work presents a description of the test design, the statistical methods employed, the results of real applications of TAI-PI to ICMC graduate students, and validation studies of the new methodology for pass/fail classification, highlighting the good quality of the new evaluation system and the improvement of the exam through the use of IRT and CAT methods.
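To make the estimation step named in this abstract more concrete, the following is a minimal sketch of category probabilities under Samejima's graded response model and of an expected a posteriori (EAP) estimate of the latent trait. It is not the TAI-PI implementation (which the abstract says was written in Java). The function names, the item parameters, the standard normal prior and the quadrature grid from -4 to 4 are all assumptions made for illustration.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Graded response model: probability of each ordered response category.

    `a` is the item discrimination; `thresholds` must be increasing.
    K thresholds give K + 1 categories (0 .. K).
    """
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = K + 1).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

def eap_estimate(responses, items, grid=None):
    """EAP estimate of theta, assuming a standard normal prior (illustrative).

    responses[i] is the observed 0-based category for items[i] = (a, thresholds).
    Returns (posterior mean, posterior standard deviation).
    """
    if grid is None:
        grid = [-4.0 + 0.1 * i for i in range(81)]  # assumed quadrature grid
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior density
        for x, (a, thresholds) in zip(responses, items):
            w *= grm_category_probs(theta, a, thresholds)[x]
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

# Hypothetical item bank: (discrimination, category thresholds).
items = [
    (1.2, [-1.0, 0.0, 1.0]),
    (0.9, [-0.5, 0.5, 1.5]),
    (1.5, [-1.5, -0.5, 0.5]),
]
high, high_sd = eap_estimate([3, 3, 3], items)  # top category on every item
low, low_sd = eap_estimate([0, 0, 0], items)    # bottom category on every item
print(f"EAP after top responses: {high:.2f} (SD {high_sd:.2f})")
print(f"EAP after bottom responses: {low:.2f} (SD {low_sd:.2f})")
```

In an adaptive test, an estimate of this kind would be recomputed after each response and fed to the item selection rule; the posterior standard deviation is one quantity a stopping rule could monitor.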
10

Using a Computer-adaptive Test Simulation to Investigate Test Coordinators' Perceptions of a High-stakes Computer-based Testing Program

Hogan, Tiffany 10 January 2014 (has links)
This case study examined the efficiency and precision of computerized classification and adaptive testing in order to elicit responses from test coordinators on implementing a high-stakes computer-based testing program. Test coordinators from five elementary schools located in a Georgia school district participated in the study. The school district administered state-made, high-stakes tests using paper and pencil, and locally developed tests via computer or paper and pencil. A post-hoc simulation program, Comprehensive Simulation of Computerized Adaptive Testing, used 586 student item responses to produce results with a variable termination point and a classification termination point. Results from the simulation were analyzed and used in the case study to elicit interview responses from test coordinators. Photographs of computer labs and test schedule documents were collected and analyzed to validate the school test coordinators' responses. Test coordinators responded positively to the efficiency and precision of the simulation results. Some test coordinators preferred the use of computer-adaptive tests for diagnostic purposes only. Test coordinators' experiences focused on the security, emotions, and management of testing. The findings of this study will benefit those interested in implementing a high-stakes, computer-based testing program by recommending that a simulation study be conducted and feedback be solicited from test coordinators prior to an operational test administration.
