121 |
Using Posterior Predictive Checking of Item Response Theory Models to Study Invariance Violations
Xin, Xin 05 1900 (has links)
The common practice for testing measurement invariance is to constrain parameters to be equal across groups and then evaluate model-data fit to reject or fail to reject the restrictive model. Posterior predictive checking (PPC) provides an alternative approach to evaluating model-data discrepancy. This paper explores the utility of PPC in assessing measurement invariance. The simulation results show that the posterior predictive p (PP p) values of item parameter estimates respond to various invariance violations, whereas the PP p values of the item-fit index may fail to detect such violations. The paper suggests comparing group estimates and restrictive-model estimates against posterior predictive distributions in order to demonstrate the pattern of misfit graphically.
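The PPC procedure the abstract evaluates can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 2PL response function, the discrepancy statistic (proportion correct), and all function and variable names are assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def irf(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def pp_p_value(y_obs, theta_draws, a_draws, b_draws):
    """Posterior predictive p-value for one item's proportion-correct statistic.

    y_obs            : observed 0/1 responses, shape (n_persons,)
    theta_draws      : posterior draws of abilities, shape (n_draws, n_persons)
    a_draws, b_draws : posterior draws of item parameters, shape (n_draws,)
    """
    t_obs = y_obs.mean()
    exceed = 0
    n_draws = len(a_draws)
    for s in range(n_draws):
        p = irf(theta_draws[s], a_draws[s], b_draws[s])
        y_rep = rng.binomial(1, p)        # replicate data from the posterior draw
        if y_rep.mean() >= t_obs:         # compare replicated vs. observed statistic
            exceed += 1
    return exceed / n_draws               # values near 0 or 1 signal misfit
```

Under a well-fitting model the PP p-value tends toward the middle of [0, 1]; extreme values flag the kind of discrepancy the paper uses to detect invariance violations.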
|
122 |
Modelagem para construção de escalas avaliativas e classificatórias em exames seletivos utilizando teoria da resposta ao item uni e multidimensional / Modeling for constructing of classificatory and evaluative scales in selective tests using uni and multidimensional item response theory
Quaresma, Edilan de Sant'Ana 28 May 2014 (has links)
The use of exams composed of items in evaluation procedures for classification is a historical legacy of the 16th and 17th centuries, still in use today both in formal education and in selective processes such as university entrance examinations. Designed to measure knowledge, a latent trait that cannot be observed directly, such exams are usually scored considering only the total score obtained by each examinee, without using important information related to the individual items. This study aimed to: (i) use modeling based on unidimensional and multidimensional item response theory (IRT and MIRT, respectively) to build knowledge scales for the FUVEST/2012 entrance examination; and (ii) classify the candidates to the six undergraduate programs offered by the "Luiz de Queiroz" College of Agriculture, a unit of the University of São Paulo, based on the constructed scale. The working hypothesis was that MIRT ranks candidates differently from the methods currently used by FUVEST. The response patterns of the 2,326 candidates who took the test were used in a unidimensional IRT analysis, generating a proficiency scale. Four latent traits were identified in the assessment through multidimensional MIRT modeling, generating a scale for the four dimensions. A proposal for classifying the candidates is presented, based on the average of the individual proficiencies weighted by the factor loadings identified by the model. A comparative analysis of the classification criteria used by FUVEST and by MIRT was performed, revealing discrepancies between them.
This work proposes pedagogical interpretations for the unidimensional and multidimensional scales and recommends MIRT as a complementary criterion for classifying candidates, since it exploits item-level information and therefore supports a more comprehensive classificatory assessment.
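The classification rule the abstract proposes, a loadings-weighted average of MIRT proficiencies, can be sketched as follows. The proficiency matrix and factor loadings below are invented for illustration; the thesis's actual estimates are not reproduced here.

```python
import numpy as np

def weighted_proficiency(theta, loadings):
    """Composite score per candidate.

    theta    : (n_candidates, n_dims) MIRT proficiency estimates
    loadings : (n_dims,) factor loadings used as weights
    """
    w = np.asarray(loadings, dtype=float)
    w = w / w.sum()                      # normalize weights to sum to 1
    return np.asarray(theta) @ w         # weighted average per candidate

# Hypothetical proficiencies on four latent dimensions for two candidates.
theta = np.array([[0.5, -0.2, 1.1, 0.0],
                  [1.0,  0.3, 0.2, 0.8]])
loadings = [0.4, 0.2, 0.25, 0.15]
scores = weighted_proficiency(theta, loadings)
ranking = np.argsort(-scores)            # best candidate first
```

Ranking by this composite, rather than by raw score, is what allows the MIRT criterion to order candidates differently from the traditional scoring the thesis compares against.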
|
124 |
Influence of Item Response Theory and Type of Judge on a Standard Set Using the Iterative Angoff Standard Setting Method
Hamberlin, Melanie Kidd 08 1900 (has links)
The purpose of this investigation was to determine the influence of item response theory and different types of judges on a standard. The iterative Angoff standard setting method was employed by all judges to determine a cut-off score for a public school district-wide criterion-referenced test. The analysis of variance of the effect of judge type and standard setting method on the central tendency of the standard revealed the existence of an ordinal interaction between judge type and method. Without any knowledge of p-values, one judge group set an unrealistic standard. A significant disordinal interaction was found concerning the effect of judge type and standard setting method on the variance of the standard. A positive covariance was detected between judges' minimum pass level estimates and empirical item information. With both p-values and b-values, judge groups had mean minimum pass levels that were positively correlated (ranging from .77 to .86), regardless of the type of information given to the judges. No differences in correlations were detected between different judge types or different methods. The generalizability coefficients and phi indices for 12 judges included in any method or judge type were acceptable (ranging from .77 to .99). The generalizability coefficient and phi index for all 24 judges were quite high (.99 and .96, respectively).
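The core Angoff computation behind the study can be sketched as follows: each judge estimates, per item, the probability that a minimally competent examinee answers correctly, and the cut score is the judge-averaged sum of those estimates. The ratings below are invented, and the iterative feedback rounds the study uses are omitted.

```python
def angoff_cut_score(ratings):
    """Angoff cut score from per-judge minimum pass levels.

    ratings : list of per-judge lists, one minimum pass level per item.
    Each judge's ratings sum to that judge's expected score for a
    minimally competent examinee; the cut score averages across judges.
    """
    per_judge_totals = [sum(judge) for judge in ratings]
    return sum(per_judge_totals) / len(per_judge_totals)

# Hypothetical minimum pass levels for a 4-item test and 3 judges.
ratings = [
    [0.8, 0.6, 0.9, 0.5],    # judge 1
    [0.7, 0.7, 0.8, 0.6],    # judge 2
    [0.9, 0.5, 0.8, 0.5],    # judge 3
]
cut = angoff_cut_score(ratings)   # raw-score cut-off for the 4-item test
```

In the iterative variant the study examines, judges revise these ratings after seeing information such as empirical p-values or IRT b-values, which is exactly the manipulation whose effect the abstract reports.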
|
125 |
Informatikos pagrindų konceptualizavimas naudojant uždavinius / Conceptualisation of informatics fundamentals through tasks
Daukšaitė, Gabrielė 01 July 2014 (has links)
This master's thesis reviews how computer science is taught in compulsory schools in Lithuania and several foreign countries, and examines attitudes toward the discipline and the factors that shape them. The "Beaver" (Bebras) contest on informatics and computer literacy, organized in more than ten countries, was chosen as the vehicle for the study. Task sets from the Lithuanian "Beaver" competitions of 2008–2010 are compared with respect to various informatics concepts.
Using the results of pupils who participated in the 2010 Lithuanian "Beaver" competition, and applying appropriate mathematical models of task evaluation, the information function of the task set was estimated; it allows the most suitable tasks to be chosen for a given level of pupil ability. Pupils' level of computer science knowledge is inseparable from the informatics fundamentals that form over time, as pupils receive suitable information not only in informatics or information technology lessons but also when teachers apply information and communication technologies in other subjects. Task difficulty coefficients were computed and compared with the difficulty levels assigned by the task authors and evaluators. Task discrimination indices were also estimated, describing how well an item separates examinees with abilities below the item's location from those with abilities above it. The results of the study are important both for teachers, who influence the formation of pupils' informatics fundamentals, and for the experts who create and evaluate competition tasks... [to full text]
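The two item statistics the thesis computes, difficulty and a discrimination index, can be sketched as follows. This is a minimal classical-test-theory illustration with an invented response matrix; the thesis's exact models and the 27% grouping fraction used here are assumptions.

```python
import numpy as np

def item_difficulty(responses):
    """Difficulty as proportion correct per item.

    responses : (n_pupils, n_items) 0/1 matrix.
    """
    return responses.mean(axis=0)

def item_discrimination(responses, frac=0.27):
    """Upper-lower discrimination index per item: proportion correct in the
    top `frac` of total scorers minus that in the bottom `frac`. Values near
    1 mean the item separates strong from weak pupils well."""
    totals = responses.sum(axis=1)
    order = np.argsort(totals)
    k = max(1, int(frac * len(totals)))
    low, high = responses[order[:k]], responses[order[-k:]]
    return high.mean(axis=0) - low.mean(axis=0)
```

Comparing `item_difficulty` against the levels assigned by task authors, and flagging items with low discrimination, mirrors the comparisons the abstract describes.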
|
126 |
CT3 as an Index of Knowledge Domain Structure: Distributions for Order Analysis and Information Hierarchies
Swartz Horn, Rebecca 12 1900 (has links)
The problem with which this study is concerned is articulating all possible CT3 and KR21 reliability measures for every case of a 5x5 binary matrix (32,996,500 possible matrices). The study has three purposes. The first purpose is to calculate CT3 for every matrix and compare the results to the proposed optimum range of .3 to .5. The second purpose is to compare the results from the calculation of KR21 and CT3 reliability measures. The third purpose is to calculate CT3 and KR21 on every strand of a class test whose item set has been reduced using the difficulty strata identified by Order Analysis. The study was conducted by writing a computer program to articulate all possible 5 x 5 matrices. The program also calculated CT3 and KR21 reliability measures for each matrix. The nonparametric technique of Order Analysis was applied to two sections of test items to stratify the items into difficulty levels. The difficulty levels were used to reduce the item set from 22 to 9 items. All possible strands or chains of these items were identified so that both reliability measures (CT3 and KR21) could be calculated. One major finding of this study indicates that .3 to .5 is a desirable range for CT3 (cumulative p=.86 to p=.98) if cumulative frequencies are measured. A second major finding is that the KR21 reliability measure produced an invalid result more than half the time. The last major finding is that CT3, rescaled to range between 0 and 1, supports De Vellis' guidelines for reliability measures. The major conclusion is that CT3 is a better measure of reliability since it considers both inter- and intra-item variances.
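The KR-21 computation applied to each 5x5 matrix can be sketched as follows. This is an assumed standard KR-21 formulation, not the study's code; CT3 is omitted because its formula is less widely standardized. Full enumeration of all 2^25 binary matrices is only noted in a comment, as it is computationally heavy.

```python
import numpy as np

def kr21(matrix):
    """KR-21 reliability for a 0/1 response matrix (rows=persons, cols=items).

    Returns NaN when total-score variance is zero, one way the statistic
    can be invalid, echoing the abstract's finding that KR-21 often
    produces unusable results on small matrices.
    """
    k = matrix.shape[1]
    totals = matrix.sum(axis=1)
    m, var = totals.mean(), totals.var()   # population variance (ddof=0)
    if var == 0:
        return float("nan")
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var))

# The study evaluates this statistic over every 5x5 binary matrix
# (a loop over range(2**25), unrolling each integer into a matrix);
# here we apply it to a single illustrative matrix.
M = np.array([[1, 1, 1, 1, 1],
              [1, 1, 1, 0, 0],
              [1, 1, 0, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 0, 0, 0, 0]])
r = kr21(M)
```

On degenerate matrices (no score variance, or extreme means) the formula leaves [0, 1] or is undefined, which is consistent with the abstract's report that KR-21 was invalid for more than half of the enumerated cases.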
|
127 |
Exploring the Relationship between English Composition Teachers' Beliefs about Written Feedback and Their Written Feedback Practices
Vandercook, Sandra 15 December 2012 (has links)
For teachers of freshman English composition, the most time-consuming aspect of teaching is responding to student papers (Anson, 2012; Straub, 2000b). Teachers respond in various ways, but most teachers agree that they should offer written feedback to students (Beach & Friedrich, 2006). However, little research has been conducted to determine how teachers’ written feedback practices reflect their beliefs about the purpose of such feedback. This qualitative study explores the relationship between English composition teachers’ beliefs about written feedback and their actual written feedback practices.
The participants were a sample of four instructors of freshman English composition at a mid-sized metropolitan public university. Interviews, classroom observations, course documents, and samples of teachers’ written comments were analyzed to determine teachers’ written response practices and their beliefs related to the purposes of freshman writing and their roles as writing teachers. Results suggest that teachers were aware of their beliefs, and their written response practices were consistent with their beliefs. Teachers utilized different approaches to respond to student writing, but those approaches are consistent with current recommendations for responding to student writing.
Three major themes emerged from the study. First, teachers must be given the opportunity to reflect about and articulate their beliefs about written response so they will know why they respond in the way they do. Second, teachers work within the boundaries of their specific writing program to organize their written responses to student writing. Third, teachers must respond to student writing from varying perspectives as readers of the text. The findings support studies which indicate that written response is a sociocultural practice and teacher beliefs are just one aspect of the complex nature of teacher written response. The study should add to the fields of response theory and the formation of teacher beliefs.
|
128 |
Breaking Free from the Limitations of Classical Test Theory: Developing and Measuring Information Systems Scales Using Item Response Theory
Rusch, Thomas, Lowry, Paul Benjamin, Mair, Patrick, Treiblmaier, Horst 03 1900 (has links) (PDF)
Information systems (IS) research frequently uses survey data to measure the interplay between technological systems and human beings. Researchers have developed sophisticated procedures to build and validate multi-item scales that measure latent constructs. The vast majority of IS studies use classical test theory (CTT), but this approach suffers from three major theoretical shortcomings: (1) it assumes a linear relationship between the latent variable and observed scores, which rarely represents the empirical reality of behavioral constructs; (2) the true score either cannot be estimated directly or only by making assumptions that are difficult to meet; and (3) parameters such as reliability, discrimination, location, or factor loadings depend on the sample being used. To address these issues, we present item response theory (IRT) as a collection of viable alternatives for measuring continuous latent variables by means of categorical indicators (i.e., measurement variables). IRT offers several advantages: (1) it assumes nonlinear relationships; (2) it allows more appropriate estimation of the true score; (3) it can estimate item parameters independently of the sample being used; (4) it allows the researcher to select items that are in accordance with a desired model; and (5) it applies and generalizes concepts such as reliability and internal consistency, and thus allows researchers to derive more information about the measurement process. We use a CTT approach as well as Rasch models (a special class of IRT models) to demonstrate how a scale for measuring hedonic aspects of websites is developed under both approaches. The results illustrate how IRT can be successfully applied in IS research and provide better scale results than CTT. We conclude by explaining the most appropriate circumstances for applying IRT, as well as the limitations of IRT.
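The linear-versus-nonlinear contrast the abstract draws can be made concrete with a minimal sketch: CTT's observed score is a plain sum of item responses, while the Rasch model (the IRT subclass the paper uses) relates ability to response probability through a logistic curve. These are the standard textbook formulations, not the paper's code.

```python
import numpy as np

def ctt_score(responses):
    """CTT observed score: a linear sum of item responses (0/1)."""
    return np.asarray(responses).sum(axis=-1)

def rasch_prob(theta, b):
    """Rasch model: probability of a correct (or endorsed) response,
    nonlinear in the latent trait theta; b is the item location."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))
```

The key contrast: `ctt_score` changes by the same amount for any additional correct item, whereas `rasch_prob` is steepest near `theta == b` and flattens at the extremes, which is why IRT can describe where on the trait continuum an item is most informative.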
|
129 |
O Desempenho em Matemática do ENEM de 2012 em Luis Eduardo Magalhães (BA), na Teoria de Resposta ao Item / Mathematics Performance on the 2012 ENEM in Luis Eduardo Magalhães (BA), under Item Response Theory
Oliveira, Leandro Santana 06 July 2017 (links)
Students' performance in mathematics on the ENEM examination is the central topic of this work. With the changes made to the ENEM in 2009, IRT (Item Response Theory; TRI in Portuguese) came to be used in the construction and scoring of the test, allowing greater reliability in the results and, of course, more informative feedback to students, beyond a merely quantitative count of right and wrong answers. The purpose of this work is to analyze students' performance on the ENEM 2012 test in the city of Luis Eduardo Magalhães (BA), comparing it with the results of participants from the entire state of Bahia on the same test. Ten questions and their patterns of right and wrong answers were analyzed, which made it possible to assess student performance on the test even without the IRT item parameters. The outcomes of the research are recommendations for actions in basic education committed to raising the performance of the secondary-school students who take the ENEM, through programs of study and other means, in the form of educational products. The research also points out that its analysis could be advanced by obtaining the IRT item parameters, which are not in the public domain and are not easy to locate or access at the Ministry of Education, and by specialized software, which is likewise not readily accessible; these would contribute greatly to clarifying this assessment throughout Brazil and, consequently, to raising students' performance indices, above all in mathematics in Luis Eduardo Magalhães (BA).
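The item parameters the work could not obtain belong to a three-parameter logistic (3PL) IRT model, which is commonly reported as the basis of ENEM scoring. The sketch below shows the standard 3PL response function with invented parameter values; the exact scaling constants used by ENEM are not assumed here.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Standard 3PL model: probability that an examinee with ability theta
    answers the item correctly.

    a : discrimination (slope of the curve)
    b : difficulty (ability at which the curve is steepest)
    c : guessing parameter (lower asymptote for low-ability examinees)
    """
    return c + (1 - c) * (1.0 / (1.0 + math.exp(-a * (theta - b))))

# Hypothetical item: moderate discrimination, average difficulty,
# 20% chance of a correct guess.
p_at_difficulty = p_correct_3pl(0.0, 1.0, 0.0, 0.2)
```

With these per-item parameters, a raw count of right answers stops being the score: two students with the same count can receive different proficiency estimates depending on which items they answered correctly, which is the qualitative point the abstract makes about moving beyond right/wrong tallies.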
|
130 |
Can a computer adaptive assessment system determine, better than traditional methods, whether students know mathematics skills?
Whorton, Skyler 19 April 2013 (links)
Schools use commercial systems specifically for mathematics benchmarking and longitudinal assessment. However these systems are expensive and their results often fail to indicate a clear path for teachers to differentiate instruction based on students’ individual strengths and weaknesses in specific skills. ASSISTments is a web-based Intelligent Tutoring System used by educators to drive real-time, formative assessment in their classrooms. The software is used primarily by mathematics teachers to deliver homework, classwork and exams to their students. We have developed a computer adaptive test called PLACEments as an extension of ASSISTments to allow teachers to perform individual student assessment and by extension school-wide benchmarking. PLACEments uses a form of graph-based knowledge representation by which the exam results identify the specific mathematics skills that each student lacks. The system additionally provides differentiated practice determined by the students’ performance on the adaptive test. In this project, we describe the design and implementation of PLACEments as a skill assessment method and evaluate it in comparison with a fixed-item benchmark.
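The graph-based idea the abstract describes can be sketched as follows: when a student misses an item for a skill, the test descends to that skill's prerequisites to locate exactly where knowledge breaks down. The skill graph and traversal policy below are invented for illustration; the real PLACEments graph and selection logic are not described in this abstract.

```python
from collections import deque

# Hypothetical prerequisite graph: skill -> prerequisite skills.
prereqs = {
    "two-step equations": ["one-step equations"],
    "one-step equations": ["integer arithmetic"],
    "integer arithmetic": [],
}

def skills_to_remediate(start_skill, answered_correctly):
    """Walk the prerequisite graph from a missed skill, collecting every
    skill the student also fails along the way.

    answered_correctly : callable(skill) -> bool, the student's (possibly
    adaptive) item result for that skill.
    """
    weak, queue = [], deque([start_skill])
    while queue:
        skill = queue.popleft()
        if answered_correctly(skill):
            continue                         # skill is known: stop descending
        weak.append(skill)
        queue.extend(prereqs.get(skill, []))  # probe its prerequisites next
    return weak
```

The returned list is what drives the differentiated practice the abstract mentions: rather than a single benchmark score, the teacher sees the specific skills, down to the deepest failed prerequisite, that each student lacks.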
|