221 |
The Investigation Of Cognitive Processes In Mathematics Learning With Item Response Theory. Secil, Selcen Ozkaya, 01 September 2009.
The importance of learning mathematics and using it in daily life is obvious. On the other hand, the results of many national and international assessment studies show that the achievement of Turkish students falls far short of even the bare minimum performance. However, the measurement and evaluation procedures of both the primary and secondary educational systems lack any identification of this "bare minimum", or clear, qualitative descriptors for performance levels. Great importance is attached to national exam results expressed as percentages of correct responses, or as total points on weighted score scales, but there is still no system for presenting students' scores together with descriptions of those scores in terms of the skill levels they did or did not reach.
Therefore, this study aimed to identify the knowledge and skills required for different performance levels, defined by setting cut points for the results of a 4th grade mathematics achievement test. The test was conducted in the 2007-2008 academic year with 269 fourth grade students in eight different private primary schools in Istanbul. Then, in the 2008-2009 academic year, a group of ten mathematics teachers and assessment experts took part in the study to identify the performance level descriptors for 4th grade mathematics performance. Two different standard-setting methods were used. The first, commonly known as the Bookmark Method, was based on the one-parameter model of Item Response Theory (IRT) and depended on the statistical identification of cut points on the scale for performance levels such as Below Basic, Basic, Proficient, and Advanced. The other was a judgmental method that required the participating teachers to classify each item as carrying the characteristics of one of the same performance levels: Below Basic, Basic, Proficient, and Advanced.
The study revealed that the item mappings from the two methods were congruent with each other, and that there was a hierarchical ordering of skills across the performance levels. The results also demonstrated that understanding and computation skills were predominantly characteristic of the Below Basic and Basic levels, whereas problem-solving skill was attained only by students at the Proficient and Advanced levels.
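As background for the Bookmark Method: under the one-parameter (Rasch) model the probability of a correct response to item i is

P_i(\theta) = \frac{1}{1 + e^{-(\theta - b_i)}},

and items are ordered by the ability at which a student reaches a specified response probability. The abstract does not state the criterion used; with the conventional RP67 value, the mapped location of item i is \theta_i = b_i + \ln(0.67/0.33) \approx b_i + 0.71, so a bookmark placed between two adjacent items in the ordered booklet translates directly into a cut point on the \theta scale.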
|
222 |
Developing a screening measure for at-risk and advanced beginning readers to enhance response-to-intervention frameworks using the Rasch model. Weisenburgh, Amy Boward, 01 February 2012.
The Rasch model was employed to analyze the psychometric properties of a diagnostic reading assessment and then create five short forms (n = 10, 16, 22, 28, 34 items) with an optimal test information function. The goal was to develop a universal screening measure that second grade teachers can use to identify advanced and at-risk readers to enhance Response-to-Intervention frameworks. These groups were targeted because both will need differentiated instruction in order to improve reading skills. The normative dataset of a national reading test developed with classical test theory methods was used to estimate person and item parameters. The measurement precision and classification accuracy of each short form were evaluated with the second grade students in the normative sample. Compared with full-bank scores, all short forms produced highly correlated scores. The degree to which each short form identified exceptional readers was also analyzed. In consideration of classification accuracy and time-efficiency, the findings were most robust for the 10-item form.
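As context for the "optimal test information function" criterion: under the Rasch model each item contributes information I_i(\theta) = P_i(\theta)\,(1 - P_i(\theta)), where P_i(\theta) = e^{\theta - b_i}/(1 + e^{\theta - b_i}), and the test information is the sum

T(\theta) = \sum_i P_i(\theta)\,(1 - P_i(\theta)).

A screening form aimed at both at-risk and advanced readers would select items whose difficulties b_i concentrate T(\theta) near the decision points at the low and high ends of the ability range; the abstract does not specify the exact targets used.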
|
223 |
THE FIVE-FACTOR OBSESSIVE-COMPULSIVE INVENTORY: AN ITEM RESPONSE THEORY ANALYSIS. Presnall-Shvorin, Jennifer R., 01 January 2015.
Arguments have been made for dimensional models over categorical models for the classification of personality disorder, and for the five-factor model (FFM) in particular. One criticism of the FFM of personality disorder has been the absence of measures designed to assess pathological personality. Several measures have since been developed on the basis of the FFM to assess the maladaptive personality traits included within existing personality disorders.
One such example is the Five-Factor Obsessive-Compulsive Inventory (FFOCI). The current study applied item response theory (IRT) analyses to test whether the scales of the FFOCI are extreme variants of their respective FFM facet scales. It was predicted that both the height and the slope of the item-response curves (IRCs) would differ for the conscientiousness-based scales, owing to the bias toward assessing high conscientiousness as adaptive in general personality inventories (such as Goldberg's International Personality Item Pool; IPIP). In contrast, the remaining FFOCI scales and their IPIP counterparts were predicted to show no significant differences in IRCs across theta.
Nine hundred and seventy-two adults completed the FFOCI and the IPIP, including 377 undergraduate students and 595 participants recruited online. The results partially supported the hypotheses. Fastidiousness and Workaholism demonstrated the expected trends, with the FFOCI providing higher fidelity at the upper end of theta and the IPIP providing superior coverage at the lower end. The other conscientiousness scales failed to demonstrate the expected differences at a statistically significant level. In this context, the suitability of IRT for the analysis of rationally derived, polytomous scales is explored.
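As a dichotomous sketch of the "height and slope" comparison (the FFOCI and IPIP scales themselves are polytomous, so the study's actual curves generalize this form): in a two-parameter logistic item response curve

P(\theta) = \frac{1}{1 + e^{-a(\theta - b)}},

the slope is governed by the discrimination a and the height, or location, by the difficulty b. The prediction therefore amounts to expecting the conscientiousness-based FFOCI scales to show higher locations (and possibly different slopes) than their IPIP counterparts along theta.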
|
224 |
Contributions to Kernel Equating. Andersson, Björn, January 2014.
The statistical practice of equating is needed when scores on different versions of the same standardized test are to be compared. This thesis makes four contributions to kernel equating, an observed-score equating framework. Paper I introduces the open-source R package kequate, which enables the equating of observed scores using the kernel method of test equating in all common equating designs. The package is designed for ease of use and integrates well with other packages. The equating methods non-equivalent groups with covariates and item response theory observed-score kernel equating are currently not available in any other software package. In Paper II an alternative bandwidth selection method for the kernel method of test equating is proposed. The new method is designed for use with non-smooth data, such as when the observed data are used directly without pre-smoothing. In previously used bandwidth selection methods, the variability from the bandwidth selection was disregarded when calculating the asymptotic standard errors. Here, the bandwidth selection is accounted for, and updated asymptotic standard error derivations are provided. Item response theory observed-score kernel equating for the non-equivalent groups with anchor test design is introduced in Paper III. Multivariate observed-score kernel equating functions are defined and their asymptotic covariance matrices are derived. An empirical example in the form of a standardized achievement test is used, and the item response theory methods are compared to previously used log-linear methods. In Paper IV, Wald tests for equating differences in item response theory observed-score kernel equating are conducted using the results from Paper III. Simulations are performed to evaluate the empirical significance level and power under different settings, showing that the Wald test is more powerful than the Hommel multiple hypothesis testing method. Data from a psychometric licensure test and a standardized achievement test are used to exemplify the hypothesis testing procedure. The results show that the Wald test can lead to different conclusions from the Hommel procedure.
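For reference, the kernel method continuizes a discrete score distribution with probabilities r_j at score points x_j as

\hat{F}_{h_X}(x) = \sum_j r_j \, \Phi\!\left(\frac{x - a_X x_j - (1 - a_X)\mu_X}{a_X h_X}\right), \qquad a_X = \sqrt{\frac{\sigma_X^2}{\sigma_X^2 + h_X^2}},

where \Phi is the standard normal distribution function and h_X the bandwidth; the equating function is then e_Y(x) = G_{h_Y}^{-1}(F_{h_X}(x)). This is the standard Gaussian kernel equating formulation that the four papers build on, not a restatement of the thesis's own derivations.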
|
225 |
Relationships between Missing Response and Skill Mastery Profiles of Cognitive Diagnostic Assessment. Zhang, Jingshun, 13 August 2013.
This study explores the relationship between students’ missing responses on a large-scale assessment and their cognitive skill profiles and characteristics. Data from the 48 multiple-choice items on the 2006 Ontario Secondary School Literacy Test (OSSLT), a high school graduation requirement, were analyzed using the item response theory (IRT) three-parameter logistic model and the Reduced Reparameterized Unified Model, a Cognitive Diagnostic Model. Missing responses were analyzed by item and by student. Item-level analyses examined the relationships among item difficulty, item order, literacy skills targeted by the item, the cognitive skills required by the item, the percent of students not answering the item, and other features of the item. Student-level analyses examined the relationships among students’ missing responses, overall performance, cognitive skill mastery profiles, and characteristics such as gender and home language.
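For reference, the three-parameter logistic model mentioned above gives the probability of a correct response to item i as

P_i(\theta) = c_i + \frac{1 - c_i}{1 + e^{-a_i(\theta - b_i)}},

with discrimination a_i, difficulty b_i, and pseudo-guessing lower asymptote c_i; this is the standard form (some presentations include a scaling constant D = 1.7 in the exponent).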
Most students answered most items: no item was answered by fewer than 98.8% of the students, and 95.5% of students had no missing responses, 3.2% had one missing response, and only 1.3% had more than one. However, whether students responded to items was related to student characteristics, including gender, whether the student had an individual education plan, and the language spoken at home, and to item characteristics such as item difficulty and the cognitive skills required to answer the item.
Unlike in previous studies of large-scale assessments, the missing response rates were not higher for multiple-choice items appearing later in the timed sections. Instead, the first two items in some sections had higher missing response rates. Examination of the student-level missing response rates, however, showed that when students had high numbers of missing responses, these often represented failures to complete a section of the test. Also, if nonresponse was concentrated in items that required particular skills, the accuracy of the estimates for those skills was lower than for other skills.
The results of this study have implications for test designers who seek to improve provincial large-scale assessments, and for teachers who seek to help students improve their cognitive skills and develop test taking strategies.
|
227 |
The Predictive Validity Of Baskent University Proficiency Exam (BUEPE) Through The Use Of The Three-Parameter IRT Model's Ability Estimates. Yegin, Oya Perim, 01 January 2003.
The purpose of the present study is to investigate the predictive validity of the BUEPE through the use of the three-parameter IRT model's ability estimates.
The study made use of BUEPE September 2000 data, which included the responses of 699 students. The predictive validity was established using the departmental English courses (DEC) passing grades of a total of 371 students.
As a prerequisite analysis, the best-fitting IRT model was determined, first, by checking the assumptions of IRT; second, by analyzing the invariance of ability parameters and item parameters; and third, by interpreting the chi-square statistics.
After the prerequisite analyses, the best-fitting model's ability estimates were correlated with DEC passing grades to investigate the predictive power of the BUEPE on DEC passing grades.
The findings indicated that the minimal guessing assumption of the one- and two-parameter models was not met. In addition, the chi-square statistics indicated a better fit for the three-parameter model. It was therefore concluded that the three-parameter model fit best. The predictive validity analyses revealed that the best predictors of DEC passing grades were the three-parameter model ability estimates. The second-best predictor was the ability estimates obtained from sixty high-information items. In third place, BUEPE total scores and the total scores obtained from the sixty high-information items followed, with nearly the same correlation coefficients. Among the three sub-tests, the reading sub-test was found to be the best predictor of DEC passing grades.
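As background for the "high information items": under the three-parameter model, the information item i contributes at ability \theta is

I_i(\theta) = a_i^2 \, \frac{Q_i(\theta)}{P_i(\theta)} \left( \frac{P_i(\theta) - c_i}{1 - c_i} \right)^2, \qquad Q_i(\theta) = 1 - P_i(\theta),

so the sixty high-information items are those contributing most to measurement precision over the relevant ability range; the abstract does not state how that range was chosen.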
|
228 |
An IRT model to estimate differential latent change trajectories in a multi-stage, longitudinal assessment. Shim, Hi Shin, 08 April 2009.
Repeated measures designs are widely used in educational and psychological research to compare the changes exhibited in response to a treatment. Traditionally, measures of change are found by calculating difference scores (subtracting the observed initial score from the final score) for each person. However, problems such as the reliability paradox and the meaning of change scores arise from using simple difference scores to study change. A new item response theory model is presented that estimates latent change scores instead of difference scores, addresses some of the limitations of using difference scores, and provides a direct comparison of the mean latent changes exhibited by different groups (e.g., females versus males). A simulation-based test was conducted to ascertain the viability of the model, and the results indicate that the parameters of the newly developed model can be estimated accurately. Two sets of analyses were performed on the Early Childhood Longitudinal Study-Kindergarten cohort (ECLS-K) to examine differential growth in math ability between 1) male and female students and 2) Caucasian and African American students from kindergarten through fifth grade.
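The abstract does not give the model's equations; one common way to parameterize latent change of this kind (a sketch, not necessarily the thesis's exact specification) is to model person j's occasion-1 response to item i as P(X_{ij1} = 1) = f(\theta_j - b_i) and the occasion-2 response as

P(X_{ij2} = 1) = f(\theta_j + \delta_j - b_i),

where f is the logistic function, \delta_j is the person's latent change, and group differences are tested through the means of the \delta_j distributions rather than through observed difference scores.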
|
229 |
Supplier performance evaluation in supply chains using item response theory [Avaliação de desempenho de fornecedores em cadeias de suprimentos utilizando a teoria da resposta ao item]. Santos, Kathyana Vanessa Diniz, 28 August 2017.
An adequate selection of suppliers can make a difference in the future of organizations, lowering operating costs, improving product quality, and enabling quick responses to customer demands. In the current business context of supply chains, the appropriate choice of suppliers is essential for good management and for maintaining and improving competitive advantages. The objective of this dissertation is therefore to develop a method to evaluate supplier performance in the context of supply chains using item response theory (IRT). To achieve this goal, 60 supplier performance aspects covering seven dimensions (cost, time, quality, flexibility, innovation, reputation/industry experience, and sustainability) were considered in the formulation of a 67-item questionnaire to evaluate supplier performance. The questionnaire made possible the evaluation of 243 supply links of companies across different sectors and 14 Brazilian states. The evaluation results of the 243 supply links were analyzed using IRT's Graded Response Model (GRM). The GRM establishes a difficulty parameter for each category of the presented items, defining performance levels on an interpretable scale that indicates the aspects served by each evaluated relationship and the aspects in which the evaluated company still needs to evolve. Depending on what the client company expects and prioritizes in the performance of its suppliers, it can associate scale levels (and the presence and/or absence of certain aspects) with decisions about the relationship with the supplier company (deepen the relationship, request changes in behavior, or end the relationship, for example).
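For reference, the Graded Response Model referred to above models the probability of responding in category k or higher of item i as

P^*_{ik}(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_{ik})}},

with category probabilities P(X_i = k \mid \theta) = P^*_{ik}(\theta) - P^*_{i,k+1}(\theta); the per-category difficulty parameters b_{ik} are what anchor the interpretable performance levels of the supplier scale.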
|
230 |
Assessing Dimensionality in Complex Data Structures: A Performance Comparison of DETECT and NOHARM Procedures. January 2011.
The purpose of this study was to investigate the effect of complex structure on dimensionality assessment in compensatory and noncompensatory multidimensional item response theory (MIRT) models of assessment data, using dimensionality assessment procedures based on conditional covariances (i.e., DETECT) and a factor-analytic approach (i.e., NOHARM). The DETECT-based methods typically outperformed the NOHARM-based methods in both two-dimensional (2D) and three-dimensional (3D) compensatory MIRT conditions. The DETECT-based methods yielded a high proportion correct, especially when correlations were .60 or smaller, the data exhibited 30% or less complexity, and the sample size was larger. As the complexity increased and the sample size decreased, performance typically diminished. As the complexity increased, it also became more difficult to label the resulting sets of items from DETECT in terms of the dimensions. DETECT was consistent in the classification of simple items, but less consistent in the classification of complex items. Of the three NOHARM-based methods, χ²_G/D and ALR generally outperformed RMSR. χ²_G/D was more accurate when N = 500 and complexity levels were 30% or lower. As the number of items increased, ALR performance improved at a correlation of .60 and 30% or less complexity. When the data followed a noncompensatory MIRT model, the NOHARM-based methods, specifically χ²_G/D and ALR, were the most accurate of all five methods. The marginal proportions for labeling sets of items as dimension-like were typically low, suggesting that the methods generally failed to label two (three) sets of items as dimension-like in 2D (3D) noncompensatory situations. The DETECT-based methods were more consistent in classifying simple items across complexity levels, sample sizes, and correlations. However, as complexity and correlation levels increased, the classification rates for all methods decreased. In most conditions, the DETECT-based methods classified complex items as consistently as or more consistently than the NOHARM-based methods. In particular, as the complexity, the number of items, and the true dimensionality increased, the DETECT-based methods were notably more consistent than any NOHARM-based method. Despite DETECT's consistency, when data follow a noncompensatory MIRT model, the NOHARM-based methods should be preferred over the DETECT-based methods for assessing dimensionality, due to the poor performance of DETECT in identifying the true dimensionality.
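For reference, the DETECT index for a partition P of the n items is

D(P) = \frac{2}{n(n-1)} \sum_{i<j} \delta_{ij}(P) \, E\big[\mathrm{Cov}(X_i, X_j \mid \Theta)\big],

where \delta_{ij}(P) = +1 if items i and j fall in the same cluster and -1 otherwise; the procedure searches for the partition maximizing D(P), and large maxima signal multidimensionality (Zhang & Stout, 1999). The simulation conditions here vary exactly the quantities this index depends on: the conditional covariances induced by item complexity and the correlations between dimensions.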
|