Global ETD Search

191	Teorie odpovědí na položku a její aplikace v oblasti Národních srovnávacích zkoušek / Item Response Theory and its Application in the National Comparative Exams Fiřtová, Lenka January 2012 Item Response Theory, a psychometric paradigm for test development and evaluation, comprises a collection of models which enable the estimation of the probability of a correct answer to a particular item in the test as a function of the item parameters and the level of a respondent's underlying ability. This paper, written in cooperation with the company Scio, is focused on the application of Item Response Theory in the context of the National Comparative Exams. Its aim is to propose a test-equating procedure which would ensure a fair comparison of respondents' scores in the Test of General Academic Prerequisites regardless of the particular test administration.
192	Accuracy and variability of item parameter estimates from marginal maximum a posteriori estimation and Bayesian inference via Gibbs samplers Wu, Yi-Fang 01 August 2015 Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT (Birnbaum, 1968; Lord, 1980) model, the current study examines the accuracy and variability of the item parameter estimates from the marginal maximum a posteriori estimation via an expectation-maximization algorithm (MMAP/EM) and the Markov chain Monte Carlo Gibbs sampling (MCMC/GS) approach. In the study, the various factors which have an impact on the accuracy and variability of the item parameter estimates are discussed, and then further evaluated through a large scale simulation. The factors of interest include the composition and length of tests, the distribution of underlying latent traits, the size of samples, and the prior distributions of discrimination, difficulty, and pseudo-guessing parameters. The results of the two estimation methods are compared to determine the lower limit--in terms of test length, sample size, test characteristics, and prior distributions of item parameters--at which the methods can satisfactorily recover item parameters and function efficiently in practice. For practitioners, the results help to define limits on the appropriate use of BILOG-MG (which implements MMAP/EM) and to assist in deciding the utility of OpenBUGS (which carries out MCMC/GS) for item parameter estimation in practice. Gibbs sampling item parameter estimation item response theory marginal maximum a posteriori estimation Markov chain Monte Carlo Educational Psychology
193	Towards optimal measurement and theoretical grounding of L2 English elicited imitation: Examining scales, (mis)fits, and prompt features from item response theory and random forest approaches Ji-young Shin (11560495) 14 October 2021 The present dissertation investigated the impact of scales / scoring methods and prompt linguistic features on the measurement quality of L2 English elicited imitation (EI). Scales / scoring methods are an important feature for the validity and reliability of L2 EI tests, but less is known about them (Yan et al., 2016). Prompt linguistic features are also known to influence EI test quality, particularly item difficulty, but item discrimination or corpus-based, fine-grained measures have rarely been incorporated into examining the contribution of prompt linguistic features. The current study addressed these research needs, using item response theory (IRT) and random forest modeling. Data consisted of 9,348 oral responses to forty-eight items, including EI prompts, item scores, and rater comments, which were collected from 779 examinees of an L2 English EI test at Purdue University. First, the study explored the current and alternative EI scales / scoring methods that measure grammatical / semantic accuracy, focusing on optimal IRT-based measurement qualities (RQ1 through RQ4 in Phase Ⅰ). 
Next, the project identified important prompt linguistic features that predict EI item difficulty and discrimination across different scales / scoring methods and proficiency, using multi-level modeling and random forest regression (RQ5 and RQ6 in Phase Ⅱ). The main findings were (although not limited to): collapsing exact repetition and paraphrase categories led to more optimal measurement (i.e., adequacy of item parameter values, category functioning, and model / item / person fit) (RQ1); there were fewer misfitting persons with lower proficiency and higher frequency of unexpected responses in the extreme categories (RQ2); the inconsistency of qualitatively distinguishing semantic errors and the wide range of grammatical accuracy in the minor error category contributed to misfit (RQ3); a quantity-based, 4-category ordinal scale outperformed quality-based or binary scales (RQ4); sentence length significantly explained item difficulty only, with small variance explained (RQ5); corpus-based lexical measures and phrase-level syntactic complexity were important to predicting item difficulty, particularly for the higher ability level. The findings have implications for EI scale / item development in human and automatic scoring settings and L2 English proficiency development. Elicited imitation scales scoring methods prompt linguistic features item response theory random forest regression misfit analysis
194	A Comparison of Traditional Norming and Rasch Quick Norming Methods Bush, Joan Spooner 08 1900 The simplicity and ease of use of the Rasch procedure is a decided advantage. The test user needs only two numbers: the frequency of persons who answered each item correctly and the Rasch-calibrated item difficulty, usually a part of an existing item bank. Norms can be computed quickly for any specific group of interest. In addition, once the selected items from the calibrated bank are normed, any test, built from the item bank, is automatically norm-referenced. Thus, it was concluded that the Rasch quick norm procedure is a meaningful alternative to traditional classical true score norming for test users who desire normative data. norming method rasch quick norming method Norm-referenced tests. Item response theory. Educational tests and measurements. Rasch, G. (Georg), 1901-1980.
195	Comparing Fountas and Pinnell's Reading Levels to Reading Scores on the Criterion Referenced Competency Test Walker, Shunda F. 01 January 2016 Reading competency is related to individuals' success at school and in their careers. Students who experience significant problems with reading may be at risk of long-term academic and social problems. High-quality measures that determine student progress toward curricular goals are needed for early identification and interventions to improve reading abilities and ultimately prevent subsequent failure in reading. The purpose of this quantitative nonexperimental ex post facto research study was to determine whether a correlation existed between student achievement scores on the Fountas and Pinnell Reading Benchmark Assessment and reading comprehension scores on the Criterion Referenced Competency Test (CRCT). Item response theory served as the conceptual framework for examining whether a relationship exists between Fountas and Pinnell Benchmark Instructional Reading Levels and the reading comprehension scores on the CRCT of students in Grades 3, 4, and 5 in the year 2013-2014. Archival data for 329 students in Grades 3-5 were collected and analyzed through Spearman's rank-order correlation. The results showed positive relationships between the scores. The findings promote positive social change by supporting the use of benchmark assessment data to identify at-risk reading students early. assessment Benchmark Test data driven decision making formative assessment Fountas and Pinnell item response theory Education Liberal Studies
196	Exploring the Item Difficulty and Other Psychometric Properties of the Core Perceptual, Verbal, and Working Memory Subtests of the WAIS-IV Using Item Response Theory Schleicher-Dilks, Sara Ann 01 January 2015 The ceiling and basal rules of the Wechsler Adult Intelligence Scale – Fourth Edition (WAIS-IV; Wechsler, 2008) only function as intended if subtest items proceed in order of difficulty. While many aspects of the WAIS-IV have been researched, there is no literature about subtest item difficulty and precise item difficulty values are not available. The WAIS-IV was developed within the framework of Classical Test Theory (CTT) and item difficulty was most often determined using p-values. One limitation of this method is that item difficulty values are sample dependent. Both standard error of measurement, an important indicator of reliability, and p-values change when the sample changes. A different framework within which psychological tests can be created, analyzed and refined is called Item Response Theory (IRT). IRT places items and person ability onto the same scale using linear transformations and links item difficulty level to person ability. As a result, IRT is said to produce sample-independent statistics. Rasch modeling, a form of IRT, is a one-parameter logistic model that is appropriate for items with only two response options and assumes that the only factors affecting test performance are characteristics of items, such as their difficulty level or their relationship to the construct being measured by the test, and characteristics of participants, such as their ability levels. The partial credit model is similar to the standard dichotomous Rasch model, except that it is appropriate for items with more than two response options. 
Proponents of the standard dichotomous Rasch model argue that it has distinct advantages over both CTT-based methods and other IRT models (Bond & Fox, 2007; Embretson & Reise, 2000; Furr & Bacharach, 2013; Hambleton & Jones, 1993) because of the principle of monotonicity, also referred to as specific objectivity, the principle of additivity or double cancellation, which “establishes that two parameters are additively related to a third variable” (Embretson & Reise, 2000, p. 148). In other words, because of the principle of monotonicity, in Rasch modeling, the probability of correctly answering an item is an additive function of the individual’s ability, or trait level, and the item’s degree of difficulty. As ability increases, so does an individual’s probability of answering that item correctly. Because only item difficulty and person ability affect an individual’s chance of correctly answering an item, inter-individual comparisons can be made even if individuals did not receive identical items or items of the same difficulty level. This is why Rasch modeling is referred to as a test-free measurement. The purpose of this study was to apply a standard dichotomous Rasch model or partial credit model to the individual items of seven core perceptual, verbal and working memory subtests of the WAIS-IV: Block Design, Matrix Reasoning, Visual Puzzles, Similarities, Vocabulary, Information, Arithmetic, Digits Forward, Digits Backward and Digit Sequencing. Results revealed that WAIS-IV subtests fall into one of three categories: optimally ordered, near optimally ordered and sub-optimally ordered. Optimally ordered subtests, Digits Forward and Digits Backward, had no disordered items. Near optimally ordered subtests were those with one to three disordered items and included Digit Sequencing, Arithmetic, Similarities and Block Design. 
Sub-optimally ordered subtests consisted of Matrix Reasoning, Visual Puzzles, Information and Vocabulary, with the number of disordered items ranging from six to 16. Two major implications of the results of this study were considered: the impact on individuals’ scores and the impact on overall test administration time. While the number of disordered items ranged from 0 to 16, the overall impact on raw scores was deemed minimal. Because of where the disordered items occur in the subtest, most individuals are administered all the items that they would be expected to answer correctly. A one-point reduction in any one subtest is unlikely to significantly affect overall index scores, which are the scores most commonly interpreted in the WAIS-IV. However, if an individual received a one-point reduction across all subtests, this may have a more noticeable impact on index scores. In cases where individuals discontinue before having a chance to answer items that were easier, clinicians may consider testing the limits. While this would have no impact on raw scores, it may provide clinicians with a better understanding of individuals’ true abilities. Based on the findings of this study, clinicians may consider administering only certain items in order to test the limits, based on the items’ difficulty value. This study found that the start point for most subtests is too easy for most individuals. For some subtests, most individuals may be administered more than 10 items that are too easy for them. Other than increasing overall administration time, it is not clear what impact, if any, this has. However, it does suggest the need to reevaluate current start items so that they are the true basal for most people. Future studies should break standard test administration by ignoring basal and ceiling rules to collect data on more items. 
In order to help clarify why some items are more or less difficult than would be expected given their ordinal rank, future studies should include a qualitative aspect, where, after each subtest, individuals are asked to describe what they found easy and difficult about each item. Finally, future research should examine the effects of item ordering on participant performance. While this study revealed that only minimal reductions in index scores likely result from prematurely stopping test administration, it is not known if disordering has other impacts on performance, perhaps by increasing or decreasing an individual’s confidence. item difficulty item response theory psychometric properties Rasch model WAIS-IV Psychology
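Note on entry 196: the abstract describes the dichotomous Rasch model as one in which the chance of a correct answer depends only on person ability and item difficulty, combined additively. A minimal Python sketch of that relationship is given here, assuming the standard logistic form of the model with ability and difficulty on the same logit scale. It is an illustration only, not code from the dissertation, and the names rasch_probability and log_odds are hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Assumes ability and difficulty are expressed on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def log_odds(probability: float) -> float:
    """Log-odds (logit) of a probability strictly between 0 and 1."""
    return math.log(probability / (1.0 - probability))


if __name__ == "__main__":
    # Hypothetical item difficulty and a range of hypothetical abilities.
    difficulty = 0.5
    for ability in (-1.0, 0.5, 2.0):
        p = rasch_probability(ability, difficulty)
        # The log-odds recover ability minus difficulty: the additive structure.
        print(f"ability={ability:+.1f}  P(correct)={p:.3f}  log-odds={log_odds(p):+.3f}")
```

In this sketch the log-odds of a correct answer equal ability minus difficulty, which is the additive structure the abstract refers to, and the probability rises as ability rises for a fixed item.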
197	Assessing the Absolute and Relative Performance of IRTrees Using Cross-Validation and the RORME Index DiTrapani, John B. 03 September 2019 No description available. Quantitative Psychology Item response theory cross-validation model selection model fit item response tree models quantitative psychology
198	Joint Analysis of Social and Item Response Networks with Latent Space Models Wang, Shuo January 2019 No description available. Statistics
199	Establishing Roots Before Branching Out: Parameter Recovery in Item Response Tree Models Ryan, Tyler 25 May 2023 No description available. Cognitive Psychology Personality Psychological Tests Psychology Quantitative Psychology Statistics Item Response Theory Item Response Trees IRTrees Measurement Psychometrics Simulation
200	Essays on Noncognitive Skills Nikolaou, Dimitrios 09 August 2013 No description available. Economics Noncognitive skills life satisfaction child development item response theory gender wage gap cognitive skills occupational choice
