1 |
Application of Item Response Theory Models to the Algorithmic Detection of Shift Errors on Paper and Pencil Tests
Cook, Robert Joseph, 01 September 2013
On paper-and-pencil multiple-choice tests, the potential for examinees to mark their answers in incorrect locations poses a serious threat to the validity of test score interpretations. When an examinee skips one or more items (i.e., answers out of sequence) but fails to reflect the size of that skip accurately on the answer sheet, the result can be a string of misaligned responses called a shift error. Shift errors can cause correct answers to be marked as incorrect, leading to underestimation of an examinee's true ability. Despite the movement toward computerized testing in recent years, paper-and-pencil multiple-choice tests remain pervasive in many high-stakes assessment settings, including K-12 testing (e.g., MCAS) and college entrance exams (e.g., SAT), leaving a continuing need to address issues that arise within this format.
Techniques for detecting aberrant response patterns are well established but do little to identify the reasons for the aberrance, limiting options for addressing the misfitting patterns. While some work has been done to detect and address specific forms of aberrant response behavior, little has been done in the area of shift error detection, leaving substantial room for improvement in addressing this source of aberrance. The ability to accurately detect such construct-irrelevant errors, and either adjust scores to reflect examinee ability more accurately or flag examinees with inaccurate scores for removal from the dataset and retesting, would improve the validity of important decisions based on test scores and could improve model fit by allowing more accurate item parameter and ability estimation.
The purpose of this study is to investigate new algorithms for shift error detection that employ IRT models to determine probabilistically whether misfitting patterns are likely to be shift errors. The study examines a matrix of detection algorithms, probabilistic models, and person-parameter estimation methods, testing combinations of these factors for their selectivity (i.e., true positives vs. false positives), sensitivity (i.e., true shift errors detected vs. undetected), and robustness to parameter bias, all within a carefully manipulated, multifaceted simulation environment. The investigation addresses the following questions, applicable across detection methods, bias reduction procedures, shift conditions, and ability levels, but stated generally as: 1) How sensitively and selectively can an IRT-based probabilistic model detect shift errors across the full range of probabilities under specific conditions? 2) How robust is each detection method to the parameter bias introduced by shift errors? 3) How well does the detection method detect shift errors compared to other, more general indices of person fit? 4) What is the impact on bias of making proposed corrections to detected shift errors? 5) To what extent does shift error, as detected by the method, occur within an empirical dataset?
Results show that the proposed methods can detect shift errors at reasonably high rates with only a minimal number of false positives, that detection improves for longer shift errors, and that examinee ability is a major determinant of the effectiveness of the detection techniques. Although some detection power is lost to person-parameter bias, this loss is minimal for all but the shortest shift errors. Application to empirical data also proved effective, though some discrepancies in projected total counts suggest that refinements to the technique are needed. A general person-fit statistic proved completely ineffective at detecting examinees with shift errors, underscoring the value of shift-error-specific detection methods.
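The abstract describes the detection approach only at a high level. One way to picture an IRT-based shift check is as a likelihood-ratio comparison between scoring the answer sheet as marked and scoring it under a hypothesized realignment. The sketch below is purely illustrative: the function names, the 3PL parameterization, and the windowing scheme are assumptions, not the dissertation's actual algorithm.

```python
import math

def p_correct(theta, a, b, c):
    # 3PL item response function (1.7 is the conventional scaling constant)
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

def log_lik(scored, theta, items):
    # Log-likelihood of a 0/1 scored response string at ability theta
    ll = 0.0
    for u, (a, b, c) in zip(scored, items):
        p = p_correct(theta, a, b, c)
        ll += math.log(p if u == 1 else 1.0 - p)
    return ll

def shift_llr(marked, key, theta, items, start, size):
    """Log-likelihood ratio for a hypothesized shift of `size` positions
    beginning at item `start`: realigned scoring vs. as-marked scoring.
    The last `size` items are dropped, since they have no realigned mark."""
    n = len(key) - size
    as_marked = [int(marked[i] == key[i]) for i in range(n)]
    realigned = [int(marked[i] == key[i]) if i < start
                 else int(marked[i + size] == key[i])
                 for i in range(n)]
    return log_lik(realigned, theta, items[:n]) - log_lik(as_marked, theta, items[:n])

# A high-ability examinee whose answers from item 3 onward slipped one row down:
items = [(1.0, 0.0, 0.2)] * 10          # hypothetical (a, b, c) per item
key = list("ABCDEABCDE")
marked = ["A", "B", "C", "E", "D", "E", "A", "B", "C", "D"]
print(shift_llr(marked, key, theta=1.0, items=items, start=3, size=1))
```

A large positive log-likelihood ratio means the realigned scoring explains the response string far better than the as-marked scoring, flagging a probable shift; a usable detector would additionally need a decision threshold calibrated against the false-positive rate, which is exactly the selectivity/sensitivity trade-off the study evaluates.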
|
2 |
An Examination of the Psychometric Properties of the Trauma Inventory for Partners of Sex Addicts (TIPSA)
Stokes, Steven Scott, 01 July 2017
This study examined the psychometric properties of the Trauma Inventory for Partners of Sex Addicts (TIPSA). Using the Nominal Response Model (NRM), I examined several aspects of item and option functioning, including discrimination, empirical category ordering, and information. Category Boundary Discrimination (CBD) parameters were calculated to determine the extent to which respondents distinguished between adjacent response categories. Indistinguishable categories were collapsed through recoding, as were empirically disordered response categories. Findings showed that recoding resolved technical functioning issues in some items, and also exposed items (and perhaps option anchors) that were likely poorly conceived from the outset. In addition, recoding reduced nuisance and error variance only marginally, and the relative standing of respondents on the trait continuum remained largely unchanged. Items in need of modification or removal were identified, and issues of content validity are discussed.
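The NRM machinery the abstract relies on can be stated compactly: each category's probability is a softmax over linear functions of the trait, and a common operationalization of CBD is the difference between adjacent slope parameters. The sketch below is a minimal illustration under those assumptions; the function names and parameter values are mine, not the study's.

```python
import math

def nrm_probs(theta, a, c):
    """Nominal Response Model category probabilities:
    P_k(theta) = exp(a_k*theta + c_k) / sum_j exp(a_j*theta + c_j)."""
    z = [ak * theta + ck for ak, ck in zip(a, c)]
    m = max(z)                        # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cbd(a):
    """Category Boundary Discriminations as adjacent slope differences.
    A CBD near zero suggests respondents do not distinguish the two
    categories, making them candidates for collapsing via recoding."""
    return [a[k + 1] - a[k] for k in range(len(a) - 1)]

# Illustration: slopes increasing with category imply empirical ordering
a = [0.0, 0.4, 1.3, 2.1]              # hypothetical slope parameters
c = [0.0, 0.5, 0.2, -0.8]             # hypothetical intercepts
print(nrm_probs(1.0, a, c))
print(cbd(a))
```

Disordered categories show up here as slopes out of rank order (a negative CBD), which is the empirical-ordering check the study applied before collapsing categories.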
|
3 |
An Examination of the Psychometric Properties of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors: An Item Response Theory Approach
Moulton, Sara E., 01 December 2016
This research examined the psychometric properties of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors (SRSS-IE) using Item Response Theory (IRT) methods with a sample of 2,122 middle school students. The SRSS-IE is a recently revised screening instrument aimed at identifying students who are potentially at risk for emotional and behavioral disorders (EBD). The research comprises two studies. Study 1 used the Nominal Response and Generalized Partial Credit models to evaluate SRSS-IE items in terms of how well each item's response options functioned as intended by the scale developers and how well those options discriminated among students exhibiting varying levels of EBD risk. Results indicated that the four-option response configuration of the SRSS-IE items may not adequately discriminate among the frequencies of externalizing and internalizing behaviors demonstrated by middle school students; recommendations for revising the response options or the scale scoring are discussed. Study 2 used differential item functioning (DIF) and differential step functioning (DSF) methods to examine differences in item and response option functioning by student gender. Additionally, test information functions (TIFs) were used to determine whether preliminary cut-score recommendations differ by gender. Results indicate that two SRSS-IE items systematically favor males over females and one item systematically favors females over males. Examination of the TIFs also demonstrated different degrees of measurement precision at various levels of theta for males and females on both the externalizing and internalizing constructs. Implications are discussed in relation to possible revisions of the SRSS-IE items, cut scores, or scale scoring procedures.
|
4 |
Building a validity argument for the listening component of the Test de connaissance du français in the context of Quebec immigration
Arias De Los Santos, Angel Manuel, 03 1900
No description available.
|