371

An evaluation of item difficulty and person ability estimation using the multilevel measurement model with short tests and small sample sizes

Brune, Kelly Diane 08 June 2011
Recently, researchers have reformulated Item Response Theory (IRT) models as multilevel models to evaluate clustered data appropriately. Using a multilevel model to obtain item difficulty and person ability parameter estimates that correspond directly with IRT model parameters is often referred to as multilevel measurement modeling. Unlike conventional IRT models, multilevel measurement models (MMMs) can accommodate predictor variables, model clustered data appropriately, and be estimated with non-specialized computer software, including SAS. For example, a three-level model can represent repeated measures (level one) of individuals (level two) who are clustered within schools (level three). The minimum sample size and number of test items that permit reasonable recovery of one-parameter logistic (1-PL) IRT model parameters have not been examined for either the two- or three-level MMM. Researchers (Wright and Stone, 1979; Lord, 1983; Hambleton and Cook, 1983) have found that sample sizes under 200 and fewer than 20 items per test result in poor model fit and poor parameter recovery for dichotomous 1-PL IRT models, even with data that meet model assumptions. This simulation study tested the performance of the two-level and three-level MMM under conditions that crossed three sample sizes (100, 200, and 400), three test lengths (5, 10, and 20 items), three level-3 cluster sizes (10, 20, and 50), and two generated intraclass correlations (.05 and .15). The study demonstrated that the two- and three-level MMMs led to somewhat divergent results for item difficulty and person-level ability estimates. Mean relative item difficulty bias was lower for the three-level model than for the two-level model; the opposite was true for the person-level ability estimates, with a smaller mean relative parameter bias for the two-level model. There was no difference between the two- and three-level MMMs in the school-level ability estimates. Modeling clustered data appropriately, having a minimum total sample size of 100 to accurately estimate level-2 residuals (and of 400 to accurately estimate level-3 residuals), and having at least 20 items will help ensure valid statistical test results.
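
For reference, a sketch of the 1-PL (Rasch) model in the multilevel-logistic form this literature uses; the notation is conventional and assumed here, not quoted from the thesis:

    \operatorname{logit}\,\Pr(Y_{ij} = 1 \mid \theta_j) = \theta_j - b_i, \qquad \theta_j \sim N(0, \sigma_\theta^2)

where b_i is the difficulty of item i and \theta_j is person j's ability, treated as a level-2 random effect over the repeated item responses at level 1. In a three-level version, \theta_{jk} = u_{jk} + w_k with w_k \sim N(0, \sigma_w^2) a random effect for cluster (e.g., school) k.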
372

Effects of sample size, ability distribution, and the length of Markov Chain Monte Carlo burn-in chains on the estimation of item and testlet parameters

Orr, Aline Pinto 25 July 2011
Item Response Theory (IRT) models are the basis of modern educational measurement. To increase testing efficiency, modern tests make ample use of groups of questions associated with a single stimulus (testlets), which violates the IRT assumption of local independence. A set of measurement models, testlet response theory (TRT), has been developed to address such dependency issues. This study investigates the effects of varying sample sizes and Markov Chain Monte Carlo burn-in chain lengths on the accuracy of estimation of a TRT model's item and testlet parameters. The following outcome measures are examined: descriptive statistics, Pearson product-moment correlations between known and estimated parameters, and indices of measurement effectiveness for final parameter estimates.
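
For reference, a sketch of the two-parameter testlet response model (Bradlow, Wainer, & Wang, 1999) that TRT work of this kind builds on; standard notation, not quoted from the thesis:

    \operatorname{logit}\,\Pr(Y_{ij} = 1) = a_i\left(\theta_j - b_i - \gamma_{j\,d(i)}\right)

where a_i and b_i are the discrimination and difficulty of item i, \theta_j is person j's ability, and \gamma_{j\,d(i)} is a person-specific random effect for the testlet d(i) containing item i; setting all \gamma = 0 recovers the ordinary 2PL model with locally independent items.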
373

The Influence of Lexical and Sublexical Factors on Acquired Alexia and Agraphia: An Item-Analysis

Volk, Rebecca Brender January 2009 (has links)
This study used an item-based approach to explore the full range of lexical-semantic (word frequency and imageability) and sublexical (regularity and consistency) characteristics of stimulus items. Oral reading and spelling-to-dictation data from 72 adults with acquired alexia/agraphia due to stroke or progressive aphasia were analyzed to determine the unique influences of lexical-semantic and sublexical variables on performance. Multiple regression analyses were performed for each etiology and lesion group (i.e., perisylvian stroke, extrasylvian stroke, perisylvian atrophy, and extrasylvian atrophy). As expected, word frequency had a significant influence on reading and spelling performance in almost all contexts. Of particular interest was the consistent finding that written language performance associated with left perisylvian damage was influenced primarily by lexical-semantic features of the stimuli (frequency and imageability), whereas performance by those with left extrasylvian damage was strongly influenced by the sublexical feature of sound-spelling regularity and, to a lesser extent, consistency.
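
As an illustration of the item-analysis logic described above, each regression took roughly this form (a sketch; the exact variable coding is an assumption, not the thesis's specification):

    \text{accuracy}_i = \beta_0 + \beta_1\,\text{frequency}_i + \beta_2\,\text{imageability}_i + \beta_3\,\text{regularity}_i + \beta_4\,\text{consistency}_i + \varepsilon_i

with items i as the units of analysis, so each \beta weight indexes the unique influence of one lexical-semantic or sublexical predictor on group accuracy.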
374

A National Survey on Prescribers' Knowledge of and Their Source of Drug-Drug Interaction Information: An Application of Item Response Theory

Ko, Yu January 2006 (has links)
OBJECTIVES: (1) To assess prescribers' ability to recognize clinically significant DDIs, (2) to examine demographic and practice factors that may be associated with prescribers' DDI knowledge, and (3) to evaluate prescribers' perceived usefulness of various DDI information sources. METHODS: This study used a questionnaire mailed to a national sample of prescribers selected on the basis of their past history of DDI prescribing, which was determined using data from a pharmacy benefit manager covering over 50 million lives. The questionnaire included 14 drug-drug pairs that tested prescribers' ability to recognize clinically important DDIs and five 5-point Likert-type questions that assessed prescribers' perceived usefulness of DDI information provided by various sources. Demographic and practice characteristics were collected as well. Rasch analysis was used to evaluate the knowledge and usefulness questions. RESULTS: Completed questionnaires were obtained from 950 prescribers (overall response rate: 7.9%). The number of drug pairs correctly classified by the prescribers ranged from zero to thirteen, with a mean of 6 pairs (42.7%). The percentage of prescribers who correctly classified specific drug pairs ranged from 18.2% for the warfarin-cimetidine pair to 81.2% for the acetaminophen with codeine-amoxicillin pair. Half of the drug-pair questions were answered "not sure" by over one-third of the respondents; two of those pairs were contraindicated combinations. Rasch analysis of the knowledge and usefulness questions revealed satisfactory model-data fit, with person reliabilities of 0.72 and 0.61, respectively. A multiple regression analysis revealed that specialists were less likely than generalists to correctly identify interactions. Other important predictors of DDI knowledge included having seen harm caused by a DDI and the extent to which the risk of DDIs affected the prescriber's drug selection. ANOVA with the post-hoc Scheffé test indicated that prescribers considered DDI information provided by "other" sources more useful than that provided by computerized alert systems. CONCLUSIONS: This study suggests that prescribers' DDI knowledge may be inadequate; for the drug interactions evaluated, generalists performed better than specialists. In addition, this study presents an application of IRT analysis to knowledge and attitude measurement in health science research.
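
For reference, the person (separation) reliability reported above (0.72 and 0.61) is conventionally computed in Rasch analysis as the proportion of observed person-measure variance not attributable to measurement error; a sketch of the standard formula, not quoted from the dissertation:

    R_{\text{person}} = \frac{\sigma^2_{\hat\theta} - \overline{SE^2}}{\sigma^2_{\hat\theta}}

where \sigma^2_{\hat\theta} is the variance of the estimated person measures and \overline{SE^2} is the mean squared standard error of those estimates.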
375

Using Three Different Categorical Data Analysis Techniques to Detect Differential Item Functioning

Stephens-Bonty, Torie Amelia 16 May 2008
Diversity in the population, along with diverse testing uses, has resulted in smaller identified groups of test takers; in addition, computer adaptive testing sometimes results in a relatively small number of items being used for a particular assessment. Statistical techniques that can effectively detect differential item functioning (DIF) when the population is small and/or the assessment is short are therefore needed. Identification of empirically biased items is a crucial step in creating equitable and construct-valid assessments. Parshall and Miller (1995) compared the conventional asymptotic Mantel-Haenszel (MH) procedure with the exact test (ET) for the detection of DIF with small sample sizes. Several studies have since compared the performance of MH to logistic regression (LR) under a variety of conditions; both Swaminathan and Rogers (1990) and Hidalgo and López-Pina (2004) demonstrated that MH and LR were comparable in their detection of items with DIF. This study extended that work by comparing the performance of MH, ET, and LR when the sample size is small and the test length is short. The purpose of this Monte Carlo simulation study was to expand on the research done by Parshall and Miller (1995) by examining power, and power with effect size measures, for each of the three DIF detection procedures. The following variables were manipulated: focal group sample size, percent of items with DIF, and magnitude of DIF. For each condition, a small reference group of 200 was used along with a short, 10-item test. The results demonstrated that, in general, LR was slightly more powerful in detecting items with DIF. In most conditions, however, power was well below the acceptable rate of 80%. As the size of the focal group and the magnitude of DIF increased, the three procedures were more likely to reach acceptable power, and all three demonstrated the highest power for the most discriminating item. Collectively, the results from this research provide information on DIF detection with small sample sizes.
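
For reference, the MH procedure compared here rests on the common odds ratio across ability strata k; standard notation, not drawn from the dissertation:

    \hat\alpha_{MH} = \frac{\sum_k A_k D_k / N_k}{\sum_k B_k C_k / N_k}

where A_k and B_k are the reference group's correct and incorrect counts at stratum k, C_k and D_k the focal group's, and N_k the stratum total; ETS reports this on the delta scale as \Delta_{MH} = -2.35\,\ln\hat\alpha_{MH}. The LR approach instead tests the group and group-by-ability terms in \operatorname{logit}\,\Pr(Y = 1) = \beta_0 + \beta_1 X + \beta_2 g + \beta_3 X g, with X the matching (total) score and g group membership.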
376

Detecting Inaccurate Response Patterns in Korean Military Personality Inventory: An Application of Item Response Theory

Hong, Seunghwa 16 December 2013
There are concerns about the risk of inaccurate responses in personality data, because inaccurate responses have negative consequences in individual selection contexts. In the military context especially, personality scores contaminated by inaccurate responses can result in the selection of inappropriate personnel or allow draft evaders to avoid their military duty. This study conducted an IRT-based person-fit analysis of dichotomous response data from the Korean Military Personality Inventory. A 2PL model was applied to the data, and the person-fit index l_z was used to detect aberrant respondents. Based on each respondent's l_z value, potentially inaccurate respondents were identified, and person response curves (PRCs) were assessed to diagnose possible sources of the aberrant response patterns. This study with empirical military data shows that person-fit analysis using l_z is an applicable and practical method for detecting inaccurate response patterns in personality-based personnel selection.
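
For reference, the person-fit index l_z (Drasgow, Levine, & Williams, 1985) in its standard form (a sketch, not quoted from the thesis): the log-likelihood of response pattern u under the fitted model is standardized as

    l_0 = \sum_i \left[ u_i \ln P_i(\theta) + (1 - u_i)\ln\left(1 - P_i(\theta)\right) \right], \qquad l_z = \frac{l_0 - E(l_0)}{\sqrt{\operatorname{Var}(l_0)}}

with E(l_0) = \sum_i \left[ P_i \ln P_i + (1 - P_i)\ln(1 - P_i) \right] and \operatorname{Var}(l_0) = \sum_i P_i(1 - P_i)\left[\ln\frac{P_i}{1 - P_i}\right]^2; large negative l_z values flag aberrant response patterns.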
377

Measuring Dementia of the Alzheimer Type More Precisely

Lowe, Deborah Anne 14 March 2013
Alzheimer’s disease (AD) progressively impairs cognitive and functional abilities. Research on pharmacological treatment of AD is shifting to earlier forms of the disease, including preclinical stages. However, assessment methods traditionally used in clinical research may be inappropriate for these populations. The Alzheimer Disease Assessment Scale-cognitive (ADAS-cog), a commonly used cognitive battery in AD research, is most sensitive in the moderate range of cognitive impairment, and it focuses on immediate recall and recognition aspects of memory rather than retention and delayed recall. As clinical trials for dementia continue to focus on prodromal stages of AD, instruments need to be retooled to focus on cognitive abilities more prone to change in the earliest stages of the disease. One such domain is delayed recall, which is differentially sensitive to decline in the earliest stages of AD. A supplemental delayed recall subtest for the ADAS-cog is commonly implemented, but we do not know precisely where along the spectrum of cognitive dysfunction this subtest yields incremental information beyond what is gained from the standard ADAS-cog. An item response theory (IRT) approach can address this question in a psychometrically rigorous way. This study’s aims were twofold: (1) to examine where along the AD spectrum the delayed recall subtest yields optimal information about cognitive dysfunction, and (2) to determine whether adding delayed recall to the ADAS-cog improves prediction of functional outcomes, specifically patients’ ability to complete basic and instrumental activities of daily living. Results revealed differential functioning of ADAS-cog subtests across the dimension of cognitive impairment. The delayed recall subtest provided optimal information, and increased the ADAS-cog’s measurement precision, in the relatively mild range of cognitive dysfunction. Moreover, the addition of delayed recall to the ADAS-cog, consistent with my hypothesis, increased covariation with instrumental but not basic activities of daily living. These findings provide evidence that the delayed recall subtest slightly improves the ADAS-cog’s ability to capture information about cognitive impairment in the mild range of severity and thereby improves prediction of instrumental functional deficits.
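
For reference, "optimal information" here refers to the IRT information function; in standard (assumed) notation, not quoted from the thesis:

    I(\theta) = \sum_i I_i(\theta), \qquad I_i(\theta) = a_i^2\, P_i(\theta)\left(1 - P_i(\theta)\right) \text{ for a 2PL item}

so the standard error of measurement at severity level \theta is SE(\theta) = 1/\sqrt{I(\theta)}; a supplemental subtest adds precision wherever its items' information peaks, which is the sense in which delayed recall sharpens measurement in the mild range.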
378

Effects of practice and instruction on test performance: Some empirical studies and analyses of the significance of practice and instruction for test performance

Henriksson, Widar January 1981 (has links)
This report is a compilation of studies, all of which deal with the effects of practice and instruction on a person's test results. A literature review, organized around a dichotomy of practice and instruction under short-term and long-term conditions and taking account of study design and certain characteristics of the individuals, yielded both general and more specific statements about when practice and instruction effects arise. All of the empirical investigations used quantitative-numerical tests (KVR and NOG), one of which (NOG) is part of the Swedish Scholastic Aptitude Test. The basic aim was to investigate whether short-term practice (KTÖ) and short-term instruction (KTI) could affect scores on these tests. Practice was defined as taking a pre-test. In the first three studies, the instruction consisted of two phases: the first concerning general test-taking strategies and the second concerning a specially adapted problem-solving strategy. A logically constructed, sequential problem-solving strategy for NOG served as the basis for this special strategy: it was delivered directly to the individual via instruction in the first three studies, and indirectly via the item format in the last three studies. Among the findings, the literature review highlighted the importance of a person's knowledge of and familiarity with tests and examinations, in both a specific and a more general sense. If a person has little or no prior experience of tests, there is some probability of a higher score due to KTÖ or KTI; this probability is reduced considerably if the person has some or relatively great prior experience. This was verified indirectly in the empirical studies, which were mainly carried out on individuals who could be classified as experienced with and familiar with tests: no practice or instruction effects were found in the first three studies, nor indirectly in the last three studies, which used an item format constructed according to the sequential problem-solving strategy. To provide an interpretive frame for practice and instruction effects on tests, the report builds on a theoretically constructed model whereby the observed effects are related first to the individual's prior experience and second to whether a test consists of correctly or incorrectly constructed items.
379

The enactment effect: studies of a memory phenomenon

Nyberg, Lars January 1993 (has links)
Diss. (summary) Umeå: Umeå universitet, 1993; accompanied by 4 papers.
380

Item response theory and factor analysis applied to the Neuropsychological Symptom Scale (NSS)

Lutz, Jacob T. 21 July 2012
The Neuropsychological Symptom Inventory (NSI; Rattan, Dean, & Rattan, 1989), a self-report measure of psychiatric and neurological symptoms, was revised for presentation in an electronic format. This revised instrument, the Neuropsychological Symptom Scale (NSS; Dean, 2010), was administered to 1,141 adult volunteers from a medium-sized Midwestern university. The collected data were subjected to exploratory factor analysis, which suggested three primary factors related to emotional, cognitive, and somatosensory functioning. The items on the NSS were then organized into three subscales reflecting these areas of functioning; a fourth, experimental subscale was created to facilitate the collection of data on items that did not load on any of the three primary subscales. Item Response Theory (IRT) and Classical Test Theory (CTT) approaches were then applied and compared as means of developing standard scores on the three primary subscales of the NSS. The results of these analyses are provided, along with recommendations for the further development of the NSS as an assessment tool.
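
For reference, a sketch of the two scoring approaches being compared (conventional formulas; the thesis's exact scaling choices are not specified here): under CTT, a raw subscale score X is standardized against the normative sample as

    z = \frac{X - \bar{X}}{s}, \qquad T = 50 + 10z

whereas under IRT a latent trait estimate \hat\theta (e.g., maximum likelihood or EAP under the fitted model) is rescaled, e.g. T_{IRT} = 50 + 10\,\hat\theta, weighting items by their estimated parameters rather than summing them equally.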
