31

Extended Rasch Modeling: The eRm Package for the Application of IRT Models in R

Mair, Patrick, Hatzinger, Reinhold 22 February 2007 (has links) (PDF)
Item response theory (IRT) models are increasingly becoming established in social science research, particularly in the analysis of performance or attitudinal data in psychology, education, medicine, marketing, and other fields where testing is relevant. We propose the R package eRm (extended Rasch modeling) for computing Rasch models and several extensions. A main characteristic of some IRT models, the Rasch model being the most prominent, is the separation of two kinds of parameters: one describing qualities of the subject under investigation, the other relating to qualities of the situation under which the response of a subject is observed. Using conditional maximum likelihood (CML) estimation, both types of parameters may be estimated independently of each other. IRT models are well suited to cope with dichotomous and polytomous responses, where the response categories may be unordered as well as ordered. The incorporation of linear structures allows for modeling the effects of covariates and enables the analysis of repeated categorical measurements. The eRm package fits the following models: the Rasch model, the rating scale model (RSM), and the partial credit model (PCM), as well as linear reparameterizations through covariate structures such as the linear logistic test model (LLTM), the linear rating scale model (LRSM), and the linear partial credit model (LPCM). We use a unitary, efficient CML approach to estimate the item parameters and their standard errors. Graphical and numeric tools for assessing goodness-of-fit are provided. (authors' abstract)
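A minimal sketch in R of how the models named in the abstract are fit with eRm; the response matrix is simulated here with the package's own sim.rasch() purely for illustration and stands in for real test data.

    ## Minimal eRm sketch: the data are simulated, not from any real test.
    library(eRm)

    resp <- sim.rasch(200, 10)       # 200 persons x 10 dichotomous items
    rm.fit <- RM(resp)               # Rasch model, item parameters via CML
    summary(rm.fit)                  # item parameters with standard errors

    pp <- person.parameter(rm.fit)   # person parameters, estimated separately
    itemfit(pp)                      # numeric goodness-of-fit diagnostics
    plotICC(rm.fit)                  # graphical diagnostics (item characteristic curves)

    ## Polytomous data and linear reparameterizations use the same interface:
    ## RSM(resp), PCM(resp), and LLTM(resp, W) for a design matrix W.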
32

Multiple Choice and Constructed Response Tests: Do Test Format and Scoring Matter?

Kastner, Margit, Stangl, Barbara 10 March 2011 (has links) (PDF)
Problem Statement: Nowadays, multiple choice (MC) tests are very common and replace many constructed response (CR) tests. However, the literature reveals no consensus on whether both test formats are equally suitable for measuring students' ability or knowledge. This may be because studies comparing test formats often specify neither the type of MC question nor the scoring rule used. Hence, educators have no guidelines on which test format or scoring rule is appropriate. Purpose of Study: The study focuses on the comparison of CR and MC tests. More precisely, short answer questions are contrasted with equivalent MC questions with multiple responses, which are graded with three different scoring rules. Research Methods: An experiment was conducted based on three instruments: a CR and an MC test using a similar stem to ensure that the questions are of an equivalent level of difficulty. This procedure enables the comparison of the scores students gained in the two forms of examination. Additionally, a questionnaire was handed out for further insights into students' learning strategy, test preference, motivation, and demographics. In contrast to previous studies, the present study applies the many-facet Rasch measurement approach for analyzing the data, which improves the reliability of an assessment and can be applied to small datasets. Findings: Results indicate that CR tests are equivalent to MC tests with multiple responses if Number Correct (NC) scoring is used. An explanation seems straightforward: the grader of the CR tests did not penalize wrong answers and rewarded partially correct answers, following the same logic as NC scoring. All other scoring methods, such as the All-or-Nothing or University-Specific rule, neither reward partial knowledge nor penalize guessing. Therefore, these methods are stricter than NC scoring or CR tests and cannot be used interchangeably with them. Conclusions: CR tests can be replaced by MC tests with multiple responses if NC scoring is used, because the multiple response format measures more complex thinking skills than conventional MC questions. Hence, educators can take advantage of low grading costs, consistent grading, no scoring biases, and greater coverage of the syllabus, while students benefit from timely feedback. (authors' abstract)
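An illustration in R of how the scoring rules compared above can diverge on a single multiple-response item; the five-option item, the answer key, and the exact rule definitions are assumptions for this sketch, not the study's actual instruments.

    ## One hypothetical multiple-response item: which options did the
    ## student judge correctly?
    key    <- c(A = TRUE, B = FALSE, C = TRUE, D = FALSE, E = TRUE)
    answer <- c(A = TRUE, B = FALSE, C = TRUE, D = FALSE, E = FALSE)

    ## Number Correct (NC): one point per correctly judged option,
    ## rescaled to [0, 1], so partial knowledge is rewarded.
    score_nc <- mean(answer == key)              # 0.8

    ## All-or-Nothing: full credit only if every option is judged correctly.
    score_aon <- as.numeric(all(answer == key))  # 0

The gap between the two scores on the same answer pattern is what makes the rules non-interchangeable.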
34

Attempting measurement of psychological attributes

Salzberger, Thomas 26 February 2013 (has links) (PDF)
Measures of psychological attributes abound in the social sciences as much as measures of physical properties do in the physical sciences. However, there are crucial differences in the scientific underpinning of measurement between the two domains. While measurement in the physical sciences is supported by empirical evidence that demonstrates the quantitative nature of the property assessed, measurement in the social sciences is, in large part, made possible only by a vague, discretionary definition of measurement that places hardly any restrictions on empirical data. Traditional psychometric analyses fail to address the requirements of measurement as defined more rigorously in the physical sciences. The construct definitions do not allow for testable predictions, and content validity becomes a matter of highly subjective judgment. In order to improve measurement of psychological attributes, it is suggested, first, to readopt the definition of measurement used in the physical sciences; second, to devise an elaborate theory of the construct to be measured that includes the hypothesis of a quantitative attribute; and third, to test the data for the structure implied by the hypothesis of quantity, as well as predictions derived from the theory of the construct. (author's abstract)
35

Measuring Job Satisfaction Among Kentucky Head Principals Using the Rasch Rating Scale Model

Webb, Xavier J. 01 January 2012 (has links)
The continued expansion of principals' responsibilities is having a detrimental effect on their job satisfaction; it is therefore increasingly challenging to retain these important leaders. Effective principals can impact student learning and other vital outcomes; thus, it is important to be able to retain effective school leaders. Examining the perceived sources of principals' satisfaction and dissatisfaction with their work has strong implications for policies and practices that can be implemented to increase principal retention. The purpose of this study was to measure the job satisfaction of head principals in Kentucky. The research conducted was an exploratory study using survey research methods. The study sought to obtain a census of all head principals across Kentucky's 174 public school districts (N = 1,158). A total of 478 responses were collected, providing a response rate of 41%. A profile of the demographic and personal characteristics of Kentucky principals was constructed, and principals' satisfaction with specified job facets was measured using the Rasch Rating Scale Model (RRSM). Findings indicated that economic job attributes were not significant sources of dissatisfaction for principals in this sample. Principals were also found to be satisfied with psychological job attributes, with the exception of the effect of their job on their personal life. Data in this study indicated that head principals in Kentucky were: (a) highly dissatisfied with the number of hours they work; (b) highly dissatisfied with the amount of time spent on tasks that have nothing to do with their primary responsibility of improving student outcomes; and (c) highly dissatisfied with the lack of time they are able to spend on tasks that are directly related to improving student outcomes. A primary implication of this research is that Kentucky policy makers and superintendents could simultaneously increase principal retention and student outcomes by removing from the principalship managerial job tasks not directly tied to instruction, so that principals can focus solely on instructional leadership.
36

Measuring Transformational Leadership in Athletic Training: A Comparative Analysis

Yates, Kristan M. 01 January 2013 (has links)
The purpose of this study was to measure the construct of transformational leadership among athletic training academicians and clinicians. Additionally, this study sought to determine whether perspectives regarding transformational leadership were the same or different based on full-time vocational roles. Finally, this study introduced a methodology for survey data analysis relatively unknown in athletic training research circles. Participants included athletic training education program directors as well as individuals in leadership roles at the state, district, and national level.
37

A Rasch Analysis Comparing Two Swedish Translations of a Questionnaire Measuring Health-Related Quality of Life

Kielén, Martina, Wallentinsson, Emma January 2016 (has links)
During the 1980s, the non-profit organisation RAND Corporation conducted the two-year Medical Outcomes Study with the goal of creating a comprehensive medical questionnaire. The resulting 116-item questionnaire measures health-related quality of life (HRQoL) topics such as physical, mental, and general health, and is available as a free resource on RAND's web page. SF-36, which contains 36 of these questions, is distributed for a fee by the US company Quality Metric Inc. The company has translated the questionnaire into several languages, including Swedish, and holds licenses for the translations. Registercentrum sydost has made a new Swedish translation of the same questions as in the SF-36; this survey is called RAND-36 and is license-free. Because Quality Metric Inc holds the license for its own Swedish translation, the two surveys are similar but not identical. This study compares the two HRQoL instruments to determine whether the licensed SF-36 can be replaced by the license-free RAND-36. The distributions of items with ordinal response scales were compared with the Mann-Whitney U test, which showed a significant difference for eight items in the measures PF (physical functioning), MH (mental health), VT (vitality), and GH (general health perceptions). The distributions of items with dichotomous response scales were compared with the chi-square test, which showed a significant difference for one item in the measure RE (emotional role functioning). The reliability of the questionnaires was compared with ordinal alpha; in this sample, the reliability of MH and VT is equivalent across the two instruments. The biggest difference between the surveys is in the measure RP (physical role functioning), where RAND-36 meets the requirement that the measure can be used for reliable conclusions at the individual level, a condition that SF-36 cannot meet. The probability of giving a particular answer, given the respondent's ability, was compared with Rasch analysis; Wald tests indicated differential item functioning (DIF) for most items within the measures PF, MH, VT, and GH.
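A rough R sketch of the item-level comparisons described in the abstract, using base R tests; the variable names and data are hypothetical stand-ins for SF-36 and RAND-36 item responses.

    set.seed(1)
    ## Hypothetical ordinal (1-5) responses to the same item in each translation
    sf36_item   <- sample(1:5, 120, replace = TRUE)
    rand36_item <- sample(1:5, 120, replace = TRUE)
    wilcox.test(sf36_item, rand36_item)   # Mann-Whitney U test

    ## Hypothetical dichotomous item: compare the two 0/1 distributions
    sf36_bin   <- rbinom(120, 1, 0.6)
    rand36_bin <- rbinom(120, 1, 0.5)
    tab <- table(survey   = rep(c("SF-36", "RAND-36"), each = 120),
                 response = c(sf36_bin, rand36_bin))
    chisq.test(tab)                       # chi-square test on the 2x2 table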
38

Measurement Disturbance Effects on Rasch Fit Statistics and the Logit Residual Index

Mount, Robert E. (Robert Earl) 08 1900 (has links)
The effects of random guessing as a measurement disturbance on Rasch fit statistics (unweighted total, weighted total, and unweighted ability between) and the Logit Residual Index (LRI) were examined through simulated data sets of varying sample sizes, test lengths, and distribution types. Three test lengths (25, 50, and 100), three sample sizes (25, 50, and 100), two item difficulty distributions (normal and uniform), and three levels of guessing (no guessing (0%), 25%, and 50%) were used in the simulations, resulting in 54 experimental conditions. The mean logit person ability for each experiment was +1. Each experimental condition was simulated once in an effort to approximate what could happen on the single administration of a four-option-per-item multiple choice test to a group of relatively high-ability persons. Previous research has shown that varying item and person parameters has no effect on Rasch fit statistics. Consequently, these parameters were used in the present study to establish realistic test conditions, but were not interpreted as effect factors in determining the results of this study.
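One such condition can be reproduced in outline as below; the contamination mechanism (each response independently replaced by a guess with the stated probability, succeeding 1 in 4 times on a four-option item) is an assumption of this sketch, since the abstract does not spell it out.

    set.seed(42)
    n_persons <- 100; n_items <- 50
    theta <- rnorm(n_persons, mean = 1)   # mean logit person ability +1, as in the study
    beta  <- rnorm(n_items)               # normally distributed item difficulties

    ## Rasch response probabilities: P(X = 1) = logistic(theta - beta)
    p    <- plogis(outer(theta, beta, "-"))
    resp <- matrix(rbinom(length(p), 1, p), n_persons, n_items)

    ## Inject random guessing into 25% of responses (success probability 1/4)
    guessed <- matrix(runif(length(p)) < 0.25, n_persons, n_items)
    resp[guessed] <- rbinom(sum(guessed), 1, 0.25)

Fit statistics computed on resp can then be compared against an uncontaminated baseline.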
39

Using TIMSS and PIRLS to Construct Global Indicators of Effective Environments for Learning

Preuschoff, Anna Corinna January 2011 (has links)
Thesis advisor: Ina V.S. Mullis / As an extension of the effort devoted to updating the questionnaires for TIMSS and PIRLS 2011, this dissertation explored a new reporting strategy for contextual questionnaire data. The study investigated the feasibility of constructing "global indicators" from a large number of diverse background variables, which could provide policy makers and practitioners with meaningful information on effective learning environments. Four broad constructs of effective learning environments were derived from the TIMSS and PIRLS Contextual Frameworks for 2011: 1) effective school environments for learning to read, 2) effective home environments for learning to read, 3) effective classroom environments for learning mathematics, and 4) students' motivation to learn mathematics. Using the TIMSS and PIRLS 2011 Frameworks, the conceptual definitions of the constructs were formulated as construct maps. Next, relevant questionnaire items were identified that addressed each aspect of the construct maps, capitalizing on the full range of background information in the TIMSS 2007 and PIRLS 2006 International Databases. The questionnaire items were used to create sets of variables for scaling, and, after principal component analysis confirmed scale unidimensionality, the variables were combined into 1-parameter IRT (Rasch) scales. The idea of conveying the meaning of the broad contextual scales through item mapping was explored, as was reporting country-by-country results on the global scales. The scaling was successful, and it was concluded that contextual information could be reported more globally in future cycles of TIMSS and PIRLS. However, the study also demonstrated that it is extremely complicated to choose background constructs at the right level of aggregation for both analysis and reporting, and it is difficult to develop scales that summarize data for educational policy makers without loss of vital information. / Thesis (PhD) — Boston College, 2011. / Submitted to: Boston College. Lynch School of Education. / Discipline: Educational Research, Measurement, and Evaluation.
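The two-step scaling procedure described above can be sketched in R as follows, with simulated 0/1 background variables standing in for the questionnaire data; eRm is used here only as a convenient Rasch fitter, not as the software actually employed in the dissertation.

    library(eRm)

    bg  <- sim.rasch(500, 8)        # stand-in for dichotomized background variables
    pca <- prcomp(bg, scale. = TRUE)
    summary(pca)                    # check that one component dominates (unidimensionality)

    scale_fit <- RM(bg)             # combine the variables into a 1-parameter IRT (Rasch) scale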
40

Development of an Agitation Rating Scale for Use with Acute Presentation Behavioral Management Patients

Strout, Tania Denise Shaffer January 2011 (has links)
Thesis advisor: June A. Horowitz / Agitation is a distressing set of behaviors frequently observed in emergency department psychiatry patients. Key to developing and evaluating treatment strategies aimed at decreasing and preventing agitation is the availability of a reliable, valid instrument to measure behaviors representative of agitation. Currently, an agitation rating instrument appropriate for use in the emergency setting does not exist, and clinicians are left without standard language for communicating about the phenomenon. The Agitation Severity Scale was developed to fill this void, using facilitated focus groups to generate an initial item pool. Beginning evidence of content validity was established through a survey of clinical providers and a panel of content experts. The objectives of this methodological study were to: (a) develop an observation-based rating scale to assess the continuum of behaviors known as agitation in adult emergency department patients, and (b) evaluate the psychometric properties of the newly developed instrument. Psychometric evaluation was conducted using a sample of 270 emergency department psychiatric patients. A 17-item instrument with a standardized Cronbach's alpha coefficient of 0.91 resulted, providing evidence of a high degree of internal consistency reliability. Principal components analysis revealed a 4-component solution accounting for 69% of observed variance. Internal consistency reliability ranged from 0.71 to 0.91 for the scale components. Equivalence reliability was established through the evaluation of Agitation Severity Scores assigned by independent evaluators, r = 0.99, κ = 0.98. Construct validity was established through comparison of mean scores for subjects in the highest and lowest scoring quartiles; a statistically significant difference in scores was noted when comparing these groups, t = -17.688, df = 155, p < 0.001. Convergent validity was evaluated by testing the association between Agitation Severity Scores and scores obtained using a well-established instrument, the Overt Agitation Severity Scale; Pearson's correlation coefficients for the associations between the scores ranged from 0.91 to 0.93, indicating a strong, positive relationship. Finally, the Rasch measurement model was employed to further evaluate the functioning of the instrument. In sum, the Agitation Severity Scale was found to be reliable and valid when used to measure agitation in the emergency setting. / Thesis (PhD) — Boston College, 2011. / Submitted to: Boston College. Connell School of Nursing. / Discipline: Nursing.
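For reference, the internal consistency statistic reported above can be computed in R directly from its definition; the data below are random, so the resulting alpha is near zero rather than the 0.91 the scale achieved.

    ## Cronbach's alpha for a person-by-item score matrix with k items:
    ## alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    cronbach_alpha <- function(items) {
      k <- ncol(items)
      (k / (k - 1)) * (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
    }

    set.seed(7)
    x <- matrix(sample(0:4, 270 * 17, replace = TRUE), nrow = 270, ncol = 17)
    cronbach_alpha(x)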
