121 |
Augmented testing and effects on item and proficiency estimates in different calibration designs
Wall, Nathan Lane. 01 May 2011
Broadening the term augmented testing to include a combination of multiple measures to assess examinee performance on a single construct, this dissertation investigated IRT item parameter and proficiency estimates. Its intent is to determine whether different IRT calibration designs result in differences in item and proficiency parameter estimates and to understand the nature of those differences.
Examinees were sampled from a testing program in which each examinee was administered three mathematics assessments measuring a broad mathematics domain at the high school level. This sample of examinees was used to perform a real data analysis to investigate the item and proficiency estimates. A simulation study was also conducted based upon the real data.
The factors investigated for the real data study included three IRT calibration designs and two IRT models. The calibration designs were: separately calibrating each assessment, calibrating all assessments in one joint calibration, and separately calibrating items in three distinct content areas. Joint calibration refers to the use of IRT methodology to calibrate together two or more tests that have been administered to a single group, so as to place all of the items on a common scale. The two IRT models were the one- and three-parameter logistic models. Also investigated were five proficiency estimators: maximum likelihood, expected a posteriori (EAP), maximum a posteriori, summed-score EAP, and test characteristic curve estimates. The simulation study included the same calibration designs and IRT models, but the data were simulated with varying levels of correlation among the proficiencies to determine the effect upon the item parameter estimates.
The main findings indicate that item parameter and proficiency estimates are affected by the IRT calibration design. The discrimination parameter estimates of the three-parameter model were larger when calibrated under the joint calibration design for one assessment but not for the other two. Because equal item discrimination is an assumption of the 1-PL model, this finding raises questions as to the degree of model fit when the 1-PL model is used. Items on a second assessment had lower difficulty parameters in the joint calibration design, while the item parameter estimates of the other two assessments were higher. Differences in proficiency estimates between calibration designs were also discovered, and these resulted in examinees being inconsistently classified into performance categories. Differences were also observed with regard to the choice of IRT model. Finally, as the level of correlation among proficiencies increased in the simulated data, the differences observed in the item parameter estimates decreased.
Based upon the findings, IRT item parameter estimates resulting from differing calibration designs should not be used interchangeably. Practitioners who use item pools should base the pool refreshment calibration design upon the one used to originally create the pool. Limitations of this study include the use of a single dataset consisting of high school examinees in only one subject area; generalization of the findings to other content areas or grade levels should therefore be made with caution.
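As an illustration of two ingredients named in this abstract, the following is a minimal sketch of the three-parameter logistic item response function and a grid-based expected a posteriori (EAP) proficiency estimate. It is not the dissertation's code: the item parameters, the grid, the standard normal prior, and the function names `p_3pl` and `eap` are all assumptions made for the example, and the optional scaling constant (often 1.7) is omitted.

```python
import math

def p_3pl(theta, a, b, c=0.0):
    """Probability of a correct response under the 3-PL model.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    With a = 1 and c = 0 this reduces to the 1-PL form.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def eap(responses, items, n_points=81, lo=-4.0, hi=4.0):
    """EAP proficiency on an evenly spaced grid with a standard normal prior."""
    step = (hi - lo) / (n_points - 1)
    num = den = 0.0
    for k in range(n_points):
        theta = lo + k * step
        weight = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for x, (a, b, c) in zip(responses, items):
            p = p_3pl(theta, a, b, c)
            weight *= p if x == 1 else 1.0 - p  # likelihood of this response
        num += theta * weight
        den += weight
    return num / den  # posterior mean of theta

# Hypothetical item parameters (a, b, c), for illustration only.
items = [(1.2, -0.5, 0.2), (0.8, 0.0, 0.2), (1.5, 0.7, 0.2)]
print(eap([1, 1, 0], items))
print(eap([1, 1, 1], items))
```

Because the estimate depends on the item parameters passed in, the same response pattern can yield different proficiency estimates when those parameters come from different calibration designs, which is the kind of difference the abstract reports.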
|
122 |
NKTS, SOFIE e ESQUADA: escalas para avaliar o conhecimento nutricional, as motivações para escolhas alimentares e a qualidade da dieta com aplicação da Teoria de Resposta ao Item / NKTS, SOFIE and ESQUADA: scales to evaluate the nutritional knowledge, the motivations influencing food choices, and the quality of diet using the Item Response Theory
Santos, Thanise Sabrina Souza. 27 August 2019
Introduction - The study of nutritional knowledge, motivations for food choices, and diet quality provides important information for controlling the increasing prevalence of overweight and chronic diseases. Although the effectiveness of control strategies depends on a good-quality scale, the accuracy of existing scales is unknown or these scales are not in line with current recommendations. 
Objective - This thesis aimed to develop scales to evaluate nutritional knowledge, health motivation influencing food choices, and diet quality using Item Response Theory (IRT) analysis. Methods - To develop the scales of nutritional knowledge (NKTS) and health motivation for food choices (SOFIE), questionnaires already used in the HELENA study, a European multicenter investigation, were used: the Nutritional Knowledge Test (NKT) and Food Choices and Preferences (FCP). To develop the scale of diet quality (ESQUADA), a questionnaire based on the Food Guide for the Brazilian Population was created. The relevance and comprehensibility of this questionnaire were studied based on suggestions from nutritionists, in focus groups, and from Brazilian adolescents and young adults, in an online questionnaire. For each scale, the dimensionality of the items was analyzed separately using exploratory factor analysis. IRT analysis was applied to identify the items that best discriminate the information of interest, to locate them at the different levels of the continuum, and to calculate the IRT scores. The association of the NKTS and SOFIE IRT scores with food consumption and nutritional biomarkers, assessed in the HELENA study, was analyzed. BILOG-MG version 3, GGUM 2004, and R were used to construct the scales. The nutritionists' suggestions were analyzed in MAXQDA version 12. The laypersons' suggestions and the characterization of each scale level were performed in Microsoft Office Excel version 2013. The study of association was performed in Stata version 14. Results - Factor analysis and IRT analysis indicated that eleven items from NKT and sixteen items from FCP adequately evaluate nutritional knowledge and health motivation influencing food choices and compose NKTS and SOFIE, respectively. NKTS identifies individuals with basic, adequate, and advanced nutritional knowledge. 
SOFIE classifies individuals with low, indifferent, and high health motivation influencing food choices. The scores from NKTS and SOFIE were positively associated with healthy food markers. Regarding ESQUADA, the nutritionists considered the items relevant to assess the quality of diet. However, they indicated the need to change the wording of some items as well as their response options. Adolescents and young adults easily understood all items. The factor analysis and IRT analysis retained 25 items in ESQUADA, making it possible to identify five levels of diet quality. Conclusion - NKTS, SOFIE, and ESQUADA, respectively, provided accurate measures of nutritional knowledge, health motivation for food choices, and quality of diet, allowing the characterization of their different levels. Other studies may analyze the relationship between ESQUADA and food consumption markers, as well as select the developed scales to evaluate these latent traits in other populations.
|
123 |
Health Knowledge & Health Behavior Outcomes in Adolescents with Elevated Blood Pressure
Fitzpatrick, Stephanie L. 24 May 2011
The purpose of this study was to examine the influence of cardiovascular health knowledge on dietary and physical activity changes in 15- to 17-year-olds with elevated blood pressure. The sample consisted of 167 adolescents randomized into one of three treatment conditions (minimal, moderate, or intense). Each adolescent completed a fitness test (peak VO2), 24-hour dietary recall, 7 Day Activity Recall (kilocalories expended per day), Self-efficacy Questionnaire, and Stages of Change Questionnaire every three months. The Health Knowledge Assessment was given at baseline and at post-intervention. Classical test theory, confirmatory factor analysis, and item response theory frameworks were applied to examine psychometric properties of the Health Knowledge Assessment. Structural equation modeling was used to examine the change in health behaviors and the relationship with health knowledge, self-efficacy, and readiness for change. The 34-item Health Knowledge Assessment had good internal consistency and the items loaded onto a single factor at pretest and posttest. Furthermore, there was a good distribution of easy, moderate, and hard items at pretest, but additional hard items were needed at posttest. There were no treatment condition differences in level of health knowledge at pretest. The intense condition had significantly higher health knowledge than the minimal and moderate conditions at posttest; level of health knowledge for the moderate condition was significantly higher than the minimal condition at posttest. Level of nutrition knowledge at posttest was not associated with any of the dietary intake variables, nor was level of exercise knowledge associated with the two physical activity variables at post-intervention. However, there was a marginally significant association between level of nutrition knowledge and nutrition self-efficacy at posttest. 
Nutrition self-efficacy and nutrition readiness for change at posttest were also associated with a decrease in sugar consumption at post-intervention. The findings suggest that a cardiovascular health intervention for adolescents with elevated blood pressure, consisting of group sessions and/or individual sessions over the course of three to six months, was effective in increasing cardiovascular health knowledge, self-efficacy, and readiness for change. Nonetheless, the role that health knowledge plays in health behavior change needs to be further examined.
|
124 |
Self- Versus Informant Reports of Posttraumatic Stress Disorder: An Application of Item Response Theory
Fissette, Caitlin, 1984-. 14 March 2013
As men and women return from serving on the frontlines of Operations Enduring Freedom (OEF; Afghanistan) and Iraqi Freedom (OIF; Iraq), many struggle with emotional or behavioral difficulties stemming from the stresses of battle. However, research has shown that these service members may be unwilling or unable to recognize or report such difficulties due to such factors as amnesia, avoidance, or cognitive impairment. Hence, the burden to recognize distress and encourage treatment increasingly falls on peers, friends, and especially intimate partners. Given that this responsibility is often placed on significant others, it is imperative to determine which symptoms are amenable to detection by informants and which are not. The current study examined the ability of female spouses of Vietnam veterans to report on various indicators of posttraumatic stress disorder (PTSD) using the Mississippi Scale for Combat-Related PTSD. Item response theory (IRT) analyses were conducted with a dataset composed of both self- and informant reports using the same items regarding the same individual in order to examine the item-level properties.
Results from these analyses indicated that the ability of both spouses and veterans to detect PTSD symptoms varies across item content and that items themselves do not relate equally to, or become diagnostic at the same level of, PTSD. Overall, veterans showed greater sensitivity to their own symptoms and were able to provide more information than their spouses for nearly every item rated by independent experts to be overt or covert. However, some items provided greater information when endorsed by the spouse versus the veteran even though, consistent with the majority of other items, these items were endorsed by the spouse only once the PTSD symptoms had reached greater severity. Implications of these findings as well as future directions for research regarding observer reports of PTSD symptomatology were explored.
|
125 |
Partial Credit Models for Scale Construction in Hedonic Information Systems
Mair, Patrick, Treiblmaier, Horst. January 2008
Information Systems (IS) research frequently uses survey data to measure the interplay between technological systems and human beings. Researchers have developed sophisticated procedures to build and validate multi-item scales that measure real-world phenomena (latent constructs). Most studies use the so-called classical test theory (CTT), which suffers from several shortcomings. We first compare CTT to Item Response Theory (IRT) and subsequently apply a Rasch model approach to measure hedonic aspects of websites. The results not only show which attributes are best suited for scaling hedonic information systems, but also introduce IRT as a viable substitute that overcomes several shortcomings of CTT. (authors' abstract) / Series: Research Report Series / Department of Statistics and Mathematics
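For readers unfamiliar with the model named in the title, here is a minimal sketch of the category probabilities of the partial credit model, the polytomous member of the Rasch family. The function name `pcm_probs` and the threshold values are assumptions for illustration and are not taken from the report.

```python
import math

def pcm_probs(theta, thresholds):
    """Category probabilities 0..m for one item under the partial credit model.

    thresholds[j-1] is the step parameter delta_j between categories j-1 and j.
    """
    # Cumulative sums of (theta - delta_j); category 0 has an empty sum.
    cum = [0.0]
    for delta in thresholds:
        cum.append(cum[-1] + (theta - delta))
    mx = max(cum)  # subtract the maximum for numerical stability
    expo = [math.exp(v - mx) for v in cum]
    total = sum(expo)
    return [e / total for e in expo]

# Hypothetical four-category item (three step parameters).
print(pcm_probs(0.5, [-1.0, 0.0, 1.5]))
```

With a single threshold the function reduces to the dichotomous Rasch model, and at `theta` equal to a step parameter the two adjacent categories are equally likely.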
|
126 |
Bayesian Modeling Using Latent Structures
Wang, Xiaojing. January 2012
This dissertation is devoted to modeling complex data from the Bayesian perspective via constructing priors with latent structures. There are three major contexts in which this is done -- strategies for the analysis of dynamic longitudinal data, estimating shape-constrained functions, and identifying subgroups. The methodology is illustrated in three different interdisciplinary contexts: (1) adaptive measurement testing in education; (2) emulation of computer models for vehicle crashworthiness; and (3) subgroup analyses based on biomarkers.
Chapter 1 presents an overview of the utilized latent structured priors and an overview of the remainder of the thesis. Chapter 2 is motivated by the problem of analyzing dichotomous longitudinal data observed at variable and irregular time points for adaptive measurement testing in education. One of its main contributions lies in developing a new class of Dynamic Item Response (DIR) models via specifying a novel dynamic structure on the prior of the latent trait. The Bayesian inference for DIR models is undertaken, which permits borrowing strength from different individuals, allows the retrospective analysis of an individual's changing ability, and allows for online prediction of one's ability changes. Proof of posterior propriety is presented, ensuring that the objective Bayesian analysis is rigorous.
Chapter 3 deals with nonparametric function estimation under shape constraints, such as monotonicity, convexity or concavity. A motivating illustration is to generate an emulator to approximate a computer model for vehicle crashworthiness. Although Gaussian processes are very flexible and widely used in function estimation, they are not naturally amenable to incorporation of such constraints. Gaussian processes with the squared exponential correlation function have the interesting property that their derivative processes are also Gaussian processes and are jointly Gaussian processes with the original Gaussian process. This allows one to impose shape constraints through the derivative process. Two alternative ways of incorporating derivative information into Gaussian processes priors are proposed, with one focusing on scenarios (important in emulation of computer models) in which the function may have flat regions.
Chapter 4 introduces a Bayesian method to control for multiplicity in subgroup analyses through tree-based models that limit the subgroups under consideration to those that are a priori plausible. Once the prior modeling of the tree is accomplished, each tree will yield a statistical model; Bayesian model selection analyses then complete the statistical computation for any quantity of interest, resulting in multiplicity-controlled inferences. This research is motivated by a problem of biomarker and subgroup identification to develop tailored therapeutics. Chapter 5 presents conclusions and some directions for future research. / Dissertation
|
127 |
Making Diagnostic Thresholds Less Arbitrary
Unger, Alexis Ariana. 2011 May 1900
The application of diagnostic thresholds plays an important role in the classification of mental disorders. Despite their importance, many diagnostic thresholds are set arbitrarily, without much empirical support. This paper seeks to introduce and analyze a new empirically based way of setting diagnostic thresholds for a category of mental disorders that has historically had arbitrary thresholds, the personality disorders (PDs). I analyzed data from over 2,000 participants who were part of the Methods to Improve Diagnostic Assessment and Services (MIDAS) database. Results revealed that functional outcome scores, as measured by Global Assessment of Functioning (GAF) scores, could be used to identify diagnostic thresholds and that the optimal thresholds varied somewhat by PD along the spectrum of latent severity. Using the item response theory (IRT)-based approach, the optimal threshold along the spectrum of latent severity for the different PDs ranged from θ = 1.50 to 2.25. Effect sizes using the IRT-based approach ranged from .34 to 1.55. These findings suggest that linking diagnostic thresholds to functional outcomes, and thereby making them less arbitrary, is an achievable goal. This study has introduced a new and uncomplicated way to empirically set diagnostic thresholds while also taking into consideration that items within diagnostic sets may function differently. Although this is purely an initial demonstration meant only to serve as an example, the approach offers the potential that diagnostic thresholds for all disorders could one day be set on an empirical basis.
|
128 |
Identifying and measuring cognitive aspects of a mathematics achievement test
Lutz, Megan E. 16 March 2012
Cognitive Diagnostic Models (CDMs) are a useful way to identify potential areas of intervention for students who may not have mastered various skills and abilities at the same time as their peers. Traditionally, CDMs have been used on narrowly defined classroom tests, such as those for determining whether students are able to use different algebraic principles correctly. In the current study, the Deterministic Input, Noisy "And" Gate model (DINA; Haertel, 1989; Junker & Sijtsma, 2001) and the Compensatory Reparameterized Unified Model (CRUM; Hartz, 2002), as parameterized by the log-linear cognitive diagnosis model (LCDM; Henson, Templin, & Willse, 2009), were used to analyze the utility of pre-defined cognitive components in estimating students' abilities in a broadly defined, standardized mathematics achievement test. The attribute mastery profile distributions were compared; the majority of students were classified into the extremes of no mastery or complete mastery under both the CRUM and DINA models, though greater variability among attribute mastery classifications was obtained with the CRUM.
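To make the DINA model mentioned above concrete, the following is a minimal sketch of its item response probability: an examinee answers correctly with probability 1 - slip if every attribute the item requires has been mastered, and with probability guess otherwise. The function name `dina_prob`, the Q-matrix row, and the slip and guess values are assumptions for the example, not values from the study.

```python
from itertools import product

def dina_prob(alpha, q_row, slip, guess):
    """P(correct) for one item under the DINA model.

    alpha: 0/1 attribute mastery profile of the examinee.
    q_row: 0/1 row of the Q-matrix marking attributes the item requires.
    """
    # Conjunctive ("and" gate): all required attributes must be mastered.
    has_all = all(a == 1 for a, q in zip(alpha, q_row) if q == 1)
    return 1.0 - slip if has_all else guess

# Hypothetical item requiring attributes 1 and 2 out of three.
q_row = [1, 1, 0]
for alpha in product([0, 1], repeat=3):
    print(alpha, dina_prob(alpha, q_row, slip=0.1, guess=0.2))
```

The conjunctive gate gives each item only two response probabilities, whereas a compensatory model such as the CRUM lets mastery of some attributes partly offset the lack of others.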
|
129 |
Detecting Aberrant Responding on Unidimensional Pairwise Preference Tests: An Application of lz Based on the Zinnes-Griggs Ideal Point IRT Model
Lee, Philseok. 01 January 2013
This study investigated the efficacy of the lz person fit statistic for detecting aberrant responding with unidimensional pairwise preference (UPP) measures, constructed and scored based on the Zinnes-Griggs (ZG, 1974) IRT model, which has been used for a variety of recent noncognitive testing applications. Because UPP measures are used to collect both "self-" and "other-" reports, I explored the capability of lz to detect two of the most common and potentially detrimental response sets, namely fake good and random responding. The effectiveness of lz was studied using empirical and theoretical critical values for classification, along with test length, test information, the type of statement parameters, and the percentage of items answered aberrantly (20%, 50%, 100%). I found that lz was ineffective in detecting fake good responding, with power approaching zero in the 100% aberrance conditions. However, lz was highly effective in detecting random responding, with power approaching 1.0 in long-test, high-information conditions, and there was no diminution in efficacy when using marginal maximum likelihood estimates of statement parameters in place of the true values. Although using empirical critical values for classification provided slightly higher power and more accurate Type I error rates, theoretical critical values, corresponding to a standard normal distribution, provided nearly as good results.
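As a sketch of the statistic discussed above, the standardized log-likelihood person fit index lz for dichotomous responses can be computed as follows. The model-implied probabilities are passed in directly and are hypothetical; in the study they would come from the Zinnes-Griggs model evaluated at the estimated trait level, which is not reproduced here. The function name `lz` is likewise an assumption for the example.

```python
import math

def lz(responses, probs):
    """Standardized log-likelihood person fit statistic for 0/1 responses.

    probs[i] is the model-implied probability of a 1 on item i and must lie
    strictly between 0 and 1; at least one must differ from 0.5, otherwise
    the variance term is zero.
    """
    l0 = expected = variance = 0.0
    for u, p in zip(responses, probs):
        l0 += math.log(p) if u == 1 else math.log(1.0 - p)
        expected += p * math.log(p) + (1.0 - p) * math.log(1.0 - p)
        variance += p * (1.0 - p) * math.log(p / (1.0 - p)) ** 2
    return (l0 - expected) / math.sqrt(variance)

# Hypothetical probabilities and two response patterns.
probs = [0.9, 0.8, 0.7, 0.2, 0.1]
print(lz([1, 1, 1, 0, 0], probs))  # pattern consistent with the model
print(lz([0, 0, 0, 1, 1], probs))  # pattern contradicting the model
```

Large negative values indicate misfit. Under the theoretical standard normal reference mentioned in the abstract, a one-sided 5% test would flag values below about -1.645.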
|
130 |
What Drives Package Authors to Participate in the R Project for Statistical Computing? Exploring Motivation, Values, and Work Design
Mair, Patrick, Hofmann, Eva, Gruber, Kathrin, Hatzinger, Reinhold, Zeileis, Achim, Hornik, Kurt. January 2015
One of the cornerstones of the R system for statistical computing is the multitude of packages contributed by numerous package authors. This makes an extremely broad range of statistical techniques and other quantitative methods freely available. So far no empirical study has investigated psychological factors that drive authors to participate in the R project. This article presents a study of R package authors, collecting data on different types of participation (number of packages, participation in mailing lists, participation in conferences), three psychological scales (types of motivation, psychological values, and work design characteristics), as well as various sociodemographic factors. The data are analyzed using item response models and subsequent generalized linear models, showing that the most important determinants for participation are a hybrid form of motivation and the social characteristics of the work design. Other factors are found to have less impact or influence only specific aspects of participation. (authors' abstract)
|