131

The effectiveness of automatic item generation for the development of cognitive ability tests

Loe, Bao Sheng January 2019
Research has shown that the increased use of computer-based testing has brought about new challenges. With the ease of online test administration, a large number of items are necessary to maintain the item bank and minimise the exposure rate. However, the traditional item development process is time-consuming and costly. Thus, alternative ways of creating items are necessary to improve the item development process. Automatic Item Generation (AIG) is an effective method for generating items rapidly and efficiently. AIG uses algorithms to create questions for testing purposes. However, many of these generators are closed, available only to a select few. There is a lack of open source, publicly available generators that researchers can utilise to study AIG in greater depth and to generate items for their research. Furthermore, research has indicated that AIG is far from being understood, and more research into its methodology and the psychometric properties of the items created by the generators is needed for it to be used effectively. The studies conducted in this thesis have achieved the following: 1) Five open source item generators were created, and the items were evaluated and validated. 2) Empirical evidence showed that using a weak theory approach to develop item generators was just as credible as using a strong theory approach, even though they are theoretically distinct. 3) The psychometric properties of the generated items were estimated using various IRT models to assess the impact of the template features used to create the items. 4) Joint modelling of responses and response times was employed to provide new insights into cognitive processes that go beyond those obtained by typical IRT models.
This thesis suggests that AIG provides a tangible solution for improving the item development process for content generation and reducing the procedural cost of generating a large number of items, with the possibility of a unified approach towards test administration (i.e. adaptive item generation). Nonetheless, this thesis focused on rule-based algorithms. The application of other forms of item generation methods and the potential for measuring the intelligence of artificial general intelligence (AGI) are discussed in the final chapter, proposing that the use of AIG techniques creates new opportunities as well as challenges for researchers that will redefine the assessment of intelligence.
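As an illustration of what a rule-based item generator can look like, here is a minimal sketch. It is not one of the thesis's five generators: the number-series template, the function name `generate_series_item`, and the distractor rules are all hypothetical choices made for this example.

```python
import random


def generate_series_item(rng, start_range=(1, 20), step_range=(2, 9), length=5):
    """Hypothetical template: an arithmetic number series whose last term is hidden."""
    start = rng.randint(*start_range)
    step = rng.randint(*step_range)
    series = [start + i * step for i in range(length)]
    key = series[-1]  # the correct answer is the hidden final term

    # Assumed distractor rules: one step too far, and off-by-one in either direction.
    # Because step >= 2, these three values and the key are always distinct.
    options = sorted({key, key + step, key - 1, key + 1})

    stem = ", ".join(str(x) for x in series[:-1]) + ", ?"
    # Template features are stored so they can later be related to item difficulty.
    return {
        "stem": stem,
        "options": options,
        "key": key,
        "features": {"step": step, "length": length},
    }


if __name__ == "__main__":
    rng = random.Random(2019)
    for _ in range(3):
        print(generate_series_item(rng))
```

Recording the template features alongside each item is what would allow their effect on item parameters to be studied, as the abstract describes in point 3.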
132

Rural Opioid and Other Drug Use Disorder Diagnosis: Assessing Measurement Invariance and Latent Classification of DSM-IV Abuse and Dependence Criteria

Brooks, Billy 01 August 2015
The rates of non-medical prescription drug use in the United States (U.S.) have increased dramatically in the last two decades, leading to a more than 300% increase in deaths from overdose, surpassing motor vehicle accidents as the leading cause of injury deaths. In rural areas, deaths from unintentional overdose have increased by more than 250% since 1999 while urban deaths have increased at a fraction of this rate. The objective of this research was to test the hypothesis that cultural, economic, and environmental factors prevalent in rural America affect the rate of substance use disorder (SUD) in that population, and that diagnosis of these disorders across rural and urban populations may not be generalizable due to these same effects. This study applies measurement invariance analysis and factor analysis techniques, namely item response theory (IRT), multiple indicators, multiple causes (MIMIC), and latent class analysis (LCA), to the DSM-IV abuse and dependence diagnosis instrument. The sample used for the study was a population of adult past-year illicit drug users living in a rural or urban area drawn from the 2011-2012 National Survey on Drug Use and Health data files (N = 3,369, analyses 1 and 2; N = 12,140, analysis 3). Results of the IRT and MIMIC analyses indicated no significant variance in DSM item function across rural and urban sub-groups; however, several socio-demographic variables including age, race, income, and gender were associated with bias in the instrument. Latent class structures differed across the sub-groups in quality and number, with the rural sample fitting a 3-class structure and the urban sample fitting a 6-class model. Overall the rural class structure exhibited less diversity and lower prevalence of SUD in multiple drug categories (e.g. cocaine, hallucinogens, and stimulants). This result suggests underlying elements affecting SUD patterns in the two populations.
These findings inform the development of surveillance instruments, clinical services, and public health programming tailored to specific communities.
133

A comparison of fixed item parameter calibration methods and reporting score scales in the development of an item pool

Chen, Keyu 01 August 2019
The purposes of the study were to compare the relative performances of three fixed item parameter calibration (FIPC) methods in item and ability parameter estimation and to examine how the ability estimates obtained from these different methods affect interpretations using reported scales of different lengths. Through a simulation design, the study was divided into two stages. The first stage was the calibration stage, where the parameters of pretest items were estimated. This stage investigated the accuracy of item parameter estimates and the recovery of the underlying ability distributions for different sample sizes, different numbers of pretest items, and different types of ability distributions under the three-parameter logistic model (3PL). The second stage was the operational stage, where the estimated parameters of the pretest items were put on operational forms and were used to score examinees. The second stage investigated the effect that item parameter estimation had on ability estimation and reported scores for the new test forms. It was found that the item parameters estimated from the three FIPC methods showed subtle differences, but the results of the DeMars method were closer to those of the separate calibration with linking method than to the FIPC with simple-prior update and FIPC with iterative prior update methods, while the FIPC with simple-prior update and FIPC with iterative prior update methods performed similarly. Regarding the experimental factors that were manipulated in the simulation, the study found that the sample size influenced the estimation of item parameters. The effect of the number of pretest items on estimation of item parameters was strong but ambiguous, likely because the effect was confounded by changes in both the number of the pretest items and the characteristics of the pretest items among the item sets.
The effect of ability distributions on estimation of item parameters was not as evident as the effect of the other two factors. After the pretest items were calibrated, the parameter estimates of these items were put into operational use. The abilities of the examinees were then estimated based on the examinees’ responses to the existing operational items and the new items (previously called pretest items), of which the item parameters were estimated under different conditions. This study found that there were high correlations between the ability estimates and the true abilities of the examinees when forms contained pretest items calibrated using any of the three FIPC methods. The results suggested that all three FIPC methods were similarly competent in estimating parameters of the items, leading to satisfactory determination of the examinees’ abilities. When considering the scale scores, because the estimated abilities were very similar, there were small differences among the scaled scores on the same scale; the relative frequency of examinees classified into performance categories and the classification consistency index also showed that the interpretation of reported scores across scales was similar. The study provided a comprehensive comparison of the use of FIPC methods in parameter estimation. It was hoped that this study would help practitioners choose among the methods according to the needs of their testing programs. When ability estimates were linearly transformed into scale scores, the lengths of scales did not affect the statistical properties of scores; however, they may impact how the scores are subjectively perceived by stakeholders and therefore should be carefully selected.
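For readers unfamiliar with the two building blocks named in this abstract, the following sketch shows the standard three-parameter logistic (3PL) response function and a linear transformation from ability estimates to a reported scale. The function names, the scaling constant D = 1.7, and the slope, intercept, and scale bounds are assumptions chosen for illustration; they are not taken from the study.

```python
import math


def p_3pl(theta, a, b, c, D=1.7):
    """3PL probability of a correct response.

    a: discrimination, b: difficulty, c: lower asymptote (pseudo-guessing).
    D = 1.7 is a conventional scaling constant and an assumption here.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def to_scale_score(theta, slope, intercept, lo, hi):
    """Linearly transform an ability estimate, then round and clip to the reported scale."""
    raw = slope * theta + intercept
    return min(hi, max(lo, round(raw)))


if __name__ == "__main__":
    # The same abilities reported on a shorter and a longer (hypothetical) scale.
    for theta in (-1.2, 0.0, 0.8):
        short = to_scale_score(theta, slope=5, intercept=25, lo=10, hi=40)
        long_ = to_scale_score(theta, slope=50, intercept=500, lo=200, hi=800)
        print(theta, round(p_3pl(theta, a=1.0, b=0.0, c=0.2), 3), short, long_)
```

Because the transformation is linear, changing the scale length only changes the slope and intercept, which is consistent with the abstract's remark that scale length affects perception rather than statistical properties.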
134

Effect of Violating Unidimensional Item Response Theory Vertical Scaling Assumptions on Developmental Score Scales

Topczewski, Anna Marie 01 July 2013
Developmental score scales represent the performance of students along a continuum, where as students learn more they move higher along that continuum. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method to create developmental score scales. Research has shown that UIRT vertical scaling methods can be inconsistent in estimating grade-to-grade growth, within-grade variability, and separation of grade distributions (effect size) of developmental score scales. In particular, the finding of scale shrinkage (decreasing within-grade score variability as grade level increases) has led to concerns about and criticism of IRT vertical scales. The causes of scale shrinkage have yet to be fully understood. Real test data and simulation studies have been unable to provide complete answers as to why IRT vertical scaling inconsistencies occur. Violations of assumptions have been a commonly cited potential cause for the inconsistent results. For this reason, this dissertation is an extensive investigation into how violations of the three assumptions of UIRT vertical scaling (local item independence, unidimensionality, and similar reliability of grade level tests) affect estimated developmental score scales. Simulated tests were developed that purposefully violated a UIRT vertical scaling assumption. Three sets of simulated tests were created to test the effect of violating a single assumption. First, simulated tests were created with increasing, decreasing, low, medium, and high local item dependence. Second, multidimensional simulated tests were created by varying the correlation between dimensions. Third, simulated tests with dissimilar reliability were created by varying item parameter characteristics of the grade level tests. Multiple versions of twelve simulated tests were used to investigate UIRT vertical scaling assumption violations.
The simulated tests were calibrated under the UIRT model to purposefully violate an assumption of UIRT vertical scaling. Each simulated test version was replicated for 1000 random examinee samples to assess the bias and standard error of estimated grade-to-grade-growth, within-grade-variability, and separation-of-grade-distributions (effect size) of the estimated developmental score scales. The results suggest that when UIRT vertical scaling assumptions are violated the resulting estimated developmental score scales contain standard error and bias. For this study, the magnitude of standard error was similar across all simulated tests regardless of the assumption violation. However, bias fluctuated as a result of different types and magnitudes of UIRT vertical scaling assumption violations. More local item dependence resulted in more grade-to-grade-growth and separation-of-grade-distributions bias. And local item dependence resulted in developmental score scales that displayed scale expansion. Multidimensionality resulted in more grade-to-grade-growth and separation-of-grade-distributions bias when the correlation between dimensions was smaller. Multidimensionality resulted in developmental score scales that displayed scale expansion. Dissimilar reliability of grade level tests resulted in more grade-to-grade-growth bias and minimal separation-of-grade-distributions bias. Dissimilar reliability of grade level tests resulted in scale expansion or scale shrinkage depending on the item characteristics of the test. Limitations of this study and future research are discussed.
135

Simple structure MIRT equating for multidimensional tests

Kim, Stella Yun 01 May 2018
Equating is a statistical process used to accomplish score comparability so that the scores from the different test forms can be used interchangeably. One of the most widely used equating procedures is unidimensional item response theory (UIRT) equating, which requires a set of assumptions about the data structure. In particular, the essence of UIRT rests on the unidimensionality assumption, which requires that a test measures only a single ability. However, this assumption is not likely to be fulfilled for many real data such as mixed-format tests or tests composed of several content subdomains: failure to satisfy the assumption threatens the accuracy of the estimated equating relationships. The main purpose of this dissertation was to contribute to the literature on multidimensional item response theory (MIRT) equating by developing a theoretical and conceptual framework for true-score equating using a simple-structure MIRT model (SS-MIRT). SS-MIRT has several advantages over other complex MIRT models such as improved efficiency in estimation and a straightforward interpretability. In this dissertation, the performance of the SS-MIRT true-score equating procedure (SMT) was examined and evaluated through four studies using different data types: (1) real data, (2) simulated data, (3) pseudo forms data, and (4) intact single form data with identity equating. Besides SMT, four competitors were included in the analyses in order to assess the relative benefits of SMT over the other procedures: (a) equipercentile equating with presmoothing, (b) UIRT true-score equating, (c) UIRT observed-score equating, and (d) SS-MIRT observed-score equating. In general, the proposed SMT procedure behaved similarly to the existing procedures. Also, SMT showed more accurate equating results compared to the traditional UIRT equating. 
Better performance of SMT over UIRT true-score equating was consistently observed across the three studies that employed different criterion relationships with different datasets, which strongly supports the benefit of a multidimensional approach to equating with multidimensional data.
136

Improving the Transition Readiness Assessment Questionnaire (TRAQ) using Item Response Theory

Wood, David L., Johnson, Kiana R., McBee, Matthew 01 January 2017
Background: Measuring the acquisition of self-management and health care utilization skills is part of evidence-based transition practice. The Transition Readiness Assessment Questionnaire (TRAQ) is a validated 20-question and 5-factor instrument with a 5-point Likert response set using a Stages of Change Framework. Objective: To improve the performance of the TRAQ and allow more precise measurement across the full range of transition readiness skills (from precontemplation to initiation to mastery). Design/Methods: Using data from 506 previously completed TRAQs collected from several clinical practices, we used Mplus v.7.4 to apply a graded response model (GRM), examining item discrimination and difficulty. New questions were written and added across all domains to increase the difficulty and discrimination of the overall scale. To evaluate the performance of new items and the resulting factor structure of the revised scale we fielded a new version of the TRAQ (with a total of 30 items) using an online anonymous survey of first year college students (in process). Results: We eliminated the five least discriminating TRAQ items with minimal impact on the conditional test information. After item elimination (k = 15) the factor structure of the instrument was maintained with good quality, χ²(86) = 365.447, CFI = 0.977, RMSEA = 0.079, WRMR = 1.017. We also found that a majority of items could reliably discriminate only across lower levels of transition readiness (precontemplation to initiation) but could not discriminate at higher levels of transition readiness (action and mastery). Therefore we wrote 15 additional items intended to have higher difficulty. On the new 30 item TRAQ, confirmatory factor analysis, internal reliability and IRT results will be reported from a large sample of college students. Conclusion(s): Using IRT and factor analyses we eliminated 5 of 20 TRAQ items that were poorly discriminating.
We found that many of the items in the TRAQ could discriminate among those in the early stages of transition readiness, but could not discriminate among those in later stages of transition readiness. To have a more robust measure of transition readiness we added more difficult items and are evaluating the scale’s psychometric properties.
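The graded response model mentioned above can be sketched as follows for a single 5-category Likert item. This is a generic illustration of Samejima's model, not the Mplus analysis the authors ran; the function name and the example parameter values are hypothetical.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the graded response model.

    a: item discrimination; thresholds: ascending boundaries b_1 < ... < b_{K-1}.
    P(X >= k) is a logistic curve in theta; each category probability is the
    difference between two adjacent cumulative curves.
    """
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


if __name__ == "__main__":
    # Hypothetical easy item: all thresholds sit low on the readiness continuum,
    # so the response distribution barely changes among high-readiness respondents.
    for theta in (-2.0, 0.0, 2.0, 3.0):
        probs = grm_category_probs(theta, a=1.8, thresholds=[-2.5, -1.5, -0.8, 0.0])
        print(theta, [round(p, 3) for p in probs])
```

An item whose thresholds all lie at low trait levels separates respondents only in that region, which is the pattern the abstract reports for many original TRAQ items and the reason more difficult items were added.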
137

Using Item Response Theory to Develop a Shorter Version of the Transition Readiness Assessment Questionnaire (TRAQ)

Johnson, K. R., McBee, A. L., Wood, David L. 01 January 2016
No description available.
138

Augmented testing and effects on item and proficiency estimates in different calibration designs

Wall, Nathan Lane 01 May 2011
Broadening the term augmented testing to include a combination of multiple measures to assess examinee performance on a single construct, the issues of IRT item parameter and proficiency estimates were investigated. The intent of this dissertation is to determine if different IRT calibration designs result in differences in item and proficiency parameter estimates and to understand the nature of those differences. Examinees were sampled from a testing program in which each examinee was administered three mathematics assessments measuring a broad mathematics domain at the high school level. This sample of examinees was used to perform a real data analysis to investigate the item and proficiency estimates. A simulation study was also conducted based upon the real data. The factors investigated for the real data study included three IRT calibration designs and two IRT models. The calibration designs included: separately calibrating each assessment, calibrating all assessments in one joint calibration, and separately calibrating items in three distinct content areas. Joint calibration refers to the use of IRT methodology to calibrate two or more tests, which have been administered to a single group, together so as to place all of the items on a common scale. The two IRT models were the one- and three-parameter logistic models. Also investigated were five proficiency estimators: maximum likelihood estimates, expected a posteriori, maximum a posteriori, summed-score EAP, and test characteristic curve estimates. The simulation study included the same calibration designs and IRT models but the data were simulated with varying levels of correlations among the proficiencies to determine the effect upon the item parameter estimates. The main findings indicate that item parameter and proficiency estimates are affected by the IRT calibration design.
The discrimination parameter estimates of the three-parameter model were larger when calibrated under the joint calibration design for one assessment but not for the other two. Noting that equal item discrimination is an assumption of the 1-PL model, this finding raises questions as to the degree of model fit when the 1-PL model is used. Items on a second assessment had lower difficulty parameters in the joint calibration design while the item parameter estimates of the other two assessments were higher. Differences in proficiency estimates between calibration designs were also discovered, which were found to result in examinees being inconsistently classified into performance categories. Differences were observed with regard to the choice of IRT model. Finally, as the level of correlation among proficiencies increased in the simulated data, the differences observed in the item parameter estimates decreased. Based upon the findings, IRT item parameter estimates resulting from differing calibration designs should not be used interchangeably. Practitioners who use item pools should base the pool refreshment calibration design upon the one used to originally create the pool. Limitations of this study include the use of a single dataset consisting of high school examinees in only one subject area; thus, generalization of the research findings to other content areas or grade levels should be made with caution.
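Of the five proficiency estimators listed in this abstract, expected a posteriori (EAP) is straightforward to sketch. The version below assumes a standard normal prior, an evenly spaced quadrature grid, and a logistic model without a scaling constant; the function names and the item parameters in the usage example are hypothetical and do not come from the dissertation.

```python
import math


def p_correct(theta, a, b, c=0.0):
    """Logistic item response function (1PL when a is shared and c = 0, 3PL otherwise)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def eap_estimate(responses, items, n_points=61, lo=-4.0, hi=4.0):
    """EAP proficiency estimate: the posterior mean of theta over a quadrature grid.

    responses: list of 0/1 scores; items: list of (a, b, c) tuples.
    Assumes a standard normal prior (unnormalised, since constants cancel).
    """
    numerator = 0.0
    denominator = 0.0
    for i in range(n_points):
        theta = lo + (hi - lo) * i / (n_points - 1)
        prior = math.exp(-0.5 * theta * theta)
        likelihood = 1.0
        for u, (a, b, c) in zip(responses, items):
            p = p_correct(theta, a, b, c)
            likelihood *= p if u == 1 else 1.0 - p
        weight = prior * likelihood
        numerator += theta * weight
        denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    items = [(1.0, -1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
    print(eap_estimate([1, 1, 0], items))
```

Since the estimate depends on the item parameters passed in, parameters obtained under different calibration designs will generally yield different proficiency estimates for the same response pattern, which is the sensitivity the study examines.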
139

NKTS, SOFIE e ESQUADA: escalas para avaliar o conhecimento nutricional, as motivações para escolhas alimentares e a qualidade da dieta com aplicação da Teoria de Resposta ao Item / NKTS, SOFIE and ESQUADA: scales to evaluate the nutritional knowledge, the motivations influencing food choices, and the quality of diet using the Item Response Theory

Santos, Thanise Sabrina Souza 27 August 2019
Introduction - The study of nutritional knowledge, motivations for food choices, and diet quality provides important information for controlling the increasing prevalence of overweight and chronic diseases. Although the effectiveness of control strategies depends on a good-quality scale, the accuracy of existing scales is unknown or these scales are not in line with current recommendations.
Objective - This thesis aimed to develop scales to evaluate nutritional knowledge, health motivation influencing food choices, and diet quality using IRT analysis. Methods - To develop the scales of nutritional knowledge (NKTS) and health motivation for food choices (SOFIE), questionnaires already used in the HELENA study, a European multicenter investigation, were used: the Nutritional Knowledge Test (NKT) and Food Choices and Preferences (FCP). To develop the scale of diet quality (ESQUADA), a questionnaire based on the Food Guide for the Brazilian Population was created. The relevance and comprehensibility of this questionnaire were studied based on suggestions from nutritionists, in focus group discussions, and from Brazilian adolescents and young adults, in an online questionnaire. For each scale, the dimensionality of the items was analysed separately using exploratory factor analysis. The IRT analysis was applied to identify the items with the best discrimination of the information of interest, as well as to locate them at the different levels of the continuum and to calculate the IRT scores. The association of NKTS and SOFIE scores with food consumption and nutritional biomarkers, assessed in the HELENA study, was analyzed. BILOG-MG version 3, GGUM 2004, and R were used to construct the scales. The nutritionists' suggestions were analyzed in MAXQDA version 12. The suggestions from adolescents and young adults and the characterization of each scale level were processed in Microsoft Office Excel version 2013. The study of association was performed in Stata version 14. Results - Factor analysis and IRT analysis indicated that eleven items from NKT and sixteen items from FCP adequately evaluate nutritional knowledge and health motivation influencing food choices and compose NKTS and SOFIE, respectively. NKTS identifies individuals with basic, adequate, and advanced nutritional knowledge.
SOFIE classifies individuals with low, indifferent, and high health motivation influencing food choices. The scores from NKTS and SOFIE were positively associated with healthy food markers. Regarding ESQUADA, the nutritionists considered the items relevant to assess the quality of diet. However, they indicated the need to change the wording of some items as well as their response options. Adolescents and young adults easily understood all items. The factor analysis and IRT analysis retained 25 items in ESQUADA, making it possible to identify five levels of diet quality. Conclusion - NKTS, SOFIE, and ESQUADA, respectively, presented accurate measures of nutritional knowledge, health motivation for food choices, and quality of diet, allowing the characterization of their different levels. Other studies may analyze the relationship between ESQUADA and food consumption markers, as well as select the developed scales to evaluate these latent traits in other populations.
140

Health Knowledge & Health Behavior Outcomes in Adolescents with Elevated Blood Pressure

Fitzpatrick, Stephanie L 24 May 2011
The purpose of this current study was to examine the influence of cardiovascular health knowledge on dietary and physical activity changes in 15-17 year olds with elevated blood pressure. The sample consisted of 167 adolescents randomized into one of three treatment conditions (minimal, moderate, or intense). Each adolescent completed a fitness test (peak VO2), 24-hour dietary recall, 7 Day Activity Recall (kilocalories expended per day), Self-efficacy Questionnaire, and Stages of Change Questionnaire every three months. The Health Knowledge Assessment was given at baseline and at post-intervention. Classical test theory, confirmatory factor analysis, and item response theory frameworks were applied to examine psychometric properties of the Health Knowledge Assessment. Structural equation modeling was used to examine the change in health behaviors and the relationship with health knowledge, self-efficacy, and readiness for change. The 34-item Health Knowledge Assessment had good internal consistency and the items loaded onto a single factor at pretest and posttest. Furthermore, there was a good distribution of easy, moderate, and hard items at pretest, but additional hard items were needed at posttest. There were no treatment condition differences in level of health knowledge at pretest. The intense condition had significantly higher health knowledge than the minimal and moderate conditions at posttest; level of health knowledge for the moderate condition was significantly higher than the minimal condition at posttest. Level of nutrition knowledge at posttest was not associated with any of the dietary intake variables nor was level of exercise knowledge associated with the two physical activity variables at post-intervention. However, there was a marginally significant association between level of nutrition knowledge and nutrition self-efficacy at posttest. 
Nutrition self-efficacy and nutrition readiness for change at posttest were also associated with a decrease in sugar consumption at post-intervention. Implications of this study suggest that a cardiovascular health intervention for adolescents with elevated blood pressure, consisting of group sessions and/or individual sessions over the course of three to six months, was effective in terms of increasing cardiovascular health knowledge, self-efficacy, and readiness for change. Nonetheless, the role that health knowledge plays in health behavior change needs to be further examined.
