51 |
Effect of Unequal Sample Sizes on the Power of DIF Detection: An IRT-Based Monte Carlo Study with SIBTEST and Mantel-Haenszel Procedures
Awuor, Risper Akelo 04 August 2008
This simulation study examined the effect of unequal sample sizes on the statistical power of the SIBTEST and Mantel-Haenszel (M-H) procedures for detecting DIF of moderate and large magnitude. Item parameters were estimated and generated under the 2PLM using WinGen2 (Han, 2006). MULTISIM was used to simulate ability estimates and to generate the response data analyzed by SIBTEST. The SIBTEST procedure with regression correction was used to calculate the DIF statistics, namely the DIF effect size and the statistical significance of the bias. The older SIBTEST was used to calculate the DIF statistics for the M-H procedure. SAS provided the environment in which the ability parameters were simulated, the response data generated, and the DIF analyses conducted. Test items were examined to determine whether the a priori manipulated items demonstrated DIF. The results indicated that with unequal samples in any ratio, M-H had better Type I error rate control than SIBTEST. The results also indicated that not only the ratios but also the sample size and the magnitude of DIF influenced the error rate behavior of SIBTEST and M-H. With small samples and moderate DIF magnitude, both M-H and SIBTEST committed Type II errors when the reference-to-focal group sample size ratio was 1:.10, owing to low observed statistical power and inflated Type I error rates. / Ph. D.
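As a rough illustration of the Mantel-Haenszel approach named in this abstract, the sketch below computes the M-H common odds ratio across matched score levels and converts it to the ETS delta scale (MH D-DIF = -2.35 ln α). It is not the SAS or SIBTEST implementation used in the thesis; the function names, the 2x2 table layout, and the counts in the usage note are hypothetical.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score levels.

    Each stratum is a tuple (a, b, c, d) of counts (layout assumed here):
      a = reference group, correct     b = reference group, incorrect
      c = focal group, correct         d = focal group, incorrect
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty score level contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no discordant counts")
    return numerator / denominator


def mh_d_dif(alpha):
    """Delta-scale transformation of the common odds ratio.

    Negative values indicate an item that favors the reference group.
    """
    return -2.35 * math.log(alpha)
```

For example, a single hypothetical stratum `(30, 10, 20, 20)` gives an odds ratio of 3, and `mh_d_dif` of that value is negative, meaning the item favors the reference group among examinees matched on total score.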
|
52 |
Análise de questionários com itens constrangedores / Analysis of questionnaire with embarrassing items
Cúri, Mariana 11 August 2006
Scientific research in psychiatry often evaluates subjective characteristics of individuals, such as depression, anxiety, and phobias. Data are collected through questionnaires whose items try to identify the presence or absence of certain symptoms associated with the psychiatric morbidity of interest. Some of these items, however, may embarrass some respondents because they address socially questionable or even illegal characteristics or behaviors. This work proposes an item response theory model that differentiates the relationship between the probability of symptom presence and the severity of the morbidity for embarrassed and non-embarrassed individuals. Items that need this differentiation are called items with differential functioning (DIF items). Additionally, the model allows for the assumption that individuals who are embarrassed by an item may lie in their answers, in the sense of omitting the presence of a symptom. Applications of the proposed model to simulated data for 20-item questionnaires showed that the parameter estimates are close to their true values. The quality of the estimates worsens as the sample of individuals decreases, as the number of DIF items increases, and especially as the number of DIF items susceptible to lying increases. The application of the model to a real data set, collected to evaluate depression in adolescents, illustrates the difference in the response pattern of the item "crying spells" between men and women.
|
53 |
An Examination of the Psychometric Properties of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors: An Item Response Theory Approach
Moulton, Sara E. 01 December 2016
This research study examined the psychometric properties of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors (SRSS-IE) using Item Response Theory (IRT) methods among a sample of 2,122 middle school students. The SRSS-IE is a recently revised screening instrument aimed at identifying students who are potentially at risk for emotional and behavioral disorders (EBD). There are two studies included in this research. Study 1 utilized the Nominal Response and Generalized Partial Credit models of IRT to evaluate items from the SRSS-IE in terms of the degree to which the response options for each item functioned as intended by the scale developers and how well those response options discriminated among students who exhibited varying levels of EBD risk. Results from this first study indicated that the four response option configurations of the items on the SRSS-IE may not adequately discriminate among the frequency of externalizing and internalizing behaviors demonstrated by middle school students. Recommendations for item response option revisions or scale scoring revisions are discussed in this study. In study 2, differential item functioning (DIF) and differential step functioning (DSF) methods were used to examine differences in item and response option functioning according to student gender variables. Additionally, test information functions (TIFs) were used to determine whether preliminary recommendations for cut scores differ by gender. Results of this second study indicate that two of the items on the SRSS-IE systematically favor males over females and one item systematically favors females over males. Additionally, examination of TIFs demonstrated different degrees of measurement precision at various levels of theta for males and females on both the externalizing and internalizing constructs. Implications of these results are discussed in relation to possible revisions of the SRSS-IE items, cut scores, or scale scoring procedures.
|
54 |
Occupational performance in school settings : evaluation and intervention using the school AMPS
Munkholm, Michaela January 2010
Background: This thesis was designed to evaluate aspects of the reliability and validity of the School Version of the Assessment of Motor and Process Skills (School AMPS) (Fisher, Bryze, Hume, & Griswold, 2007), an observation-based evaluation of the quality of occupational performance when children perform schoolwork tasks in school settings. The long-term goal was to contribute to knowledge about children at risk or with mild disabilities who experience difficulties with occupational performance in school settings, and to describe how the School AMPS can be used when a true top-down process of planning and implementing school-based occupational therapy services is implemented in a Swedish context. Methods: In Study I, two different split-half methods were used to estimate the reliability of the School AMPS measures. These were cross-validated using the Rasch equivalent of Cronbach’s alpha. The standard error of measurement (SEm) was also calculated. In Studies II and III, many-facet Rasch analyses and/or relevant inferential statistics (e.g., ANOVA, t tests) were used to examine evidence of validity based on (1) internal structure related to differential item functioning (DIF), (2) relations to other variables (sensitivity) in terms of comparing groups (typically-developing children vs. children with mild disabilities), and (3) consequences of testing (benefits of testing) in terms of test fairness. In Study IV, ANOVA and t tests were used to examine relations to other variables in terms of the sensitivity of the School AMPS measures for detecting change based on repeated School AMPS evaluations pre- and post-intervention. Results: The three methods for estimating reliability of the School AMPS measures yielded high reliability coefficient estimates (≥0.73) and low SEms. Minimal DIF was identified, and despite this minimal DIF, the School AMPS measures were found to be free of differential test functioning. 
The School AMPS measures were sensitive enough to detect differences between groups as well as changes following consultative occupational therapy services provided in natural school settings. Conclusions: The results support the reliability and validity of the School AMPS scales and measures when used to evaluate quality of occupational performance in school settings. The results are also of clinical importance, as they provide evidence that occupational therapists can have confidence in the School AMPS measures when they are used in making decisions about individual students, planning interventions, and later performing follow-up evaluations to measure outcomes. We also have objective evidence that children with mild disabilities demonstrate diminished quality of "doing" when performing schoolwork tasks. The potential long-term benefit of such evidence may be to support or justify the need for children with mild disabilities to receive occupational therapy services within school settings in Sweden and, through collaboration with teachers, to plan and implement better targeted and more effective interventions.
|
55 |
Detection and Classification of DIF Types Using Parametric and Nonparametric Methods: A comparison of the IRT-Likelihood Ratio Test, Crossing-SIBTEST, and Logistic Regression Procedures
Lopez, Gabriel E. 01 January 2012
The purpose of this investigation was to compare the efficacy of three methods for detecting differential item functioning (DIF). The performance of the crossing simultaneous item bias test (CSIBTEST), the item response theory likelihood ratio test (IRT-LR), and logistic regression (LOGREG) was examined across a range of experimental conditions including different test lengths, sample sizes, DIF and differential test functioning (DTF) magnitudes, and mean differences in the underlying trait distributions of comparison groups, herein referred to as the reference and focal groups. In addition, each procedure was implemented using both an all-other anchor approach, in which the IRT-LR baseline model, CSIBTEST matching subtest, and LOGREG trait estimate were based on all test items except for the one under study, and a constant anchor approach, in which the baseline model, matching subtest, and trait estimate were based on a predefined subset of DIF-free items. Response data for the reference and focal groups were generated using known item parameters based on the three-parameter logistic item response theory model (3-PLM). Various types of DIF were simulated by shifting the generating item parameters of select items to achieve desired DIF and DTF magnitudes based on the area between the groups' item response functions. Power, Type I error, and Type III error rates were computed for each experimental condition based on 100 replications, and effects were analyzed via ANOVA. Results indicated that the procedures varied in efficacy, with LOGREG implemented using the all-other approach providing the best balance of power and Type I error rate. However, none of the procedures were effective at identifying the type of DIF that was simulated.
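The idea of sizing DIF by the area between the groups' item response functions can be sketched as follows. This is only an illustration under assumptions, not the generation procedure used in the dissertation: the 3PL function is written without the optional scaling constant, the integration range of -4 to 4 and the midpoint rule are arbitrary choices, and all names and parameter values are hypothetical.

```python
import math


def irf_3pl(theta, a, b, c):
    """Three-parameter logistic item response function.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def unsigned_area(ref_params, focal_params, lo=-4.0, hi=4.0, steps=800):
    """Midpoint-rule approximation of the unsigned area between two
    item response functions over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * width
        gap = irf_3pl(theta, *ref_params) - irf_3pl(theta, *focal_params)
        total += abs(gap) * width
    return total
```

When the two groups share the same discrimination and guessing parameters, the exact area over the whole real line is (1 − c)|b₂ − b₁|, so shifting difficulty by 0.5 with c = 0.2 should give a value just under 0.4 on a finite range, which is a convenient check on the numerical result.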
|
56 |
Burnout, work, stress of conscience and coping among female and male patrolling police officers / Utbrändhet, arbete, samvetsstress och coping hos kvinnliga och manliga poliser
Backteman-Erlanson, Susann January 2013
Background. Police work is a stressful occupation with frequent exposure to traumatic events, and psychological strain from work might increase the risk of burnout. This thesis focuses on patrolling police officers (PPOs), who work most of their time in the community and have daily contact with the public. Since police work is traditionally a male-coded occupation, we assume that there are differences between women and men in burnout as well as in experiences of the psychosocial work environment. Aim. The overall aim of this thesis is to explore burnout, psychosocial and physical work environment, coping strategies, and stress of conscience among patrolling police officers, taking gender into consideration. Methods. This thesis employs both qualitative and quantitative methods. In Paper I, a qualitative approach with narrative interviews was used, in which male PPOs described experiences of traumatic situations when caring for victims of traffic accidents. A convenience sample of nine male PPOs from a mid-sized police authority was recruited. Interviews were analyzed using qualitative content analysis. Papers II, III, and IV were based on a cross-sectional survey of a randomly selected sample, stratified for gender, from all 21 local police authorities in Sweden. In the final sample, 1554 PPOs were invited (778 women, 776 men); the response rate was 55% (n=856) in total, 56% for women (n=437) and 53% for men (n=419). The survey included a self-administered questionnaire based on instruments measuring burnout, stress of conscience, psychosocial and physical work environment, and coping. Results. Findings from Paper I were presented in three themes: “being secure with the support system,” “being confident about prior successful actions,” and “being burdened with uncertainty.” Results from Paper II showed high levels of emotional exhaustion (EE), 30% for female PPOs and 26% for male PPOs. 
High levels of depersonalization (DP) were reported for 52% of female PPOs and 60% of male PPOs. Multiple logistic regression showed that stress of conscience (SCQ-A), high demand, and organizational climate increased the risk of EE for female PPOs. For male PPOs, stress of conscience (SCQ-A), low control, and high demand increased the risk of EE. Independent of gender, stress of conscience (SCQ-A) increased the risk of DP. Psychometric properties of the WOCQ were investigated with exploratory and confirmatory factor analysis, and a six-factor solution was confirmed. DIF in relation to gender was detected for a third of the items. In Paper IV, a blockwise hierarchical multiple regression analysis was performed to investigate the predictive impact of psychological demand, decision latitude, social support, coping strategies, and stress of conscience on EE as well as DP. Findings revealed that, regardless of gender, the risk of EE and DP increased with a troubled conscience among the PPOs. Conclusion. “Being burdened with uncertainty” in this male-dominated context indicates that the PPOs did not feel confident talking about traumatic situations, which might influence their coping strategies when arriving at a similar situation. This finding can be related to Papers II and IV, which showed that stress of conscience increased the risk of both EE and DP. The associations between troubled conscience and the risk of experiencing both emotional exhaustion and depersonalization indicate that stress of conscience should be considered when studying the influence of the psychosocial work environment on burnout. Results from this study show that the psychosocial work environment is not satisfactory and needs improvement for patrolling police officers in Sweden. 
Further studies using both qualitative and quantitative (longitudinal) methods are needed to improve knowledge in this area and to improve the conditions for preventive and rehabilitative actions.
|
57 |
Towards establishing the equivalence of the IsiXhosa and English versions of the Woodcock Muñoz language survey : an item and construct bias analysis of the verbal analogies scale
Roomaney, Rizwana January 2010
This study formed part of a larger project concerned with the adaptation of a test of cognitive academic language proficiency, the Woodcock Muñoz Language Survey (WMLS). The WMLS has been adapted from English into isiXhosa, and the present study is located within the broader study concerned with establishing overall equivalence between the two language versions of the WMLS. It was primarily concerned with the Verbal Analogies (VA) scale. Previous research on this scale has demonstrated promising results, but continues to find evidence of some inequivalence. This study aimed to cross-validate previous research on the two language versions of the WMLS and to improve on methodological issues by employing matched groups. It drew upon an existing dataset from the larger research project. The study employed a monolingual matched two-group design consisting of 150 mainly English-speaking and 149 mainly isiXhosa-speaking learners in grades 6 and 7. This study had two sub-aims. The first was to investigate item bias by identifying DIF items in the VA scale across the isiXhosa and English versions using logistic regression and the Mantel-Haenszel procedure. Five items were identified as DIF by both techniques. The second sub-aim was to evaluate construct equivalence between the isiXhosa and English versions of the WMLS on the VA scale by conducting a factor analysis on the tests after removal of the DIF items. Two factors were requested during the factor analysis. The first factor displayed significant loadings across both language versions and was identified as a stable factor. This was confirmed by Tucker’s phi and the scatter plot. The second factor was stable for the English version but not for the isiXhosa version. Tucker’s phi and the scatter plot indicated that this factor is not structurally equivalent across the two language versions. / Magister Artium (Psychology) - MA(Psych)
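Tucker's phi, mentioned in this abstract as the check on factor stability, is a congruence coefficient between two columns of factor loadings: the sum of cross-products divided by the square root of the product of the sums of squares. The sketch below is a minimal illustration only; the function name and the loadings in the usage note are hypothetical and are not taken from the study.

```python
import math


def tucker_phi(loadings_a, loadings_b):
    """Congruence coefficient between two factor-loading vectors
    defined over the same items, in the same order."""
    if len(loadings_a) != len(loadings_b):
        raise ValueError("both versions must load on the same items")
    cross = sum(x * y for x, y in zip(loadings_a, loadings_b))
    norm = math.sqrt(sum(x * x for x in loadings_a)
                     * sum(y * y for y in loadings_b))
    if norm == 0:
        raise ValueError("a loading vector is all zeros")
    return cross / norm
```

For instance, `tucker_phi([0.6, 0.7, 0.8], [0.3, 0.35, 0.4])` is 1 up to rounding, because the coefficient is insensitive to a common rescaling of the loadings; values well below 1 suggest the factor does not have the same structure in both language versions.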
|
58 |
混合試題與受試者模型於試題差異功能分析之研究 / A Mixture Items-and-Examinees Model Analysis on Differential Item Functioning
黃馨瑩, Huang, Hsin Ying Unknown Date
Drawing on the multilevel mixture item response theory model and the random item mixture model, this study proposes the mixture items-and-examinees (MIE) model. The purpose of the study was to assess how well the model detects differential item functioning (DIF) and recovers parameters under different sample sizes and different numbers of DIF items. The results showed that with large samples and more DIF items, the MIE model recovered parameters accurately, correctly identified whether items exhibited DIF, produced good item difficulty estimates, and classified examinees into the correct groups. Its estimation performance was also similar to that of the random item mixture model. The findings suggest that future studies apply the MIE model to analyses of large-scale education databases and add further variables to the model.
|
59 |
The Differential Item Functioning (DIF) Analysis Of Mathematics Items In The International Assessment Programs
Yildirim, Huseyin Husnu 01 April 2006
Cross-cultural studies, like TIMSS and PISA 2003, have been conducted since the 1960s with the idea that these assessments can provide a broad perspective for evaluating and improving education. In addition, countries can assess their relative positions in mathematics achievement among their competitors in the global world. However, because of the different cultural and language settings of different countries, these international tests may not function as expected across all countries. Thus, tests may not be equivalent, or fair, linguistically and culturally across the participating countries. In this context, the present study aimed at assessing the equivalence of mathematics items of TIMSS 1999 and PISA 2003 across cultures and languages, to find out whether mathematics achievement possesses any culture-specific aspects. For this purpose, the present study assessed Turkish and English versions of TIMSS 1999 and PISA 2003 mathematics items with respect to (a) psychometric characteristics of items, and (b) possible sources of Differential Item Functioning (DIF) between these two versions. The study used Restricted Factor Analysis, Mantel-Haenszel Statistics, and Item Response Theory Likelihood Ratio methodologies to determine DIF items. The results revealed that there were adaptation problems in both the TIMSS and PISA studies. However, it was still possible to determine a subtest of items functioning fairly between cultures, to form a basis for a cross-cultural comparison. In PISA, there was a high rate of agreement among the DIF methodologies used. In TIMSS, however, the agreement rate decreased considerably, possibly because the rate of differentially functioning items within TIMSS was higher, and differential guessing and differential discrimination were also issues in the test. The study
also revealed that items requiring competencies of reproduction of practiced knowledge, knowledge of facts, performance of routine procedures, and application of technical skills were less likely to be biased against Turkish students with respect to American students at the same ability level. On the other hand, items requiring students to communicate mathematically, items where various results must be compared, and items that had a real-world context were less likely to be in favor of Turkish students.
|
60 |
A Multivariate Analysis In Detecting Differentially Functioning Items Through The Use Of Programme For International Student Assessment (PISA) 2003 Mathematics Literacy Items
Cet, Selda 01 April 2006
Differential item functioning (DIF) analyses investigate whether individuals with the same ability in different groups also show similar performance on an item. In matching individuals of the same ability, most methodologies use total scores of tests that are usually constructed to be unidimensional. The purpose of the present study is to evaluate the PISA 2003 mathematics literacy items using a DIF methodology that matches students with a multidimensional approach instead of a single total score, in order to improve the matching for DIF analyses.
In the study, the factor structure of the tests was determined via both exploratory and confirmatory analyses in a complementary fashion. DIF analyses were then conducted using logistic regression (LR) and Mantel-Haenszel methods. Analyses showed that the matching criterion improved when multivariate analyses were used. The number of DIF items decreased when the matching criterion was defined on multiple criterion scores, such as mathematical literacy and problem solving scores or two different mathematical literacy subtest scores.
In addition, qualitative reviews were conducted and the distribution of DIF items by content categories, cognitive demands, item types, item text, visual-spatial factors, and linguistic properties of items was examined to explain the differential performance. Curriculum, cultural, and translation differences were the main criteria for the qualitative analyses of DIF items. The results imply that curriculum and translation differences in items might be causing the DIF across the Turkish and English versions of the tests.
|