About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

The Differential Item Functioning (DIF) Analysis Of Mathematics Items In The International Assessment Programs

Yildirim, Huseyin Husnu 01 April 2006 (has links) (PDF)
Cross-cultural studies like TIMSS and PISA 2003 have been conducted since the 1960s on the premise that such assessments can provide a broad perspective for evaluating and improving education. In addition, countries can assess their relative positions in mathematics achievement among their competitors in the global world. However, because of the different cultural and language settings of the participating countries, these international tests may not function as intended in every country. The tests may therefore not be equivalent, or fair, linguistically and culturally across the participating countries. In this context, the present study aimed at assessing the equivalence of the mathematics items of TIMSS 1999 and PISA 2003 across cultures and languages, to find out whether mathematics achievement possesses any culture-specific aspects. For this purpose, the study assessed the Turkish and English versions of the TIMSS 1999 and PISA 2003 mathematics items with respect to (a) the psychometric characteristics of the items, and (b) possible sources of Differential Item Functioning (DIF) between the two versions. The study used Restricted Factor Analysis, Mantel-Haenszel statistics and Item Response Theory Likelihood Ratio methodologies to determine DIF items. The results revealed adaptation problems in both the TIMSS and PISA studies. However, it was still possible to determine a subtest of items functioning fairly between cultures, to form a basis for a cross-cultural comparison. In PISA, there was a high rate of agreement among the DIF methodologies used. In TIMSS, however, the agreement rate decreased considerably, possibly because the rate of differentially functioning items within TIMSS was higher, and differential guessing and differential discrimination were also issues in the test. The study also revealed that items requiring competencies of reproduction of practiced knowledge, knowledge of facts, performance of routine procedures, and application of technical skills were less likely to be biased against Turkish students with respect to American students at the same ability level. On the other hand, items requiring students to communicate mathematically, items where various results must be compared, and items that had a real-world context were less likely to be in favor of Turkish students.
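The Mantel-Haenszel procedure named in this abstract pools 2x2 (group x correct/incorrect) tables across ability strata into a common odds-ratio estimate and a chi-square test. A minimal sketch with invented toy tables, not the thesis's actual analysis code (the function name and the numbers are assumptions of the example):

```python
def mantel_haenszel_dif(strata):
    """Mantel-Haenszel DIF statistics for a single item.

    strata: one (a, b, c, d) 2x2 table per ability stratum, where
        a/b = reference group correct/incorrect,
        c/d = focal group correct/incorrect.
    Returns (alpha_mh, chi2_mh): the common odds-ratio estimate
    (1.0 = no DIF) and the continuity-corrected MH chi-square (1 df).
    """
    num = den = 0.0              # odds-ratio components
    a_sum = e_sum = v_sum = 0.0  # chi-square components
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue             # a degenerate stratum contributes nothing
        num += a * d / n
        den += b * c / n
        n_ref, n_foc = a + b, c + d   # row margins (group sizes)
        m1, m0 = a + c, b + d         # column margins (correct / incorrect)
        a_sum += a
        e_sum += n_ref * m1 / n
        v_sum += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))
    alpha = num / den if den else float("inf")
    chi2 = (abs(a_sum - e_sum) - 0.5) ** 2 / v_sum if v_sum else 0.0
    return alpha, chi2

# Invented tables with identical odds in every stratum, i.e. no DIF:
no_dif = [(30, 10, 30, 10), (20, 20, 20, 20)]
alpha, chi2 = mantel_haenszel_dif(no_dif)
```

An item is flagged when chi2 exceeds the chi-square critical value (3.84 at the .05 level) or when alpha departs markedly from 1.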
62

A Multivariate Analysis In Detecting Differentially Functioning Items Through The Use Of Programme For International Student Assessment (PISA) 2003 Mathematics Literacy Items

Cet, Selda 01 April 2006 (has links) (PDF)
Differential item functioning analysis investigates whether individuals of the same ability in different groups also show similar performance on an item. In matching individuals of the same ability, most methodologies use the total scores of tests, which are usually constructed to be unidimensional. The purpose of the present study is to evaluate the PISA 2003 mathematics literacy items through a DIF methodology that uses a multidimensional approach to matching students instead of a single total score, thereby improving the matching for DIF analyses. In the study, the factor structure of the tests was determined via both exploratory and confirmatory analyses in a complementary fashion. DIF analyses were then conducted using logistic regression (LR) and Mantel-Haenszel methods. The analyses showed that the matching criterion improved when multivariate analyses were used. The number of DIF items decreased when the matching criterion was defined on the basis of multiple criterion scores, such as the mathematical literacy and problem-solving scores or two different mathematical literacy subtest scores. In addition, qualitative reviews and an examination of the distribution of DIF items by content category, cognitive demand, item type, item text, visual-spatial factors and linguistic properties were carried out to explain the differential performance. Curriculum, cultural and translation differences were the main criteria for the qualitative analyses of the DIF items. The results imply that curriculum and translation differences in items might be causing the DIF across the Turkish and English versions of the tests.
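The logistic-regression DIF test mentioned here compares a model predicting the item response from the matching score alone against one that also includes group membership; a significant likelihood-ratio improvement signals uniform DIF. A self-contained sketch, not the study's code (the plain gradient-ascent fitter and the deterministic toy data are invented for illustration):

```python
import math

def fit_logistic(X, y, lr=0.5, iters=4000):
    """Fit logistic regression by plain gradient ascent on the log-likelihood."""
    w = [0.0] * len(X[0])
    n = len(y)
    for _ in range(iters):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = 1 / (1 + math.exp(-sum(wj * xj for wj, xj in zip(w, xi))))
            for j, xj in enumerate(xi):
                grad[j] += (yi - p) * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def log_lik(X, y, w):
    """Bernoulli log-likelihood of a fitted model."""
    ll = 0.0
    for xi, yi in zip(X, y):
        p = 1 / (1 + math.exp(-sum(wj * xj for wj, xj in zip(w, xi))))
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

# Deterministic toy data: at every matching score the focal group (group=1)
# answers the studied item correctly less often than the reference group.
X, y = [], []
for score in range(6):
    for group, n_correct in ((0, score + 4), (1, score)):  # correct out of 10
        for i in range(10):
            X.append([1.0, float(score), float(group)])
            y.append(1 if i < n_correct else 0)

X_reduced = [row[:2] for row in X]      # intercept + matching score
w_reduced = fit_logistic(X_reduced, y)
w_full = fit_logistic(X, y)             # ... + group membership
g2 = 2 * (log_lik(X, y, w_full) - log_lik(X_reduced, y, w_reduced))
# g2 is referred to a chi-square with 1 df (critical value 3.84 at alpha = .05)
```

A negative group coefficient in the full model indicates the direction of the uniform DIF; non-uniform DIF would add a score-by-group interaction term in the same way.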
63

Análise de questionários com itens constrangedores / Analysis of questionnaires with embarrassing items

Mariana Cúri 11 August 2006 (has links)
Psychiatric scientific research often evaluates subjective characteristics of individuals such as depression, anxiety and phobias. Data are collected through questionnaires whose items try to identify the presence or absence of certain symptoms associated with the psychiatric morbidity of interest. Some of these items, however, may embarrass some respondents because they touch on socially questionable or even illegal characteristics or behaviours. An item response theory model is proposed in this work to differentiate the relationship between the probability of symptom presence and the severity of the morbidity for embarrassed and non-embarrassed individuals. Items that require this differentiation are called differential item functioning (DIF) items. Additionally, the model allows for the assumption that individuals embarrassed by an item may lie in their responses, in the sense of omitting the presence of a symptom. Applications of the proposed model to data simulated for 20-item questionnaires showed that the parameter estimates are close to their true values. The quality of the estimates worsens as the sample of individuals decreases, as the number of DIF items increases and, above all, as the number of DIF items susceptible to lying increases. The application of the model to a set of real data, collected to evaluate depression in adolescents, illustrates the difference in the response pattern of the item "crying spells" between men and women.
64

The development and evaluation of Africanised items for multicultural cognitive assessment

Bekwa, Nomvuyo Nomfusi 01 1900 (has links)
Nothing in life is to be feared, it is only to be understood. Now is the time to understand more, so that we may fear less. (Marie Curie) Debates about how best to test people from different contexts and backgrounds continue to hold the spotlight in testing and assessment. In an effort to contribute to these debates, the purpose of the study was to develop and evaluate the viability and utility of nonverbal figural reasoning ability items inspired by African cultural artefacts such as African material prints, art, decorations, beadwork and paintings. The research was conducted in two phases: phase 1 focused on the development of the new items, while phase 2 evaluated them. The aims of the study were to develop items inspired by African art and cultural artefacts in order to measure general nonverbal figural reasoning ability; to evaluate the viability of the items in terms of their appropriateness in representing the African art and cultural artefacts, specifically to determine the face and content validity of the items from a cultural perspective; and to evaluate the utility of the items in terms of their psychometric properties. These elements were investigated using an exploratory sequential mixed-method research design, with the quantitative strand embedded in phase 2. For sampling purposes, a sequential mixed-method sampling design and non-probability sampling strategies were used, specifically purposive and convenience sampling. Data collection methods included interviews with a cultural expert and a colour-blind person, open-ended questionnaires completed by school learners, and test administration to a group of 946 participants undergoing a sponsored basic career-related training and guidance programme. Content analysis was used for the qualitative data, while statistical analysis, mainly based on the Rasch model, was used for the quantitative data.
The results of phase 1 were positive and supported further development of the new items; based on this feedback, 200 new items were developed. This final pool of items was then used for phase 2, the evaluation of the new items. The statistical analysis of the new items indicated acceptable psychometric properties of the general reasoning ("g" or fluid ability) construct. The item difficulty values (p-values) for the new items were determined using classical test theory (CTT) analysis and ranged from 0.06 (most difficult item) to 0.91 (easiest item). Rasch analysis showed that the new items were unidimensional and adequately targeted to the ability level of the participants, although some elements would need to be improved. The reliability of the new items was determined using the Cronbach alpha reliability coefficient (α) and the person separation index (PSI), and both methods indicated similar indices of internal consistency (α = 0.97; PSI = 0.96). Gender-related differential item functioning (DIF) was investigated, and the majority of the new items did not indicate any significant differences between the gender groups. Construct validity was determined from the relationship between the new items and the Learning Potential Computerised Adaptive Test (LPCAT), which uses traditional item formats to measure fluid ability. The correlations between the total score of the new items and the pre- and post-tests were 0.616 and 0.712 respectively. The new items were thus confirmed to be measuring fluid ability using nonverbal figural reasoning ability items. Overall, the results were satisfactory in indicating the viability and utility of the new items. The main limitation of the research was that, because the sample was not representative of the South African population, the scope for generalisation was limited. This led to a further limitation, namely that it was not possible to conduct important DIF analyses for various other subgroups. Further research has been recommended to build on this initiative. / Industrial and Organisational Psychology
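The CTT item difficulties (p-values) and Cronbach's alpha reported in this abstract are straightforward to compute from a 0/1 response matrix. A minimal sketch with an invented toy dataset (the names and numbers are illustrative, not the thesis's data):

```python
def item_p_values(responses):
    """Classical test theory item difficulty: proportion correct per item.

    responses: rows = examinees, columns = 0/1 item scores.
    """
    n = len(responses)
    return [sum(col) / n for col in zip(*responses)]

def cronbach_alpha(responses):
    """Cronbach's alpha = (k/(k-1)) * (1 - sum(item variances) / variance(totals))."""
    k = len(responses[0])
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_vars = [var(col) for col in zip(*responses)]
    totals = [sum(row) for row in responses]
    return (k / (k - 1)) * (1 - sum(item_vars) / var(totals))

# Invented toy data: four examinees, three items of increasing difficulty.
scores = [[1, 1, 1],
          [1, 1, 0],
          [1, 0, 0],
          [0, 0, 0]]
difficulties = item_p_values(scores)   # one proportion-correct value per item
alpha = cronbach_alpha(scores)
```

Higher p-values mean easier items, which is why the abstract labels 0.91 the easiest item and 0.06 the most difficult.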
65

Towards establishing the equivalence of the English version of the verbal analogies scale of the Woodcock Muñoz Language Survey across English and Xhosa first language speakers

Ismail, Ghouwa January 2010 (has links)
Magister Artium - MA / In the majority of schools in South Africa (SA), learners commence their education in English. This English milieu poses a considerable challenge for English second-language speakers. In an attempt to bridge the gap between English as the main medium of instruction and the nine indigenous languages of the country, and to assist with the implementation of mother-tongue-based bilingual education, this study focuses on the cross-validation of a monolingual English test used in the assessment of multilingual or bilingual learners in the South African context. This test, the Woodcock Muñoz Language Survey (WMLS), is extensively used in Additive Bilingual Education in the United States. The present study is a sub-study of a broader study in which the original WMLS (American-English version) was adapted into SA English and Xhosa. For this specific sub-study, the researcher investigated the scalar equivalence of the adapted English version of the Verbal Analogies (VA) subscale of the WMLS across English first-language speakers and Xhosa first-language speakers. This was achieved by utilising differential item functioning (DIF) and construct bias statistical techniques. The Mantel-Haenszel DIF detection method was employed to detect DIF, while construct equivalence was examined by means of exploratory factor analysis (EFA) utilising an a priori two-factor structure. Tucker's phi coefficient was used to assess the congruence of the construct across the two language groups. / South Africa
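Tucker's phi, used in this record to judge construct congruence, is the cosine-style congruence coefficient between the two groups' factor-loading vectors. A small sketch with invented loadings (the values and variable names are assumptions of the example, not results from the study):

```python
import math

def tuckers_phi(a, b):
    """Tucker's congruence coefficient between two factor-loading vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den

# Invented loadings for one factor in the two language groups;
# values of roughly .95 and above are usually read as congruent.
english_loadings = [0.72, 0.65, 0.58, 0.70]
xhosa_loadings = [0.70, 0.68, 0.55, 0.73]
phi = tuckers_phi(english_loadings, xhosa_loadings)
```

Because phi compares the shape of the loading patterns rather than their absolute size, it complements mean-score comparisons when evaluating equivalence.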
66

Exploring the scalar equivalence of the picture vocabulary scale of the Woodcock Munoz language survey across rural and urban isiXhosa-speaking learners

Brown, Qunita January 2012 (has links)
Magister Artium (Psychology) - MA(Psych) / The fall of apartheid and the rise of democracy have brought assessment issues in multicultural societies to the forefront in South Africa. The rise of multicultural assessment demands the development of tests that are culturally relevant to enhance fair testing practices, and issues of bias and equivalence of tests become increasingly important. This study forms part of a larger project titled the Additive Bilingual Education Project (ABLE). The Woodcock Munoz Language Survey (WMLS) was specifically selected to evaluate the language aims in the project, and was adapted from English to isiXhosa. Previous research has indicated that one of the scales in the adapted isiXhosa version of the WMLS, namely the Picture Vocabulary Scale (PV), displays some item bias, or differential item functioning (DIF), across rural and urban isiXhosa learners. Research has also indicated that differences in dialects can have an impact on test takers’ scores. It is therefore essential to explore the structural equivalence of the adapted isiXhosa version of the WMLS on the PV scale across rural and urban isiXhosa learners, and to ascertain whether DIF is affecting the extent to which the same construct is measured across both groups. The results contribute to establishing the scalar equivalence of the adapted isiXhosa version of the WMLS across rural and urban isiXhosa-speaking learners. Secondary Data Analysis (SDA) was employed because this allowed the researcher to re-analyse the existing data in order to further evaluate construct equivalence. The sample of the larger study consisted of 260 learners, both male and female, selected from a population of Grade 6 and 7 learners attending schools in the Eastern Cape. The data was analysed by using the statistical programme Comprehensive Exploratory Factor Analysis (CEFA) and the Statistical Package for Social Sciences (SPSS). Exploratory factor analysis and the Tucker’s phi coefficient were used. 
The results indicated distinct factor loadings for both groups, but slight differences were observed which raised concerns about construct equivalence. Scatter plots were employed to investigate further, which also gave cause for concern. It was therefore concluded that construct equivalence was only partially attained. In addition, the Cronbach’s Alpha per factor was calculated, showing that internal consistency was displayed only for Factor 1 and not for Factor 2 for the rural group, or both factors for the urban group. Scalar equivalence across the two groups must therefore be explored further.
67

An evaluation of group differences and items bias, across rural isiXhosa learners and urban isiXhosa learners, of the isiXhosa version of the Woodcock Muñoz Language Survey (WMLS)

Silo, Unathi Lucia January 2010 (has links)
Magister Psychologiae - MPsych / In many countries defined by multilingualism, language has been identified as a major influence on psychological and educational testing. In South Africa (SA), factors such as policy changes and social inequalities also influence testing. The literature supports the translation and adaptation of tests used in such contexts in order to avoid bias caused by language. Different language versions of tests then need to be evaluated for equivalence, to ensure that scores across the different language versions have the same meaning. Differences in dialects may also have an impact on the results of such tests. Results of an isiXhosa version of the Woodcock Muñoz Language Survey (WMLS), a test used to measure isiXhosa learners' language proficiency, show significant mean score differences on the test scores across rural and urban first-language speakers of isiXhosa. These results indicate a possible problem regarding rural and urban dialects during testing. This thesis evaluates the item bias of the subtests of this version of the WMLS across rural and urban isiXhosa learners. This was accomplished by evaluating the reliability and item characteristics for group differences, and by evaluating differential item functioning across these two groups on the subtests of the WMLS. The sample in this thesis comprised 260 isiXhosa learners from the Eastern Cape Province in grades 6 and 7, both male and female. The sample was collected in two phases: (1) secondary data from 49 rural and 133 urban isiXhosa learners was included in the sample; (2) to supplement the secondary data and equalise the two sample groups, primary data was collected from 78 rural isiXhosa learners. All ethical considerations were included in this thesis. The results were surprising and unexpected. Two of the subtests in the WMLS showed evidence of scalar equivalence, as only a few items were identified as problematic.
However, two of the subtests demonstrated more problematic items. These results mean that two subtests of the WMLS that demonstrated evidence of scalar equivalence can be used to measure the construct of language proficiency, while the other two sub-tests that showed problematic items need to be further investigated, as the responses given by learners on these items seem to be determined by their group membership and not by their ability.
68

A Structural and Psychometric Evaluation of a Situational Judgment Test: The Workplace Skills Survey

Wei, Min 08 1900 (has links)
Some basic but desirable employability skills are antecedents of job performance. The Workplace Skills Survey (WSS) is a 48-item situational judgment test (SJT) used to assess non-technical workplace skills for both entry-level and experienced workers. Unfortunately, the psychometric evidence supporting the use of its scores is far from adequate. The purpose of the current study was two-fold: (a) to examine the proposed structure of WSS scores using confirmatory factor analysis (CFA), and (b) to explore WSS item functioning and performance using item response theory (IRT). A sample of 1,018 Jamaican unattached youth completed the WSS instrument as part of a longitudinal study on the efficacy of a youth development program in Jamaica. Three CFA models were tested for the construct validity of WSS scores. Estimates of item difficulty, item discrimination, and examinee proficiency were obtained with IRT and plotted as item characteristic curves (ICCs) and item information curves (IICs). Results showed that the WSS performed quite well as a whole and provided precise measurement, especially for respondents at latent trait levels of -0.5 and +1.5. However, modifications to some items were recommended. The CFA analyses supported the one-factor model, while the six-factor and higher-order models were not supported. Several directions for future research are suggested.
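The ICCs and IICs mentioned in this abstract derive from the item response function; for a two-parameter logistic (2PL) item, the information function shows where on the latent trait scale an item measures most precisely. A brief sketch with invented item parameters (the 2PL form and the parameter values are assumptions of the example, not estimates from the study):

```python
import math

def icc_2pl(theta, a, b):
    """2PL item characteristic curve: P(correct | ability theta)."""
    return 1 / (1 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P); it peaks at theta = b."""
    p = icc_2pl(theta, a, b)
    return a * a * p * (1 - p)

# A hypothetical item with discrimination a = 1.2 and difficulty b = 0.5:
# scan a theta grid to locate where the item measures most precisely.
info, best_theta = max(
    (item_information(t / 10, 1.2, 0.5), t / 10) for t in range(-30, 31)
)
```

Summing the information curves of all items in a test shows which latent-trait regions (such as the -0.5 and +1.5 levels the abstract highlights) are measured with the greatest precision.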
69

Identifying Unbiased Items for Screening Preschoolers for Disruptive Behavior Problems

Studts, Christina R., Polaha, Jodi, van Zyl, Michiel A. 25 October 2016 (has links)
Objective: Efficient identification and referral to behavioral services are crucial in addressing early-onset disruptive behavior problems. Existing screening instruments for preschoolers are not ideal for pediatric primary care settings serving diverse populations. Eighteen candidate items for a new brief screening instrument were examined to identify those exhibiting measurement bias (i.e., differential item functioning, DIF) by child characteristics. Method: Parents/guardians of preschool-aged children (N = 900) from four primary care settings completed two full-length behavioral rating scales. Items measuring disruptive behavior problems were tested for DIF by child race, sex, and socioeconomic status using two approaches: item response theory-based likelihood ratio tests and ordinal logistic regression. Results: Of 18 items, eight were identified with statistically significant DIF by at least one method. Conclusions: The bias observed in 8 of 18 items made them undesirable for screening diverse populations of children. These items were excluded from the new brief screening tool.
70

Item Discrimination, Model-Data Fit, and Type I Error Rates in DIF Detection using Lord's χ², the Likelihood Ratio Test, and the Mantel-Haenszel Procedure

Price, Emily A. 11 June 2014 (has links)
No description available.
