1. Differential Item Functioning on the Armed Services Vocational Aptitude Battery
Gibson, Shanan Gwaltney IV, 19 November 1998
Using Item Response Theory (IRT) methodologies, the Armed Services Vocational Aptitude Battery (ASVAB) was examined for differential item functioning (DIF) on the basis of crossed gender and ethnicity variables. Both the Mantel-Haenszel procedure and an IRT area-based technique were used to assess the degree of uniform and non-uniform DIF in a sample of ASVAB takers. The analysis was performed such that each subgroup of interest functioned as the focal group to be compared to the male reference group. This type of DIF analysis allowed for comparisons within ethnic group, within gender group, and across crossed ethnic/gender groups. The groups analyzed were White, Black, and Hispanic males, and White and Black females. It was hypothesized that DIF would be found, at the scale level, on several of the ASVAB sub-tests as a result of unintended latent trait demands of items. In particular, tests composed of items requiring specialized jargon, visuospatial ability, or advanced English vocabulary were anticipated to show bias toward White males and/or White females.
Findings were mixed. At the item level, DIF fluctuated greatly, and numerous instances of DIF favoring the reference as well as the focal group were found. At the scale level, inconsistencies existed across forms and versions: tests varied in their tendency to be biased against the focal group of interest and, at times, performed contrary to expectations. / Master of Science
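The Mantel-Haenszel procedure named above can be sketched in a few lines: it pools 2x2 (group by correct/incorrect) tables across ability strata into a common odds ratio, where values far from 1.0 signal uniform DIF. This is an illustrative sketch, not the thesis's code; the function name and the counts are hypothetical.

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio pooled over ability strata for one item.

    Each stratum is a 2x2 table (a, b, c, d):
      a = reference group correct,  b = reference group incorrect,
      c = focal group correct,      d = focal group incorrect.
    A value near 1.0 suggests no uniform DIF; values far from 1.0
    suggest the item favors one group after matching on ability.
    """
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty strata
        num += a * d / n
        den += b * c / n
    return num / den

# Hypothetical item: three ability strata of 100 examinees each.
tables = [(40, 10, 30, 20), (30, 20, 20, 30), (20, 30, 10, 40)]
print(mantel_haenszel_odds_ratio(tables))  # 2.5: item favors the reference group
```

The companion chi-square test and the log-odds-ratio effect size used in operational DIF work are both derived from these same pooled tables.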
2. Evaluation of two types of Differential Item Functioning in factor mixture models with binary outcomes
Lee, Hwa Young (doctor of educational psychology), 22 February 2013
Differential Item Functioning (DIF) occurs when examinees with the same ability have different probabilities of endorsing an item. Conventional DIF detection methods (e.g., the Mantel-Haenszel test) can be used to detect DIF only across observed groups, such as gender or ethnicity. However, research has found that DIF is not typically fully explained by an observed variable (e.g., Cohen & Bolt, 2005). The true source of DIF may be unobserved, involving variables such as personality, response patterns, or unmeasured background variables.
The Factor Mixture Model (FMM) is designed to detect unobserved sources of heterogeneity in factor structures, and an FMM with binary outcomes has recently been used for assessing DIF (DeMars & Lau, 2011; Jackman, 2010). However, FMMs with binary outcomes have not been thoroughly explored as a means of detecting both between-class latent DIF (LDIF) and class-specific observed DIF (ODIF).
The present simulation study investigated whether correctly specifying LDIF and/or ODIF in the model, as compared to misspecifying either, influences the performance of model fit indices (AIC, BIC, aBIC, and CAIC) and entropy. In addition, the study examined the recovery of item difficulty parameters and the proportion of replications in which items were correctly or incorrectly identified as displaying DIF, manipulating DIF effect size and latent class probability. For each simulation condition, two latent classes of 27 item responses were generated to fit a one-parameter logistic model, with item difficulties generated to exhibit DIF across the classes and/or the observed groups.
Results showed that FMMs with binary outcomes performed well in terms of fit indices, entropy, DIF detection, and recovery of large DIF effects. When class probabilities were unequal with small DIF effects, performance decreased for fit indices, power, and the recovery of DIF effects compared to equal class probability conditions. Inflated Type I errors were found for invariant DIF items across simulation conditions. When data were generated to fit a model having ODIF but estimated LDIF, specifying LDIF in the model fully captured ODIF effects when DIF effect sizes were large. / text
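The fit indices compared in this study (AIC, BIC, aBIC, CAIC) are all penalized transformations of the maximized log-likelihood. A sketch of the standard textbook formulas (not code from the dissertation; the function name and numbers are illustrative):

```python
import math

def fit_indices(log_lik, k, n):
    """Standard information criteria for model comparison; lower is better.

    log_lik: maximized log-likelihood, k: free parameters, n: sample size.
    """
    neg2ll = -2.0 * log_lik
    return {
        "AIC":  neg2ll + 2 * k,
        "BIC":  neg2ll + k * math.log(n),
        "aBIC": neg2ll + k * math.log((n + 2) / 24),  # sample-size-adjusted BIC
        "CAIC": neg2ll + k * (math.log(n) + 1),       # consistent AIC
    }

# Hypothetical mixture model fit to n = 1000 binary response vectors.
print(fit_indices(log_lik=-500.0, k=10, n=1000))
```

Because the penalties grow at different rates (aBIC is the mildest, CAIC the harshest), the four criteria can disagree about which class solution to retain, which is exactly why simulation studies like this one compare them.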
3. A Comparison of Methods for Detecting Differential Item Functioning: An Examination Using BILOG-MG and SIBTEST (特異項目機能検出方法の比較 : BILOG-MGとSIBTESTを用いた検討)
KUMAGAI, Ryuichi; WAKITA, Takafumi, 25 December 2003
Uses content digitized by the National Institute of Informatics (国立情報学研究所).
4. Using Differential Functioning of Items and Tests (DFIT) to Examine Targeted Differential Item Functioning
O'Brien, Erin L., January 2014
No description available.
5. Differential Item Functioning on the International Personality Item Pool's Neuroticism Scale
McBride, Nadine LeBarron, 29 December 2008
As use of the public-domain International Personality Item Pool (IPIP) scales has grown significantly over the past decade (Goldberg, Johnson, Eber, Hogan, Ashton, Cloninger, & Gough, 2006), research on the psychometric properties of the items and scales has become increasingly important. This study examines the IPIP scale constructed to measure the Five Factor Model (FFM) domain of Neuroticism (as measured by the NEO-PI-R) for differential functioning at both the item and test level, by gender and across three age ranges, using the DFIT framework (Raju, van der Linden, & Fleer, 1993). The study found six items that displayed differential item functioning by gender and three items that displayed differential item functioning by age. No differential functioning at the test level was found. Items demonstrating DIF and implications for potential scale revision are discussed. / Ph. D.
6. Predictive Modeling of Uniform Differential Item Functioning Preservation Likelihoods After Applying Disclosure Avoidance Techniques to Protect Privacy
Lemons, Marlow Q., 4 April 2014
The need to publish and disseminate data continues to grow, and administrators of large-scale educational assessments should provide examinee microdata in addition to publishing assessment reports. Before doing so, disclosure avoidance methods are applied to the data to protect examinee privacy, while attempting to preserve as many of the items' statistical properties as possible. When important properties like differential item functioning are lost to these disclosure avoidance methods, the microdata can convey misleading messages about the test's effectiveness in measuring its construct. In this research study, I investigated the preservation of differential item functioning in a large-scale assessment after disclosure avoidance methods have been applied to the data. After applying data swapping to protect the data, I attempted to empirically model and explain the likelihood of preserving various levels of differential item functioning as a function of several factors, including the data swapping rate, the reference-to-focal group ratio, the type of item scoring, and the level of DIF prior to data swapping. / Ph. D.
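Data swapping, the disclosure avoidance method studied here, exchanges the value of a quasi-identifier between randomly paired records, so marginal distributions survive while record-level linkage is broken. A minimal sketch of the general idea, not the dissertation's actual algorithm; the field names, records, and swap rate are hypothetical:

```python
import random

def swap_records(records, field, rate, rng=None):
    """Swap `field` between randomly chosen pairs of records.

    `rate` is the fraction of records taking part in a swap; the
    multiset of values (and every other field) is left unchanged.
    """
    rng = rng or random.Random(0)
    out = [dict(r) for r in records]   # leave the input untouched
    k = int(len(out) * rate) // 2 * 2  # round down to an even count of records
    chosen = rng.sample(range(len(out)), k)
    for i, j in zip(chosen[::2], chosen[1::2]):
        out[i][field], out[j][field] = out[j][field], out[i][field]
    return out

# Hypothetical examinee microdata: swap the gender flag at a 50% rate.
examinees = [{"id": i, "gender": g} for i, g in enumerate("MFMFMF")]
swapped = swap_records(examinees, "gender", rate=0.5)
```

Because only pairings change, aggregate gender counts are preserved exactly, but per-examinee relationships between gender and item responses are perturbed, which is precisely what can erode DIF estimates.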
7. The Impact of Differential Item Functioning of MCAS Mathematics Exams on Immigrant Students and Communities
Suarez Munist, Octavio Nestor, 2011
Thesis advisor: Walt Haney / Migration is now a major component of globalization. The combination of better economic opportunities and lower fertility rates in developed nations suggests that the current migratory wave will last for many decades to come (United Nations Population Fund, 2007). In the U.S., immigration over the last thirty years has significantly changed the face of the workforce and the classroom. At the state level, Massachusetts has been one of the top immigrant-receiving states in the Union. Since the 1990s, Massachusetts has been implementing a policy of standardized testing for accountability and graduation. The Massachusetts Comprehensive Assessment System (MCAS) is a set of standardized, norm-referenced tests administered to comply with the test-based accountability provisions of the federal No Child Left Behind legislation (NCLB). Used today for high-stakes decisions such as NCLB accountability as well as high school graduation requirements, MCAS has raised a number of validity concerns. Differential item functioning analysis, a technique to statistically identify potentially biased items in tests, has not been used to challenge the validity of the tests, although it can provide new insights into test bias that were not previously available. This dissertation investigates the presence of differential item functioning in MCAS between native and immigrant students. It identifies one test, the 2008 Grade 3 MCAS Mathematics test, as having a significant number of items exhibiting differential functioning, and compares the original test version to a purified version with these items removed. The purified version results in larger test score improvements for immigrants as well as other non-mainstream students. These score changes are large enough to affect the determination of NCLB-based performance status for many schools and districts that are comparatively poorer and more diverse than the average.
While the lack of more precise data on immigrants and other characteristics of the data set reduces the definiteness of the results, there is ample cause for concern about the presence of differential item functioning-based bias on MCAS, and a need to further study this phenomenon as NCLB-based accountability determinations impact a growing number of schools, districts, and communities. / Thesis (EdD) — Boston College, 2011. / Submitted to: Boston College. Lynch School of Education. / Discipline: Educational Research, Measurement, and Evaluation.
8. A Comparison of Adjacent Categories and Cumulative DSF Effect Estimators
Gattamorta, Karina Alvarez, 18 December 2009
The study of measurement invariance in polytomous items that targets individual score levels is known as differential step functioning (DSF; Penfield, 2007, 2008). DSF methods provide specific information describing the manifestation of the invariance effect within particular score levels and therefore serve a diagnostic role in identifying the individual score levels involved in the item's invariance effect. The analysis of DSF requires the creation of a set of dichotomizations of the item response variable. There are two primary approaches for creating the set of dichotomizations to conduct a DSF analysis. The first approach, known as the adjacent categories approach, is consistent with the dichotomization scheme underlying the generalized partial credit model (GPCM; Muraki, 1992) and considers each pair of adjacent score levels while treating the other score levels as missing. The second approach, known as the cumulative approach, is consistent with the dichotomization scheme underlying the graded response model (GRM; Samejima, 1997) and includes data from every score level in each dichotomization. To date, there is limited research on how the cumulative and adjacent categories approaches compare within the context of DSF, particularly as applied to a real data set. The understanding of how the interpretation and practical outcomes may vary given these two approaches is also limited. The current study addressed these two issues. This study evaluated the results of a DSF analysis using both the adjacent categories and cumulative dichotomization schemes in order to determine if the two approaches yield similar results and interpretations of DSF. These approaches were applied to data from a polytomously scored alternate assessment administered to children with significant cognitive disabilities. The results of the DSF analyses revealed that the two approaches generally led to consistent results, particularly in the case where DSF effects were negligible. 
For steps where significant DSF was present, the two approaches generally guided analysts to the same location within the item. However, several aspects of the results raised questions about the use of the adjacent categories dichotomization scheme. First, the adjacent categories method seemed to lack independence, since large DSF effects at one step were often paired with large DSF effects in the opposite direction at the previous step. Additionally, when a substantial DSF effect existed, it was more likely to be flagged as significant by the cumulative approach than by the adjacent categories approach, likely because the cumulative approach's smaller standard errors give it greater stability. In sum, the results indicate that the cumulative approach is preferable to the adjacent categories approach when conducting a DSF analysis.
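The two dichotomization schemes compared in this study can be made concrete. For step j of a polytomous item, the adjacent-categories (GPCM-style) scheme keeps only responses at levels j-1 and j, while the cumulative (GRM-style) scheme retains every response, which is why it yields smaller standard errors. An illustrative sketch; the function names and data are hypothetical, not from the study:

```python
def adjacent_dichotomize(scores, step):
    """Adjacent-categories (GPCM-style) scheme: code `step` as 1 and
    `step - 1` as 0; all other score levels are treated as missing."""
    return [1 if s == step else 0 if s == step - 1 else None for s in scores]

def cumulative_dichotomize(scores, step):
    """Cumulative (GRM-style) scheme: every response is retained and
    coded 1 if it reaches `step`, else 0."""
    return [1 if s >= step else 0 for s in scores]

responses = [0, 1, 2, 3, 2, 1]        # one polytomous item, six examinees
adjacent_dichotomize(responses, 2)    # [None, 0, 1, None, 1, 0]
cumulative_dichotomize(responses, 2)  # [0, 0, 1, 1, 1, 0]
```

A step-level DIF statistic (e.g., Mantel-Haenszel) applied to each dichotomized variable then yields one DSF effect estimate per step of the item.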
9. Controlling Type I Error Rate in Evaluating Differential Item Functioning for Four DIF Methods: Use of Three Procedures for Adjustment of Multiple Item Testing
Kim, Jihye, 25 October 2010
In DIF studies, a Type I error refers to the mistake of identifying a non-DIF item as a DIF item, and a Type I error rate refers to the proportion of Type I errors in a simulation study. The possibility of making a Type I error in DIF studies is always present, and a high probability of making such an error can weaken the validity of the assessment. The quality of a test is therefore tied to its Type I error rate and to how that rate is controlled. Existing DIF studies have found that the Type I error rate can be affected by several factors, such as test length, sample size, test group size, group mean difference, group standard deviation difference, and the underlying model. This study focused on a further, largely unexamined factor: the effect of multiple testing. DIF analysis conducts multiple significance tests across the items of a test, and such multiple testing increases the possibility of making at least one Type I error. The main goal of this dissertation was to investigate how to control the Type I error rate using adjustment procedures for multiple testing that are widely used in applied statistics but rarely used in DIF studies. In the simulation study, four DIF methods were performed under a total of 36 testing conditions: the Mantel-Haenszel method, the logistic regression procedure, the Differential Functioning of Items and Tests (DFIT) framework, and Lord's chi-square test. The Bonferroni correction, Holm's procedure, and the Benjamini-Hochberg (BH) method were then applied as adjustments for multiple significance testing. The results of this study showed the effectiveness of the three adjustment procedures in controlling the Type I error rate.
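The three adjustment procedures compared in the dissertation have standard adjusted-p-value forms: Bonferroni multiplies each p-value by the number of tests m, Holm applies a step-down m, m-1, ... multiplier, and BH applies a step-up false-discovery-rate correction. A sketch of these textbook formulas (not the dissertation's code; the p-values are hypothetical):

```python
def bonferroni(pvals):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def holm(pvals):
    """Step-down procedure: the i-th smallest p is multiplied by m - i + 1,
    enforcing monotonicity with a running maximum."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj, running_max = [0.0] * m, 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[i])
        adj[i] = min(1.0, running_max)
    return adj

def benjamini_hochberg(pvals):
    """Step-up FDR procedure: the i-th smallest p is multiplied by m / i,
    enforcing monotonicity with a running minimum from the largest p down."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adj, running_min = [0.0] * m, 1.0
    for rank, i in enumerate(order):
        running_min = min(running_min, pvals[i] * m / (m - rank))
        adj[i] = running_min
    return adj

# Four hypothetical per-item DIF significance tests.
pvals = [0.01, 0.04, 0.03, 0.005]
```

Bonferroni and Holm both control the familywise error rate, with Holm uniformly at least as powerful; BH controls the false discovery rate and is the most liberal of the three, which matches their usual power ordering in DIF applications.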
10. Deriving an executive behaviour screener from the Behavior Assessment System for Children - 2: applications to adolescent hockey players with and without concussions
Wong, Ryan, 8 January 2018
Objective: Executive functions govern our ability to navigate complex and novel situations in day-to-day life. There is increased interest in environmental influences that may cause changes to executive functioning. The current thesis involves two studies examining the derivation and performance of an executive behaviour screener from the Behavior Assessment System for Children (BASC-2 PRS; Reynolds & Kamphaus, 2004) in two different adolescent samples, using a previously derived four-factor model of executive functioning (Garcia-Barrera et al., 2011, 2013).
Participants and Methods: Study 1. BASC-2 PRS standardization data consisting of a demographically matched American sample of 2722 12-21 year olds was obtained. The screener was derived using 25 items assigned a priori to the executive factors. Confirmatory factor analysis (CFA), invariance testing, and multiple indicators multiple causes (MIMIC) models were used to evaluate the screener. Study 2. The screener was applied to a previously collected sample of 479 elite adolescent hockey players from Canada with or without a history of concussion, followed through a single season of play. CFA, invariance testing, and MIMIC models were used to evaluate the screener, and the hockey sample was compared to the standardization sample.
Results: Study 1. Acceptable-to-good reliability was obtained for all factors (α = .75-.89). The four-factor model was the best fit to the data (CFI = .990, TLI = .989, RMSEA = .037). Configural, metric, and scalar, but not latent mean, invariance was shown for sex. Age-related uniform differential item functioning (DIF) and SES-related uniform and non-uniform DIF were shown. Standardized norms for use in clinical settings were created. Study 2. Acceptable-to-good reliability was shown for three factors (α = .72-.85); Emotional Control showed poor reliability (α = .58). The four-factor model was the best fit to the data (CFI = .991, TLI = .990, RMSEA = .026). Configural, metric, and scalar, but not latent mean, invariance was shown between the two samples. Uniform and non-uniform DIF were not observed for those with an increasing number of past concussions.
Conclusions: Findings support the four-factor model measured through the screener in adolescence. Females and hockey players demonstrated fewer executive behaviour problems overall. Sex, age, and SES may influence the interpretation of factor scores. Continued exploration and development of the screener is suggested. / Graduate / 2018-09-27
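The CFI, TLI, and RMSEA values reported above are all functions of the model and baseline chi-square statistics. A sketch of the conventional formulas, for orientation only; the thesis presumably obtained these from its SEM software, and the numbers below are hypothetical:

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation; ~.05 or below is close fit."""
    return math.sqrt(max(0.0, (chi2 - df) / (df * (n - 1))))

def cfi(chi2_m, df_m, chi2_b, df_b):
    """Comparative fit index of a model against its independence baseline."""
    d_m = max(chi2_m - df_m, 0.0)
    d_b = max(chi2_b - df_b, d_m)
    return 1.0 if d_b == 0 else 1.0 - d_m / d_b

def tli(chi2_m, df_m, chi2_b, df_b):
    """Tucker-Lewis index (non-normed fit index)."""
    return ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1.0)

# Hypothetical chi-squares for a fitted model and its baseline.
rmsea(100.0, 50, 401)         # 0.05
cfi(120.0, 100, 1000.0, 100)  # ~0.978
```

RMSEA rewards parsimony directly through df, while CFI and TLI measure improvement over the worst-case baseline, which is why the thesis reports all three alongside invariance tests.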