  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Assessing Invariance of Factor Structures and Polytomous Item Response Model Parameter Estimates

Reyes, Jennifer McGee 2010 December 1900 (has links)
The purpose of the present study was to examine the invariance of the factor structure and item response model parameter estimates obtained from a set of 27 items selected from the 2002 and 2003 forms of Your First College Year (YFCY). The first major research question of the present study was: How similar/invariant are the factor structures obtained from two datasets (i.e., identical items, different people)? The first research question was addressed in two parts: (1) Exploring factor structures using the YFCY02 dataset; and (2) Assessing factorial invariance using the YFCY02 and YFCY03 datasets. After using exploratory and confirmatory factor analysis for ordered data, a four-factor model using 20 items was selected based on acceptable model fit for the YFCY02 and YFCY03 datasets. The four factors (constructs) obtained from the final model were: Overall Satisfaction, Social Agency, Social Self Concept, and Academic Skills. To assess factorial invariance, partial and full factorial invariance were examined. The four-factor model fit both datasets equally well, meeting the criteria for partial and full measurement invariance. The second major research question of the present study was: How similar/invariant are person and item parameter estimates obtained from two different datasets (i.e., identical items, different people) for the homogeneous graded response model (Samejima, 1969) and the partial credit model (Masters, 1982)? To evaluate measurement invariance using IRT methods, the item discrimination and item difficulty parameters obtained from the GRM need to be equivalent across datasets. The YFCY02 and YFCY03 GRM item discrimination parameters (slope) correlation was 0.828. The YFCY02 and YFCY03 GRM item difficulty parameters (location) correlation was 0.716. The correlations and scatter plots indicated that the item discrimination parameter estimates were more invariant than the item difficulty parameter estimates across the YFCY02 and YFCY03 datasets.
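The invariance check described above amounts to correlating GRM item parameter estimates from the two administrations. A minimal sketch of that comparison, using hypothetical slope estimates (the study's actual values are not reproduced here; it reports r = 0.828 for slopes and r = 0.716 for locations):

```python
import numpy as np

# Hypothetical GRM discrimination (slope) estimates for the same items
# calibrated separately on the YFCY02 and YFCY03 samples.
a_02 = np.array([1.2, 0.9, 1.5, 1.1, 0.8])
a_03 = np.array([1.1, 1.0, 1.4, 1.2, 0.9])

# A correlation near 1 suggests the parameter estimates are invariant
# across the two datasets; lower values suggest drift.
r_slope = np.corrcoef(a_02, a_03)[0, 1]
```

The same computation applied to location parameters, plus a scatter plot of one calibration against the other, mirrors the study's evaluation strategy.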
32

Item and person parameter estimation using hierarchical generalized linear models and polytomous item response theory models

Williams, Natasha Jayne. January 2003 (has links)
Thesis (Ph. D.)--University of Texas at Austin, 2003. / Vita. Includes bibliographical references. Available also from UMI Company.
33

A New Item Response Theory Model for Estimating Person Ability and Item Parameters for Multidimensional Rank Order Responses

Seybert, Jacob 01 January 2013 (has links)
The assessment of noncognitive constructs poses a number of challenges that set it apart from traditional cognitive ability measurement. Of particular concern is the influence of response biases and response styles that can influence the accuracy of scale scores. One strategy to address these concerns is to use alternative item presentation formats (such as multidimensional forced choice (MFC) pairs, triads, and tetrads) that may provide resistance to such biases. A variety of strategies for constructing and scoring these forced-choice measures have been proposed, though they often require large sample sizes, are limited in the way that statements can vary in location, and (in some cases) require a separate precalibration phase prior to the scoring of forced-choice responses. This dissertation introduces new item response theory models for estimating item and person parameters from rank-order responses indicating preferences among two or more alternatives representing, for example, different personality dimensions. Parameters for this new model, called the Hyperbolic Cosine Model for Rank order responses (HCM-RANK), can be estimated using Markov chain Monte Carlo (MCMC) methods that allow for the simultaneous evaluation of item properties and person scores. The efficacy of the MCMC parameter estimation procedures for these new models was examined via three studies. Study 1 was a Monte Carlo simulation examining the efficacy of parameter recovery across levels of sample size, dimensionality, and approaches to item calibration and scoring. It was found that estimation accuracy improves with sample size, and trait scores and location parameters can be estimated reasonably well in small samples. Study 2 was a simulation examining the robustness of trait estimation to error introduced by substituting subject matter expert (SME) estimates of statement location for MCMC item parameter estimates and true item parameters.
Only small decreases in accuracy relative to the true parameters were observed, suggesting that using SME ratings of statement location for scoring might be a viable short-term way of expediting MFC test deployment in field settings. Study 3 was included primarily to illustrate the use of the newly developed IRT models and estimation methods with real data. An empirical investigation comparing validities of personality measures using different item formats yielded mixed results and raised questions about multidimensional test construction practices that will be explored in future research. The presentation concludes with a discussion of MFC methods and potential applications in educational and workforce contexts.
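The MCMC machinery the dissertation relies on can be illustrated in miniature. The sketch below is not the HCM-RANK model itself: it is a generic Metropolis sampler recovering a single item-location parameter under a plain Rasch likelihood, with hypothetical abilities and responses, to show the accept/reject loop that such estimation builds on.

```python
import math
import random

random.seed(1)  # reproducible chain

# Hypothetical person abilities and their 0/1 responses to one item.
thetas = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
responses = [0, 0, 1, 1, 1, 1]

def log_lik(b):
    """Rasch log-likelihood of the responses given item location b."""
    ll = 0.0
    for t, x in zip(thetas, responses):
        p = 1 / (1 + math.exp(-(t - b)))
        ll += math.log(p if x else 1 - p)
    return ll

b, chain = 0.0, []
for _ in range(2000):
    prop = b + random.gauss(0, 0.5)          # symmetric random-walk proposal
    delta = log_lik(prop) - log_lik(b)
    if delta >= 0 or random.random() < math.exp(delta):
        b = prop                              # Metropolis accept step
    chain.append(b)

# Posterior mean after discarding the first 500 draws as burn-in.
posterior_mean = sum(chain[500:]) / len(chain[500:])
```

The real models add multiple dimensions and rank-order likelihoods, but the sample-propose-accept structure is the same.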
34

Item and person parameter estimation using hierarchical generalized linear models and polytomous item response theory models

Williams, Natasha Jayne 27 July 2011 (has links)
Not available / text
35

Nonparametric item response modeling for identifying differential item functioning in the moderate-to-small-scale testing context

Witarsa, Petronilla Murlita 11 1900 (has links)
Differential item functioning (DIF) can occur across age, gender, ethnic, and/or linguistic groups of examinee populations. Therefore, whenever there is more than one group of examinees involved in a test, a possibility of DIF exists. It is important to detect items with DIF with accurate and powerful statistical methods. While finding a proper DIF method is essential, until now most of the available methods have been dominated by applications to large-scale testing contexts. Since the early 1990s, Ramsay has developed a nonparametric item response methodology and computer software, TestGraf (Ramsay, 2000). The nonparametric item response theory (IRT) method requires fewer examinees and items than other item response theory methods and was also designed to detect DIF. However, nonparametric IRT's Type I error rate for DIF detection had not been investigated. The present study investigated the Type I error rate of the nonparametric IRT DIF detection method, when applied to the moderate-to-small-scale testing context wherein there were 500 or fewer examinees in a group. In addition, the Mantel-Haenszel (MH) DIF detection method was included. A three-parameter logistic item response model was used to generate data for the two population groups. Each population corresponded to a test of 40 items. Item statistics for the first 34 non-DIF items were randomly chosen from the mathematics test of the 1999 TIMSS (Third International Mathematics and Science Study) for grade eight, whereas item statistics for the last six studied items were adopted from the DIF items used in the study of Muniz, Hambleton, and Xing (2001). These six items were the focus of this study.
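The Mantel-Haenszel procedure included in the study reduces to a common odds ratio pooled across total-score strata. A small sketch with hypothetical counts (per stratum: reference-group correct/incorrect, then focal-group correct/incorrect):

```python
# Each stratum holds (A, B, C, D): reference correct, reference
# incorrect, focal correct, focal incorrect. Counts are illustrative.
strata = [
    (40, 10, 35, 15),   # low total scores
    (60, 20, 55, 25),   # middle total scores
    (80, 5, 78, 7),     # high total scores
]

# Mantel-Haenszel common odds ratio: values near 1 suggest no DIF;
# values well above or below 1 flag the item for review.
num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
alpha_mh = num / den
```

In practice the ratio is usually reported on the delta scale (-2.35 times its natural log) with an accompanying chi-square test, but the pooled ratio above is the core statistic.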
36

Comparison of vertical scaling methods in the context of NCLB

Gotzmann, Andrea Julie Unknown Date
No description available.
37

Establishing the protocol validity of an electronic standardised measuring instrument / Sebastiaan Rothmann

Rothmann, Sebastiaan January 2009 (has links)
Over the past few decades, the nature of work has undergone remarkable changes, resulting in a shift from manual demands to mental and emotional demands on employees. In order to manage these demands and optimise employee performance, organisations use well-being surveys to guide their interventions. Because these interventions have significant financial implications, it is important to ensure the validity and reliability of the results. However, even if a validated measuring instrument is used, the problem remains that wellness audits might be reliable, valid and equivalent when the results of a group of people are analysed, but this cannot be guaranteed for each individual. It is therefore important to determine the validity and reliability of individual measurements (i.e. protocol validity). However, little information exists concerning the efficiency of different methods to evaluate protocol validity. The general objective of this study was to establish an efficient, real-time method/indicator for determining protocol validity in web-based instruments. The study sample consisted of 14 592 participants from several industries in South Africa and was extracted from a work-related well-being survey archive. A protocol validity indicator that detects random responses was developed and evaluated. It was also investigated whether Item Response Theory (IRT) fit statistics have the potential to serve as protocol validity indicators, and this was compared to the newly developed protocol validity indicator. The developed protocol validity indicator makes use of neural networks to predict whether cases have protocol validity. A neural network was trained on a large non-random sample and a computer-generated random sample. The neural network was then cross-validated to see whether posterior cases can be accurately classified as belonging to the random or non-random sample.
The neural network proved to be effective, detecting 86.39% of the random responses and 85.85% of the non-random responses correctly. Analyses of the misclassified cases demonstrated that the neural network was accurate, because non-random classified cases were in fact valid and reliable, while random classified cases showed a problematic factor structure and low internal consistency. Neural networks proved to be an effective technique for the detection of potentially invalid and unreliable cases in electronic well-being surveys. Subsequently, the protocol validity detection capability of IRT fit statistics was investigated. The fit statistics were calculated for the study population and for randomly generated data with a uniform distribution. In both the study population and the random data, cases with higher outfit statistics showed problems with validity and reliability. When compared to the neural network technique, the fit statistics suggested that the neural network was more effective in classifying non-random cases than it was in classifying random cases. Overall, the fit statistics proved to be effective indicators of protocol invalidity (rather than validity) provided that some additional measures be imposed. Recommendations were made for the organisation as well as with a view to future research. / Thesis (M.Sc. (Human Resource Management))--North-West University, Potchefstroom Campus, 2010.
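The outfit statistic the study examines is the mean of squared standardized residuals for a person's responses. A minimal sketch under a Rasch model, with hypothetical item difficulties and response patterns (the study's own instrument is polytomous and far larger):

```python
import math

def outfit(theta, difficulties, responses):
    """Person outfit mean-square: mean squared standardized residual.
    Values well above 1 flag unexpected (possibly random) responding."""
    z2 = []
    for b, x in zip(difficulties, responses):
        p = 1 / (1 + math.exp(-(theta - b)))  # P(correct | theta, b)
        z2.append((x - p) ** 2 / (p * (1 - p)))
    return sum(z2) / len(z2)

# A response pattern consistent with the model (easy items right,
# hard items wrong) versus a reversed, model-violating pattern.
consistent = outfit(0.0, [-2, -1, 0, 1, 2], [1, 1, 1, 0, 0])
misfit = outfit(0.0, [-2, -1, 0, 1, 2], [0, 0, 0, 1, 1])
```

Flagging cases whose outfit exceeds a cutoff is the fit-statistic analogue of the study's neural-network classifier.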
38

Measurement of Stigma and Relationships Between Stigma, Depression, and Attachment Style Among People with HIV and People with Hepatitis C

Cabrera, Christine M. 19 December 2013 (has links)
This dissertation is composed of three studies that examined illness-related stigma, depressive symptoms and attachment style among patients living with HIV and Hepatitis C (HCV). The first study examined the psychometric properties of a brief HIV Stigma Scale (B-HSS) in a sample of adult patients living with HIV (PHA) (n=94). The second study developed and explored the psychometric properties of the HCV Stigma Scale in a sample of adult patients living with HCV (PHC) (n=92). Psychometric properties were evaluated with classical test theory and item response theory methodology. The third study explored whether illness-related stigma mediated the relationship between insecure attachment styles (anxious attachment or avoidant attachment) and depressive symptoms among PHA (n=72) and PHC (n=83). From June to December 2008, patients were recruited to participate in a questionnaire study at the outpatient clinics in The Ottawa Hospital. Findings indicated that the 9-item B-HSS is a reliable and valid measure of HIV stigma with items that are highly discriminatory, which indicates that items are highly effective at discriminating patients with different levels of stigma. The 9-item HCV Stigma Scale was also found to be reliable and valid, with highly discriminatory items that effectively differentiate PHC. Construct validity for both scales was supported by relationships with theoretically related constructs: depression and quality of life. Among PHA, when HIV stigma was controlled, the relationship between anxious attachment style and depression was not significant. However, the relationship between avoidant attachment style and depressive symptoms decreased but remained significant. Among PHC, when HCV stigma was controlled, the relationship between insecure attachment styles and depressive symptoms was not significant.
Dissertation results emphasize the importance of identifying patients experiencing illness-related stigma and the relevance of addressing stigma and attachment style when treating depressive symptoms among PHA and PHC.
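The mediation logic of the third study — does the attachment-to-depression path shrink once stigma is controlled? — can be sketched with simulated data. Everything below is hypothetical: the data are generated so that stigma fully carries the effect, which is the pattern the study found for anxious attachment among PHA.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
# Simulated variables: stigma depends on attachment, depression on stigma.
attachment = rng.normal(size=n)
stigma = 0.7 * attachment + rng.normal(scale=0.5, size=n)
depression = 0.8 * stigma + rng.normal(scale=0.5, size=n)

def slope(y, *xs):
    """OLS coefficient of the first predictor, with intercept."""
    X = np.column_stack([np.ones(n), *xs])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

total = slope(depression, attachment)            # path ignoring stigma
direct = slope(depression, attachment, stigma)   # path controlling stigma
```

When the mediator is controlled, the direct path collapses toward zero — the signature of full mediation that the dissertation reports for some, but not all, attachment-style/illness combinations.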
40

The effects of examinee motivation on multiple-choice item calibration and test construction.

Barneveld, Christina Van, January 2004 (has links)
Thesis (Ph. D.)--University of Toronto, 2004. / Adviser: Ross Traub.
