161

Assessing adaptation equivalence in cross-lingual and cross-cultural assessment using linear structural equations models

Purwono, Urip 01 January 2004 (has links)
Making a test available in more than one language version has become a common practice in the fields of psychology and education. When comparisons are to be made between the populations taking the parent and the adapted versions of a test, the equivalence of the constructs measured by the two versions must be established. Structural equation modeling (SEM) offers a unified approach for examining equivalence between the parent and adapted language versions of a test by examining the equivalence of the constructs measured by the two versions. While these procedures have the potential to yield more direct information about whether the original and adapted versions of an assessment instrument are equivalent, no study investigating the power and Type I error rate of the procedures in the context of adaptation equivalence is yet available. The present study is an attempt to fill this void. Three separate simulation studies were conducted to evaluate the effectiveness of the SEM approach for investigating test adaptation equivalence. The first study investigated the accuracy of the estimation procedure. The second study investigated the Type I error rate of the procedure in identifying invariance in the parameters across two subgroups. The third study investigated the power of the procedure in identifying differences in mean (kappa) and structural (lambda) parameters across two subgroups. The results of the first study indicated that the kappa and lambda parameters could be recovered with a sufficient degree of accuracy with sample sizes on the order of 500. The Type I error rates for the kappa and lambda parameters were similar; with sample sizes larger than 500, they approached the nominal levels. The power of the procedure in detecting differences increased with sample size and with the magnitude of the difference in the parameters between the subgroups. For the kappa parameters, a sample size of 600 was required to detect a difference of .35 standardized units with a probability of .75. For the lambda parameters, a difference of .2 in factor loading was detectable with a sample size of 300 with a probability of .9.
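As a rough illustration of the kind of Monte Carlo study described above, the sketch below simulates two groups from a one-factor model and estimates the rejection rate of a test for a latent mean (kappa) difference: with delta = 0 the rate estimates the Type I error, and with delta > 0 it estimates power. It is a simplification, not the study's procedure: a composite-score t-test stands in for a full SEM invariance test, and the loadings, sample sizes, and replication counts are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def simulate_group(n, loadings, latent_mean, rng):
    """One-factor model: x_ij = lambda_j * eta_i + e_ij, with eta ~ N(latent_mean, 1)."""
    eta = rng.normal(latent_mean, 1.0, size=n)
    uniq = np.sqrt(1.0 - loadings**2)          # unique SDs for standardized items
    e = rng.normal(0.0, 1.0, size=(n, loadings.size)) * uniq
    return np.outer(eta, loadings) + e

def rejection_rate(delta, n=500, reps=1000, alpha=0.05):
    """Proportion of replications in which a two-sample t-test on composite
    (sum) scores flags a latent mean difference of size delta."""
    loadings = np.array([0.7, 0.7, 0.6, 0.6, 0.5])   # assumed, not from the study
    hits = 0
    for _ in range(reps):
        g1 = simulate_group(n, loadings, 0.0, rng).sum(axis=1)
        g2 = simulate_group(n, loadings, delta, rng).sum(axis=1)
        if stats.ttest_ind(g1, g2).pvalue < alpha:
            hits += 1
    return hits / reps

print("Type I error (delta = 0):   ", rejection_rate(0.0))
print("Power (delta = .35, n = 600):", rejection_rate(0.35, n=600))
```

Because the composite test pools items rather than estimating the full model, its rejection rates will differ from those reported in the abstract; the structure of the experiment (vary delta and n, count rejections) is the part being illustrated.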
162

THE FIT OF EMPIRICAL DATA TO TWO LATENT TRAIT MODELS

HUTTEN, LEAH R 01 January 1981 (has links)
The study explored the fit of empirical data to the Rasch and three-parameter logistic latent trait models, focusing on the relationship between deviations from latent trait model assumptions and fit. The study also investigated estimation precision under small-sample and short-test conditions and evaluated parameter estimation costs for the two latent trait models. Rasch and three-parameter abilities and item parameters were estimated for twenty-five 40-item tests having 1000 examinees. These estimated parameters were substituted for true parameters to make predictions about number-correct score distributions using a theorem by Lord (1980) relating ability to the conditional distribution of number-correct scores. Predicted score distributions were compared with observed score distributions by Kolmogorov-Smirnov and chi-square measures, and with graphical techniques. The importance of three latent trait model assumptions (unidimensionality, equality of item discrimination indices, and no guessing) was assessed with correlation analyses. Estimation precision for short 20-item tests and for small samples of 250 examinees was evaluated with correlation methods and by assessing average absolute differences between estimates. Simple summary statistics were gathered to evaluate computer cost and time for parameter estimation with each model. The Rasch and three-parameter models both demonstrated close fit to the majority of data studied: eighty percent of tests fit both models quite well, and only one predicted test distribution deviated significantly from the observed score distribution. Results obtained with the chi-square measure were less favorable toward the models than the Kolmogorov-Smirnov assessments had been; this outcome was attributed to the apparent sensitivity of the chi-square statistic to lack of normality in score distributions. Graphical results clearly supported the statistical measures of fit, leading to the conclusion that latent trait models adequately describe empirical test data. Overall, the Rasch model fit the data as well as the three-parameter model: the average K-S statistic for the 25 tests was 1.304 for the Rasch model and 1.289 for the three-parameter model. The latter model fit the data better than the Rasch model for 65 percent of the tests, yet the differences in fit between the models were not significant, and the chi-square measure and graphical tests supported these results. Lack of unidimensionality was the primary cause of misfit of data to the models; correlations between fit statistics and indices of unidimensionality were significant at the .05 probability level for both models. When item discrimination parameters were unequal, the fit of data to both models was impaired, and when guessing was present (though not well estimated even in samples of 1000), the fit of data to both latent trait models tended to be distorted. Ability estimates from short 20-item tests were quite precise, especially for the Rasch model: correlations between ability estimates from the 20-item and longer tests were .923 for the Rasch estimates and .866 for the three-parameter estimates. Difficulty estimates made from small 250-examinee samples were also quite precise, but estimates of the other item parameters from small samples tended not to be very accurate. Although small-sample item discrimination estimates were reasonable, estimates of the guessing parameter were very poor. The results suggest that at least 1000 examinees are required to obtain precise estimates with the three-parameter model.
The average cost for estimating Rasch item parameters and abilities was only $12.50 for 1000 examinees in contrast to $35.12 for the three-parameter model, but when item parameters were known in advance, and only abilities estimated, these cost differences disappeared.
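The prediction step described above, generating a number-correct score distribution from estimated item parameters at a fixed ability, can be carried out with the compound-binomial recursion usually attributed to Lord and Wingersky. The sketch below is a minimal illustration under assumed 3PL item parameters (the five items shown are hypothetical, not drawn from the study); the marginal predicted distribution would then be obtained by averaging these conditional distributions over the estimated ability distribution and compared with the observed distribution via a Kolmogorov-Smirnov statistic.

```python
import numpy as np

def p_3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic probability of a correct response."""
    return c + (1 - c) / (1 + np.exp(-D * a * (theta - b)))

def score_distribution(p):
    """Lord-Wingersky recursion: distribution of number-correct scores,
    given per-item success probabilities p at a fixed ability theta."""
    dist = np.array([1.0])              # P(0 correct) = 1 before any items
    for pj in p:
        new = np.zeros(dist.size + 1)
        new[:-1] += dist * (1 - pj)     # item answered incorrectly
        new[1:]  += dist * pj           # item answered correctly
        dist = new
    return dist

# Illustrative item parameters for a short 5-item test (hypothetical).
a = np.array([1.0, 0.8, 1.2, 0.9, 1.1])
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
c = np.array([0.2, 0.2, 0.2, 0.2, 0.2])

probs = p_3pl(theta=0.0, a=a, b=b, c=c)
print(score_distribution(probs))        # probabilities for scores 0..5, sums to 1
```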
163

An empirical comparison of the Bookmark and modified Angoff standard setting methods and the impact on student classification

Hauger, Jeffrey B 01 January 2007 (has links)
No Child Left Behind has increased the importance of properly classifying students into performance level categories because of the ramifications of failing to make Adequate Yearly Progress (AYP). States have the opportunity to create their own standards and conduct their own standard setting sessions. Two of the more popular methods used are the Angoff method and the Bookmark method. Reckase (2005) simulated both methods and found that the Bookmark method produced negatively biased cutscores while the Angoff method did not produce any bias. This study simulated the Angoff and Bookmark methods following Reckase (2005) and added a second simulated Bookmark method intended to model the Bookmark procedure more accurately. The study included six independent variables: standard setting method, cutscores, central tendency, number of panelists, item density, and bookmark placement. The second part of the study applied the results of the simulations to real data to determine the impact on student classification under the different conditions. Overall, the results of the simulation study indicated that the simulated Angoff method recovered the parameters extremely well, while the second simulated Bookmark method recovered the item parameters better than the original simulated Bookmark method; in certain conditions, the second simulated Bookmark method recovered the item parameters as well as the Angoff method. The simulated cutscores were then used to place students into performance level categories broken out by students' ethnicity, gender, socioeconomic status, and their interactions. The results indicated that the simulated Angoff method and the second simulated Bookmark method were most similar when the median was used as the central tendency for the Bookmark method and the panelists' error was large. The simulated Angoff method was more robust than either of the two simulated Bookmark methods. The implications and suggested future research are discussed.
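For readers unfamiliar with the Bookmark mechanics being simulated: items are ordered by difficulty into an ordered item booklet, and a panelist's bookmark is translated into an ability cutscore at the point where the bookmarked item is answered correctly with a specified response probability (commonly RP67). The sketch below inverts the 3PL item response function to find these RP67 locations; the item parameters and bookmark placement are hypothetical, and the mapping is a generic textbook version rather than the specific procedure simulated in this study.

```python
import numpy as np

def rp_theta(a, b, c, rp=0.67, D=1.7):
    """Ability at which a 3PL item is answered correctly with probability rp.
    Derived by inverting p = c + (1 - c) / (1 + exp(-D a (theta - b)))."""
    return b - np.log((1 - c) / (rp - c) - 1) / (D * a)

# Hypothetical item parameters for a 5-item ordered item booklet.
a = np.array([0.9, 1.1, 1.0, 1.3, 0.8])
b = np.array([-0.8, -0.2, 0.3, 0.9, 1.4])
c = np.array([0.15, 0.20, 0.10, 0.25, 0.20])

locations = rp_theta(a, b, c)       # RP67 location of each item
order = np.argsort(locations)       # booklet order, easiest to hardest
bookmark = 3                        # bookmark placed after the third item
cutscore = locations[order][bookmark - 1]   # RP67 location of that item
print("Ordered RP67 locations:", np.round(locations[order], 2))
print("Theta cutscore:", round(cutscore, 2))
```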
164

Exploring the impact of teachers' participation in an assessment-standards alignment study

Martone, Andrea 01 January 2007 (has links)
This study explored the impact of teachers' participation in an assessment-standards alignment study as a way to gain a deeper understanding of an assessment, the underlying standards, and how these components relate to the participants' approach to instruction. Alignment research is one means of demonstrating the connection between assessment, standards, and instruction. If these components work together to deliver a consistent message about the topics on which students are taught and assessed, students will have the opportunity to learn and demonstrate their acquired knowledge and skills. Six participants applied Norman Webb's alignment methodology to determine the degree of alignment between an assessment, the Massachusetts Adult Proficiency Test for Math (MAPT for Math), and state standards, the Massachusetts Adult Basic Education Curriculum Framework for Mathematics and Numeracy (Math ABE standards). Through item-objective matches, alignment was examined in terms of categorical concurrence, depth-of-knowledge consistency, range-of-knowledge correspondence, and balance of representation. The study also used observations, discussions, open-response survey questions, and a focus group discussion to understand how the alignment process influenced the participants' views of the assessment, the standards, and their approach to instruction. Results indicated that the MAPT for Math is well aligned to the Math ABE standards on three of the four dimensions, and specific recommendations for improvements to the MAPT for Math and the Math ABE standards are presented. The study also found that the alignment process influenced the participants' views of the standards, the assessment, and their approach to instruction, and it highlighted ways to improve the alignment process to make the results more meaningful for teachers and test developers. This study demonstrated the value of ensuring that an assessment is well aligned to the standards on which it is based, and the value added when teachers are involved in an in-depth examination of an assessment and those standards. Teachers are the conduit through which the next generation is guided; thus it is critical that teachers understand what they are being asked to teach their students and how that can be assessed on a well-designed assessment.
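Two of the four Webb indices named above reduce to simple arithmetic over the item-objective matches, which the sketch below illustrates. The formulas shown (categorical concurrence as a minimum of six matched items per standard, and balance of representation as 1 - (Σ|1/O - I_k/H|)/2) follow common presentations of Webb's method; the matches are hypothetical, and depth-of-knowledge consistency and range-of-knowledge correspondence would need the raters' DOK judgments as additional inputs.

```python
from collections import Counter

def balance_of_representation(item_objective_matches):
    """Webb's balance index for one standard:
    1 - (sum_k |1/O - I_k/H|) / 2, where O = objectives hit,
    I_k = items matched to objective k, H = total matched items."""
    counts = Counter(item_objective_matches)
    O = len(counts)
    H = sum(counts.values())
    return 1 - sum(abs(1 / O - i_k / H) for i_k in counts.values()) / 2

def categorical_concurrence(item_objective_matches, min_items=6):
    """Webb's categorical concurrence: a standard is considered adequately
    assessed when at least min_items items are matched to it."""
    return len(item_objective_matches) >= min_items

# Hypothetical matches under one standard: each entry is the objective
# to which a rater matched an item.
matches = ["obj1", "obj1", "obj2", "obj2", "obj2", "obj3", "obj3", "obj4"]
print("Balance of representation:", round(balance_of_representation(matches), 3))
print("Categorical concurrence met:", categorical_concurrence(matches))
```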
165

A comparison of item response theory true score equating and item response theory-based local equating

Keller, Robert R. 01 January 2007 (has links)
The need to compare students across different test administrations, or across different test forms within the same administration, plays a key role in most large-scale testing programs. To make such comparisons, the tests must be placed on the same scale. Placing test forms onto the same scale not only allows results from different test forms to be compared to each other but also facilitates placing them onto a common reporting scale. The statistical method used to place test scores onto a common metric is called equating. Estimated true equating, one of the conditional equating methods described by van der Linden (2000), has been shown to be a dramatic improvement over classical equipercentile equating under some conditions (van der Linden, 2006). The purpose of this study is to compare, through a simulation study, the performance of estimated true equating with that of IRT true score equating under a variety of conditions known to affect equating accuracy: anchor test length, data misfit, scaling method, and examinee ability distribution. The results are evaluated on the root mean squared error (RMSE) and bias of the equating functions, as well as on decision accuracy when placing examinees into performance categories. A secondary research question, the relative performance of the scaling methods, is also investigated. The results indicate that estimated true equating shows tremendous promise, with dramatically lower bias and RMSE values than IRT true score equating. However, this promise does not bear out for examinee classification. Despite the lack of significant gains in decision accuracy, this new equating method reduces the error attributable to the equating functions themselves and therefore deserves further scrutiny. The results fail to indicate a clear choice of scaling method for use with either equating method; practitioners must still rely on the growing body of evidence and consider the nature of their own testing programs and the abilities of their examinee populations when choosing a scaling method.
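As background for the comparison above, IRT true score equating maps a number-correct score on form X to the ability implied by form X's test characteristic curve (TCC), then reads off the expected score at that ability on form Y's TCC. The sketch below shows this mapping under assumed 2PL item parameters for two short forms already on a common scale; the parameters are hypothetical, and an operational implementation would also handle scores outside the TCC's attainable range.

```python
import numpy as np

def tcc(theta, a, b, D=1.7):
    """Test characteristic curve: expected number-correct score at theta (2PL)."""
    return np.sum(1 / (1 + np.exp(-D * a * (theta - b))))

def invert_tcc(score, a, b, lo=-6.0, hi=6.0, tol=1e-6):
    """Find theta with tcc(theta) == score by bisection (the TCC is monotone)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if tcc(mid, a, b) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical parameters for two 5-item forms on a common scale.
aX, bX = np.array([1.0, 0.9, 1.1, 1.2, 0.8]), np.array([-1.0, -0.3, 0.2, 0.7, 1.2])
aY, bY = np.array([0.9, 1.0, 1.2, 1.0, 1.1]), np.array([-0.8, -0.2, 0.3, 0.8, 1.0])

x = 3.0                             # number-correct score on form X
theta = invert_tcc(x, aX, bX)       # ability implied by x on form X
y = tcc(theta, aY, bY)              # equated true score on form Y
print(f"Form X score {x} -> theta {theta:.3f} -> form Y score {y:.3f}")
```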
166

Developing a monitoring and evaluation framework for the mentoring component of the Principals Academy Trust

Kölzer, Joshua 15 March 2023 (has links) (PDF)
The Principals Academy Trust (PAT) is a non-profit organisation whose programme focuses on improving the leadership competencies of school principals in poor and marginalised communities in South Africa, largely through mentoring. While PAT collects quantitative data to monitor the performance of the schools in its programme, no data is currently collected to measure the extent to which PAT's mentoring efforts are positively impacting the schools' culture or climate. This study sought to develop an M&E framework for the mentoring component of the PAT programme. The M&E framework is informed by an extensive programme theory evaluation and is designed to enable PAT to monitor and evaluate potential changes in school climate and culture resulting from the mentoring component of the programme. For the purposes of this study, Donaldson's step-by-step model for conducting a programme theory evaluation was merged with Markiewicz & Patrick's step-by-step guide on how to develop an M&E framework; the aim of this approach was to ensure that the results of the theory evaluation provide the foundation for the M&E framework. A combination of desk research, focus groups with PAT's programme staff, and semi-structured interviews with the head mentor and the fundraising consultant for PAT was used to create the M&E framework. The results of the theory evaluation indicate that, according to recent social science research in the field of education, the causality assumed in PAT's programme theory is plausible: it is plausible to assume positive impacts on school culture and climate through systematic mentoring of school principals. The results of the development of the M&E framework are presented as a complete monitoring plan, evaluation plan, data collection and management plan, data analysis and synthesis strategy, learning strategy, and implementation plan.
167

The Essentials of Program Evaluation at a Geriatric Day Hospital

Dutton, Tanya L. 03 1900 (has links)
1 volume
168

A Study of the Impact of Funding on Growth and Development of Selected Schools and Colleges of Allied Health

Dwyer, Kathleen Marie January 1983 (has links)
No description available.
169

An evaluation of the new twelve-year school plan for South Carolina

Hopson, Raymond W. January 1947 (has links)
No description available.
170

The outcomes and impact of school based evaluation

Groves, Robin Clive, n/a January 1983 (has links)
This study concerns school-based evaluation: evaluation of a school, or some aspect of its operation, carried out by the teachers and other interested members of the school community. When the decision to evaluate and the control of the evaluation are at the school level, a complex, dynamic situation is created: the teachers in the school concurrently have roles as evaluators and as those being evaluated, as well as continuing in their other normal teaching roles. The history of educational evaluation in the United States of America, the United Kingdom and Australia is traced. An outline is given of developments in the more traditional methods based on measurement of achievement of objectives on the one hand, and on the 'informed judgement of experts' on the other. It is suggested that improvements in both methods have led to a constructive method of evaluation with its roots in both traditions. Some checklists and guidelines for planning evaluations are also reviewed. Interviews were carried out in an A.C.T. high school which had completed an evaluation almost a year earlier. Twenty people were interviewed: teachers, parents and a member of the Evaluation and Research Section of the A.C.T. Schools Office, all of whom had been involved with or affected by the evaluation. If an evaluation is initiated and controlled at the school level, many new complexities are introduced, and the process of the evaluation becomes of paramount importance. The way the evaluation is initiated and planned, the way information is collected and analysed, and the way decisions are arrived at are uppermost in participants' minds. The early stages are probably the most crucial in establishing the climate and structure for the evaluation and in developing participants' skills. The effects on staff relationships, staff/parent relationships and the general climate of the school are what participants are most aware of. There are usually outcomes of a school-based evaluation arising from recommendations, but these are often more subtle than those of a traditional evaluation by outsiders, and changes may occur during the evaluation rather than at the end, after the presentation of a report, as was more traditionally the case. There is a place for school-based evaluation in Australian schools, but it should be recognised as a complex process which may involve participants in new roles in an extremely dynamic situation.
