
Concept mapping in evaluation practice and theory: A synthesis of current empirical research.

Rizzo Michelin, Linda L. January 1998
Concept mapping is a conceptualization process that individuals and groups can use to develop conceptual frameworks to guide evaluations and planning (Trochim, 1989). In research, these processes display individual and group representations of concepts about particular domains, illustrating potential relationships among them (Miles, 1994). Cognitive mapping processes involve the acquisition, storage, access, and utilization of spatial knowledge (Golledge, 1986). Empirical research using concept mapping technology has proliferated within the past fifteen years, and investigation of this research reveals wide variation in both the domains of inquiry and the applications of concept mapping. Using non-traditional meta-analytic techniques employed in prior reviews by Cousins and associates (Cousins, 1996; Cousins & Earl, 1992; Cousins & Leithwood, 1986; Ross, in press) and others (e.g., Leithwood & Montgomery, 1982), the empirical studies are examined for their relevance to evaluation theory and practice, with emphasis on variations in the concept mapping process and its use in evaluation. This study provides researchers and evaluators with a valuable empirical basis from which to make choices about the selection and application of concept mapping.

Effects of pre-testing commercial pesticide applicators prior to engaging in a short adult education activity

Hlatky, Robert M. January 1973
The purposes of this study were to determine the relationships of participants' socio-economic characteristics to post-test scores, to investigate the effects of pre-testing in a short-term adult education programme, and to assess the influence of pre-course use of the handbooks on pre-test and post-test scores. The study was carried out on a group of 324 commercial pesticide applicators who attended 16 individual short courses conducted in 1972 by the British Columbia Department of Agriculture (BCDA) as a means of upgrading the participants' knowledge of the safe and proper use of pesticides. The design was a modification of the pre-test/post-test control group type, with 135 individuals assigned to the treatment condition and 189 to the control. Three hypotheses were tested. The hypothesis of primary concern was whether pre-testing the participants significantly improved their post-test scores. A second hypothesis concerned whether a relationship existed between the socio-economic variables and post-test scores. A final hypothesis concerned whether the intensity of pre-course handbook use significantly influenced pre-test and post-test mean scores. No differential effect due to significant treatment-control differences was observed in the variables: area of origin, proportion of salary earned from pesticide application, previous attendance at BCDA-sponsored short courses, previous attendance at related non-BCDA short courses, and number of pesticide application certificates held. The control group was significantly older, had resided longer in Canada, and had more experience as pesticide applicators than the treatment group; the effect of each of these characteristics on the post-test was negligible because of its low correlation with post-test scores.
Three variables (previous attendance at BCDA-sponsored short courses, previous attendance at related non-BCDA short courses, and number of pesticide application certificates held) exhibited significantly high mutual inter-correlation, indicating that they measured a common factor such as a need to participate. Both educational level and pre-test scores significantly influenced the post-test mean score, although the influence of the latter was markedly more pronounced. The intensity of handbook use positively influenced the post-test mean score only for participants who received no pre-test, indicating that the pre-test was a better means of improving the post-test mean score than pre-course distribution of the handbooks. / Education, Faculty of / Graduate

The construction of a criterion-referenced physical education knowledge test

Wilson, Gail E. January 1980
Throughout the last two decades, physical educators have worked to develop a specific body of knowledge. Associated with its formation has been a trend by most physical educators to include a cognitive objective among the stated aims of their physical education curricula. As a result, the need for adequate knowledge assessment instruments has become apparent. Although some assessment of knowledge in physical and health education has occurred since the late 1920s, the majority of tests developed to date are directed toward the evaluation of knowledge in specific sports or activities. Relatively few tests assess general knowledge concepts in physical education, and all of the knowledge tests produced so far are norm-referenced instruments; that is, they have been constructed for the purpose of ranking individuals and comparing differences among them. The purpose of this study was to design a criterion-referenced test that would assess the physical education knowledge of grade eleven high school students in British Columbia and could function as a measurement instrument for the evaluation of groups or classes. As a criterion-referenced assessment tool, the knowledge test assesses the performance of individuals against objectives previously formulated by the Learning Assessment Branch of the Ministry of Education in British Columbia. To prepare a table of specifications for the design of the test, the specific objectives to be measured were grouped into six subtest areas, and multiple-choice items were then constructed according to the requirements of the table of specifications. For the initial pilot administration, two test forms of 48 items each were developed, each covering three of the six subtest areas.
One half of the 288 students in the first pilot answered Form A while the remaining students answered Form B. Following the administration of pilot test 1, the results were analysed with the Laboratory of Educational Research Test Analysis Package (LERTAP) and subjectively reviewed by an advisory panel. As a result of these procedures, 70 items were retained for the second pilot test. This test was administered to 133 students and the results were again analysed subjectively and psychometrically. Thirty-eight items from pilot test 2 were considered acceptable for the final pilot test. To maintain adherence to the table of specifications, nine new items were developed and, after approval by the advisory panel, included on the third test form. This form was given to 800 grade eleven students and the responses of 250 randomly selected students were analysed by the LERTAP procedure. The analysis indicated that all items were psychometrically sound, and the reliability of this form was estimated at .71. The items used in the third pilot administration thus constituted the final form of the knowledge test. The test is suitable for evaluating groups, and the six subtests, as well as the total test, can be used to identify strengths and weaknesses within programs. / Education, Faculty of / Kinesiology, School of / Graduate
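The reliability of .71 reported for the final form is the kind of internal-consistency index a package like LERTAP computes for dichotomously scored items. As a minimal sketch (not the study's data or code), KR-20, the dichotomous special case of coefficient alpha, can be computed from a small invented response matrix:

```python
# KR-20 (Kuder-Richardson 20) internal-consistency reliability for a
# dichotomously scored test. The response matrix below is invented for
# illustration; it is not the study's data.

def kr20(responses):
    """responses: list of examinee vectors (1 = correct, 0 = incorrect)."""
    n_items = len(responses[0])
    n = len(responses)
    # item difficulties p_j; each item contributes p_j * (1 - p_j)
    p = [sum(r[j] for r in responses) / n for j in range(n_items)]
    pq_sum = sum(pj * (1 - pj) for pj in p)
    # variance of total scores
    totals = [sum(r) for r in responses]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    return (n_items / (n_items - 1)) * (1 - pq_sum / var_t)

# tiny illustrative data set: 6 examinees x 5 items
data = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
]
reliability = kr20(data)
```

With real pilot data the same computation would be run over the full response matrix for each form.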

Mitigating the Effects of Test Anxiety through a Relaxation Technique Called Sensory Activation

Abbott, Marylynne 15 January 2017
Test anxiety is a phenomenon that has been researched for decades. Student performance, goal attainment, and personal lives are all negatively affected by its multiple factors. This quantitative study was designed to determine whether a particular relaxation technique, called sensory activation, could mitigate the symptoms and effects of test anxiety. The Test and Examination Anxiety Measure, developed by Brooks, Alshafei, and Taylor (2015), was used to measure test anxiety levels before and after implementation of the sensory activation relaxation technique. Two research questions guided the study, using not only the overall test anxiety score from the instrument but also its five subscale scores. After collection and analysis of data, the results for research question one indicated a statistically significant reduction in mean levels of overall test anxiety. Findings for research question two also showed significant decreases in the worry and state anxiety subscale scores. Considering that the sensory activation relaxation technique was used during the examination period, it is reasonable to assume its effectiveness would be limited to lowering state anxiety levels rather than trait anxiety levels. Results from prompt 10 of the Test and Examination Anxiety Measure (Brooks et al., 2015) also indicated that the technique could serve as a possible deterrent to the “going blank” problem described anecdotally by students. Instructors could introduce the sensory activation relaxation technique prior to the first testing event in a course, producing the desired outcomes of better test performance and less anxiety.
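Pre/post comparisons of mean anxiety levels like the one reported here are commonly summarized with a paired-samples t statistic. A minimal sketch with invented scores (not the study's data or its instrument):

```python
# Paired-samples t statistic for a pre/post comparison. The anxiety
# scores below are invented for illustration only.
from math import sqrt

pre  = [38, 45, 52, 41, 60, 47, 55, 43]   # hypothetical pre-intervention scores
post = [30, 40, 45, 39, 50, 42, 49, 41]   # hypothetical post-intervention scores

diffs = [a - b for a, b in zip(pre, post)]
n = len(diffs)
mean_d = sum(diffs) / n
# sample standard deviation of the differences (n - 1 denominator)
sd_d = sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
t_stat = mean_d / (sd_d / sqrt(n))        # compare to t critical value, df = n - 1
```

A t of this size would exceed the two-tailed .05 critical value of 2.365 at df = 7, mirroring the "statistically significant reduction" pattern described in the abstract.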

Maintenance of vertical scales under conditions of item parameter drift and Rasch model-data misfit

O'Neil, Timothy P 01 January 2010
With scant research to draw upon regarding the maintenance of vertical scales over time, decisions about the creation and performance of vertical scales necessarily suffer from a lack of information. Undetected item parameter drift (IPD) presents one of the greatest threats to scale maintenance within an item response theory (IRT) framework. There is also an outstanding question as to the utility of the Rasch model as a viable framework for establishing and maintaining vertical scales, even though this model is currently used for scaling many state assessment systems. Most criticisms of the Rasch model in this context have not involved simulation, and most have not acknowledged conditions in which the model may function well enough to justify its use in vertical scaling. To address these questions, vertical scales were created from real data using the Rasch and 3PL models. Ability estimates were then generated to simulate a second (Time 2) administration. These simulated data were placed onto the base vertical scales using a horizontal linking approach with a mean-mean transformation. To examine the effects of IPD on vertical scale maintenance, several conditions of IPD were simulated within each set of linking items. To evaluate the viability of the Rasch model in a vertical scaling context, data were generated and calibrated at Time 2 within each model (Rasch and 3PL) as well as across models (Rasch data generation/3PL calibration, and vice versa). Results for the first question demonstrate that the effect of IPD on vertical scale maintenance is directly related to the percentage of drifting linking items and to the magnitude and direction of the drift.
With respect to the viability of the Rasch model in a vertical scaling context, results suggest the model is entirely viable when it is appropriate for the data. It is equally clear that where the data involve varying discrimination and guessing, use of the Rasch model is inappropriate.
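The mechanics of mean-mean linking make the threat from undetected IPD easy to see: any drift in the common items leaks directly into the linking constant. A small sketch with hypothetical Rasch item difficulties (not the study's simulated conditions):

```python
# How undetected item parameter drift (IPD) biases a mean-mean linking
# constant under the Rasch model. Difficulties and drift are hypothetical.
from math import exp

def rasch_p(theta, b):
    """Rasch probability of a correct response to an item of difficulty b."""
    return 1.0 / (1.0 + exp(-(theta - b)))

base_b = [-1.5, -0.5, 0.0, 0.5, 1.5, 2.0]   # Time-1 linking-item difficulties
drift  = [ 0.0,  0.0, 0.0, 0.0, 0.4, 0.4]   # two items drift harder by 0.4 logits
time2_b = [b + d for b, d in zip(base_b, drift)]

# mean-mean linking: constant = mean(Time-2 b) - mean(Time-1 b).
# With no drift this is 0; drift feeds straight into the constant.
link_constant = sum(time2_b) / len(time2_b) - sum(base_b) / len(base_b)

# every Time-2 ability placed on the base scale shifts by this amount,
# so bias grows with the proportion and magnitude of drifting items
bias = link_constant
```

Here 2 of 6 linking items drifting by 0.4 logits shifts every linked ability by about 0.13 logits, illustrating the dependence on drift percentage, magnitude, and direction reported above.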

Using a mixture IRT model to understand English learner performance on large-scale assessments

Shea, Christine A 01 January 2013
The purpose of this study was to determine whether an eighth-grade state-level math assessment contained items that function differentially (DIF) for English Learner (EL) students as compared to English Only (EO) students and, if so, what factors might have caused the DIF. Differential item functioning analysis was employed, and a mixture item response theory model (MIRTM) was then fit to determine why items functioned differentially for EL examinees. Several additional methods were used to examine which item-level factors may have caused ELs difficulty: an item review by a linguist to identify potentially difficult item characteristics; multiple linear regression to test whether the identified characteristics predicted an item's chi-squared value; and distractor analysis to determine whether certain answer choices were more attractive to ELs. Logistic regression was performed for each item to test whether the student background variables of poverty, first language, and EL status predicted item correctness. The DIF analysis using Lord's chi-squared test identified four items as having meaningful DIF (>0.2) under the range-null hypothesis. Of those, two items favoring the EO population assessed the Data Analysis, Statistics and Probability strand of the state math standards, and two items favoring the EL population assessed the Number Sense and Operations strand. Item length, as judged in the item review, was implicated for several of the items identified as showing DIF. The mixture IRT model was run under three conditions, and under all three the overall latent class groupings did not match the manifest EO and EL groups.
To probe further into the latent class groupings, the student background variables of poverty, language proficiency status, and first language spoken were compared to the groupings. These comparisons did not show that the background variables better explained the latent class groupings.
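As an illustration of a DIF screen, here is the Mantel-Haenszel procedure, a standard alternative to the Lord's chi-squared and logistic-regression methods used in the study; the score-stratified counts are invented:

```python
# Mantel-Haenszel DIF screen: examinees are matched on total score, then
# right/wrong counts are compared between reference (EO) and focal (EL)
# groups within each stratum. All counts below are hypothetical.
import math

# each stratum: (ref_right, ref_wrong, focal_right, focal_wrong)
strata = [
    (40, 20, 25, 35),   # low-score stratum
    (55, 15, 35, 25),   # middle stratum
    (70, 10, 50, 15),   # high-score stratum
]

# Mantel-Haenszel common odds ratio across strata
num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
mh_odds_ratio = num / den          # > 1: item favors the reference (EO) group

# ETS delta scale: negative delta means the item favors the reference group
mh_delta = -2.35 * math.log(mh_odds_ratio)
```

With these invented counts the item favors the EO group even after matching on total score, the same directional pattern reported for two of the flagged items.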

Effect of automatic item generation on ability estimates in a multistage test

Colvin, Kimberly F 01 January 2014
In adaptive testing, including multistage adaptive testing (MST), the psychometric properties of the test items are needed to route examinees through the test. However, if testing programs use items that are automatically generated at the time of administration, there is no opportunity to calibrate the items; their psychometric properties must therefore be predicted. This simulation study evaluates the accuracy with which examinees' abilities can be estimated when automatically generated items, specifically item clones, are used in MSTs. The behavior of the clones was modeled on the results of Sinharay and Johnson's (2008) investigation of item clones administered in an experimental section of the Graduate Record Examination (GRE). As more clones were incorporated, or when the clones varied greatly from their parent items, examinees' abilities were estimated less accurately. However, a number of conditions were promising: for example, on a 600-point scale, the absolute bias was less than 10 points for most examinees when all items were simulated as clones with small variation from their parents, or when all first-stage items had moderate variation from their parents and no second-stage items were cloned.
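The direction of the ability-estimation bias described above can be sketched in a simpler setting than the study's MST design: under the Rasch model, if clones are systematically harder than the parent items whose parameters are used for scoring, the ability estimate is pulled down by the size of the shift. Parent difficulties and the perturbation are hypothetical:

```python
# Bias in a Rasch ability estimate when clone difficulties differ from
# the parent values used for scoring. All parameters are hypothetical.
from math import exp

def p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + exp(-(theta - b)))

parent_b = [-1.0, -0.5, 0.0, 0.5, 1.0]
shift = 0.3                                  # clones uniformly harder by 0.3 logits
clone_b = [b + shift for b in parent_b]
true_theta = 0.0

# expected raw score when the examinee answers the (harder) clones ...
expected_score = sum(p(true_theta, b) for b in clone_b)

# ... but is scored as if the parent difficulties applied: solve
# sum p(theta, parent_b) = expected_score by bisection
lo, hi = -4.0, 4.0
for _ in range(60):
    mid = (lo + hi) / 2
    if sum(p(mid, b) for b in parent_b) < expected_score:
        lo = mid
    else:
        hi = mid
theta_hat = (lo + hi) / 2
bias = theta_hat - true_theta                # negative: harder clones depress the estimate
```

Because the drift here is uniform, the bias equals the shift exactly; with clone variation around the parent values, as in the study, the bias instead accumulates with the number of clones and the size of their variation.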

A Validity Study of the American School Counselor Association (ASCA) National Model Readiness Self-Assessment Instrument

McGannon, Wendy 01 January 2007
School counseling has great potential to help students achieve to high standards in the academic, career, and personal/social aspects of their lives (House & Martin, 1998). With the advent of No Child Left Behind (NCLB, 2001), the role of the school counselor is beginning to change. In response to the challenges and pressures to implement standards-based educational programs, the American School Counselor Association released “The ASCA National Model: A Framework for School Counseling Programs” (ASCA, 2003). The ASCA National Model was designed with an increased focus on accountability and on the use of data to make decisions and increase student achievement. It is intended to ensure that all students are served by the school counseling program by using student data to advocate for equity, to facilitate student improvement, and to provide strategies for closing the achievement gap. The purpose of this study was to investigate the psychometric properties of an instrument designed to assess school districts' readiness to implement the ASCA National Model. Data were gathered from 693 respondents to a web-based version of the ASCA National Model Readiness Self-Assessment Instrument. Confirmatory factor analysis did not support the structure of the seven-factor model. Exploratory factor analysis produced a three-factor model, which was supported by confirmatory factor analyses after variable parcels were created within each of the three factors. Based on the item loadings within each factor, factor one was labeled School Counselor Characteristics, factor two District Conditions, and factor three School Counseling Program Supports. Cross-validation of this model with an independent sample of 363 respondents to the ASCA Readiness Instrument provided additional evidence for the three-factor model.
The results of these analyses will be used to give school districts more concise score-report information about the changes needed to support implementation of the ASCA National Model. They also provide evidence to support the interpretation of scores obtained from the ASCA Readiness Instrument.

Meta-Analysis of Factor Analyses: Comparison of Univariate and Multivariate Approaches Using Correlation Matrices and Factor Loadings

Unknown Date
Increasingly, sophisticated techniques such as factor analysis are applied in primary research and thus may need to be meta-analyzed. This topic has received little attention in the past because of its complexity. Because factor analysis is becoming more popular in many areas of research, including education, social work, and social science, the study of methods for the meta-analysis of factor analyses is also becoming more important. The first main purpose of this dissertation is to compare the results of seven approaches to the meta-analysis of confirmatory factor analyses: five based on univariate meta-analysis methods, and two that use multivariate meta-analysis to obtain factor loadings and their standard errors. The results from each approach are compared. Given that factor analyses are common in many areas, the second purpose is to identify appropriate approaches for the meta-analysis of factor analyses, especially confirmatory factor analysis (CFA). When the average sample size was small, the IRD, WMC, WMFL, and GLS-MFL approaches estimated parameters better than the UMC, MFL, and GLS-MC approaches. With large average sample sizes (above 150), parameter estimation was similar across all seven approaches. Based on the simulation results, researchers who want to conduct meta-analytic confirmatory factor analysis can apply any of these approaches to synthesize results from primary studies if those studies have n > 150. / A Dissertation submitted to the Department of Educational Psychology and Learning Systems in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Summer Semester 2015. / June 9, 2015. 
/ factor analysis, meta-analysis, multivariate meta-analysis, univariate meta-analysis / Includes bibliographical references. / Betsy J. Becker, Professor Directing Dissertation; Fred Huffer, University Representative; Insu Paek, Committee Member; Yanyun Yang, Committee Member.
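Univariate approaches of the kind compared above typically pool each correlation across studies before factoring the synthesized matrix. A minimal sketch of that pooling step for a single correlation, using Fisher's z transformation with invented study results:

```python
# Inverse-variance pooling of one correlation across studies via
# Fisher's z -- the per-cell step behind univariate meta-analysis of
# correlation matrices. Study correlations and sample sizes are invented.
from math import atanh, tanh, sqrt

studies = [(0.42, 120), (0.35, 200), (0.50, 90), (0.38, 160)]  # (r, n)

# Fisher z transform of each r; var(z) = 1 / (n - 3), so weight = n - 3
zs = [(atanh(r), n - 3) for r, n in studies]
w_sum = sum(w for _, w in zs)
z_bar = sum(z * w for z, w in zs) / w_sum

pooled_r = tanh(z_bar)       # back-transform to the correlation metric
se_z = 1 / sqrt(w_sum)       # standard error of the pooled z
```

Repeating this for every cell yields a synthesized correlation matrix that a CFA can then be fit to; the multivariate approaches instead model the dependence among the cells jointly.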

Comparison of kernel equating and item response theory equating methods

Meng, Yu 01 January 2012
The kernel method of test equating is a unified approach with some advantages over traditional equating methods. It is therefore important to evaluate comprehensively the usefulness and appropriateness of kernel equating (KE), as well as its advantages and disadvantages relative to several popular item response theory (IRT) equating techniques. The purpose of this study was to evaluate the accuracy and stability of KE and IRT true-score equating by manipulating several common factors known to influence equating results. Three equating methods (kernel post-stratification equating, Stocking-Lord, and mean/sigma) were compared against an established equating criterion. A wide variety of conditions were simulated to match realistic situations reflecting differences in sample size, anchor test length, and group ability differences. The systematic and random error of equating were summarized with bias statistics and the standard error of equating (SEE) and compared across methods, and the better-performing methods under specific conditions were identified using the root mean squared error (RMSE). As expected, equating error generally decreased across all methods as the number of anchor items and the sample size increased. Aside from method effects, group differences in ability produced the greatest impact on equating error in this study. The accuracy and stability of each equating method depended on the portion of the score-scale range where comparisons were made. Overall, kernel equating was more stable in most situations but not as accurate as IRT equating for the conditions studied. The interactions between pairs of factors investigated here seemed to be more influential and beneficial for IRT equating than for KE.
Further practical recommendations were suggested for future study: for example, using alternate methods of data simulation to remove the advantage of the IRT equating methods.
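Of the IRT linking methods compared, mean/sigma is the simplest to illustrate: it rescales new-form item difficulties so their mean and standard deviation match the old form's anchor values. A sketch with hypothetical anchor difficulties, plus the RMSE summary used to compare methods against a criterion:

```python
# Mean/sigma linear linking of anchor-item difficulties from a new-form
# calibration onto the old-form scale, with an RMSE summary. All
# difficulty values below are hypothetical.
from math import sqrt

old_b = [-1.2, -0.4, 0.1, 0.8, 1.6]   # anchor difficulties, old-form calibration
new_b = [-1.0, -0.1, 0.5, 1.3, 2.2]   # same anchors, new-form calibration

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

# mean/sigma transformation from the new scale to the old scale:
# A matches the spreads, B matches the means
A = sd(old_b) / sd(new_b)
B = mean(old_b) - A * mean(new_b)
rescaled = [A * b + B for b in new_b]

# RMSE of the rescaled anchors against their old-form values
rmse = sqrt(mean([(r - o) ** 2 for r, o in zip(rescaled, old_b)]))
```

In an equating study the same RMSE idea is applied to equated scores against the criterion equating rather than to the anchor parameters themselves.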
