About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Meta-analysis of the predictive validity of Scholastic Aptitude Test (SAT) and American College Testing (ACT) scores for college GPA

Curabay, Muhammet 04 January 2017

College admission systems in the United States require the Scholastic Aptitude Test (SAT) and American College Testing (ACT) examinations. Although some sources suggest that SAT and ACT scores provide meaningful information about academic success, others disagree. The objective of this study was to determine whether the SAT and ACT exams have significant predictive validity for college success. The study examined the effectiveness of SAT and ACT scores for predicting college students' first-year GPA with a meta-analytic approach. Most of the studies were retrieved from the Academic Search Complete and ERIC databases and were published between 1990 and 2016. In total, 60 effect sizes were obtained from 48 studies. The average correlation between test score and college GPA was 0.36 (95% confidence interval: .32, .39) under a random effects model, indicating a significant positive relationship between exam score and college success. The moderators examined were publication status and exam type; no effect was found for publication status. A significant effect of exam type was found, with a slightly higher average correlation with college GPA for the SAT than for the ACT. No publication bias was found in the study.
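The pooled correlation reported above can be reproduced in form (not in value) with a standard random-effects calculation: Fisher-z transform each study's correlation, estimate the between-study variance, and pool with inverse-variance weights. The sketch below uses a DerSimonian-Laird estimator; the correlations and sample sizes are hypothetical placeholders, not the 60 effect sizes analyzed in the dissertation.

```python
import numpy as np

# Hypothetical per-study correlations and sample sizes (NOT the study's data).
r = np.array([0.41, 0.30, 0.35, 0.38, 0.28])
n = np.array([1200, 800, 450, 2000, 600])

# Fisher-z transform; the sampling variance of z is approximately 1/(n - 3).
z = np.arctanh(r)
v = 1.0 / (n - 3)

# DerSimonian-Laird estimate of between-study variance (tau^2).
w_fixed = 1.0 / v
z_fixed = np.sum(w_fixed * z) / np.sum(w_fixed)
Q = np.sum(w_fixed * (z - z_fixed) ** 2)
df = len(z) - 1
c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max(0.0, (Q - df) / c)

# Random-effects pooled estimate and 95% CI, back-transformed to r.
w_re = 1.0 / (v + tau2)
z_re = np.sum(w_re * z) / np.sum(w_re)
se_re = np.sqrt(1.0 / np.sum(w_re))
lo, hi = np.tanh(z_re - 1.96 * se_re), np.tanh(z_re + 1.96 * se_re)
print(f"pooled r = {np.tanh(z_re):.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```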
2

From Cribs to Crayons: A Study on the Use of Universal Curriculum and Assessment of Preschool Students and Teachers in the Classroom

Williams, Karen 01 December 2016

Current research indicates a correlation between participation in an early childhood program and a student's performance on later standardized measures, and points to the challenge of using early learning standards (Feldman, 2010). This study focused on state initiatives and on student participation in an early childhood preschool model centered on universal curriculum and assessment designed to measure student outcomes aligned to the learning targets outlined in state preschool curriculum standards. Research shows that learning decreases for students who have not participated in an early childhood program, while those who have participated in some kind of early childhood program show progress (Heckman, 2011). Young children come to school with varying degrees of experience, which may or may not enhance their learning. Educators are responsible for providing positive experiences and academic activities that develop academic awareness, social/emotional skills, and appropriate behavioral skills. Participation in preschool should also build a student's independence and competency skills. This study examined state initiatives, curriculum materials, and assessment tools related to the importance of early childhood education programming and teacher practices, and the impact of universal curriculum and assessment implemented in the classroom during the school year. It further explored teacher perspectives on educational programming, Louisiana's early childhood initiatives, and the use of universal curriculum and assessment in their classrooms.
3

Making test anxiety a laughing matter: A quantitative study

Repass, Jim T. 04 April 2017

Approaches to relieving test anxiety range from relaxation exercises to prescription medication. Humor can be a simple method of test anxiety relief. The current study was designed to determine whether humor, in the form of a cartoon placed on the splash page of an online exam, improved the test scores of students with high test anxiety. Two theories guided the research. The interference theory of Ralf Schwarzer and Matthias Jerusalem holds that students have difficulty separating competing thoughts during an exam. Malcolm Knowles's adult learning theory differentiates the learning of children and adults and explains how adults learn. A quasi-experimental quantitative design was used to examine the possible relationship between humor and test anxiety relief. The study sample comprised an equal number of students with high test anxiety and students with low test anxiety, with the low test anxiety group serving as the control group. A 2-sample t test was used to examine the relationship between the cartoon and the exam scores. Intended benefits of the study included: (a) students with test anxiety finding relief from test anxiety, (b) instructors achieving reliable assessments of students with test anxiety, and (c) confident, well-educated graduates. The results were the opposite of what was expected: the high test anxiety group did worse on the exam with the cartoon. The 2-sample t test showed a change of -6.222 points between the midterm and final exams for the high test anxiety group.
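The comparison described above rests on a two-sample t test between the high- and low-anxiety groups. Below is a minimal sketch of that kind of comparison, with hypothetical change scores standing in for the study's data.

```python
import numpy as np
from scipy import stats

# Hypothetical change scores (final minus midterm) for each group -- placeholders only.
high_anxiety = np.array([-12.0, -8.5, -4.0, -10.0, -6.5, -2.0, -9.0, -7.5])
low_anxiety = np.array([1.5, -0.5, 3.0, 2.0, 0.0, 4.5, 1.0, 2.5])

# Welch's two-sample t test (does not assume equal variances).
t_stat, p_value = stats.ttest_ind(high_anxiety, low_anxiety, equal_var=False)
print(f"mean change (high anxiety): {high_anxiety.mean():.3f}")
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```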
4

An Examination of the Impact of Residuals and Residual Covariance Structures on Scores for Next Generation, Mixed-Format, Online Assessments with the Existence of Potential Irrelevant Dimensions Under Various Calibration Strategies

Bukhari, Nurliyana 23 August 2017

In general, newer educational assessments are deemed more demanding challenges than students are currently prepared to face. Two types of factors may contribute to test scores: (1) factors or dimensions that are of primary interest to the construct or test domain; and (2) factors or dimensions that are irrelevant to the construct, causing residual covariance that may impede the assessment of psychometric characteristics and jeopardize the validity of the test scores, their interpretations, and intended uses. To date, researchers performing item response theory (IRT)-based simulation research in educational measurement have not been able to generate data that mirrors the complexity of real testing data, because of the difficulty of separating different types of errors from multiple sources and because of comparability issues across different psychometric models, estimators, and scaling choices.

Using the context of the next generation K-12 assessments, I employed a computer simulation to generate test data under six test configurations. Specifically, I generated tests that varied in the sample size of examinees, the degree of correlation between four primary dimensions, the number of items per dimension, and the discrimination levels of the primary dimensions. I also explicitly modeled potential nuisance dimensions in addition to the four primary dimensions of interest, with varying degrees of correlation when two nuisance dimensions were modeled. I used this approach for two purposes. First, I aimed to explore the effects of two calibration strategies on the structure of residuals of such complex assessments when the nuisance dimensions are not explicitly modeled during calibration and when tests differ in testing configuration. The two calibration models were a unidimensional IRT (UIRT) model and a multidimensional IRT (MIRT) model; both considered only the four primary dimensions of interest. Second, I wanted to examine the residual covariance structures as the six test configurations varied. The residual covariance in this case would indicate statistical dependencies due to unintended dimensionality.

I employed Luecht and Ackerman's (2017) expected response function (ERF)-based residuals approach to evaluate the performance of the two calibration models and to separate the bias-induced residuals from the other measurement errors. Their approach provides four types of residuals that are comparable across different psychometric models and estimation methods and hence are 'metric-neutral': (1) e0, the total residuals or total errors; (2) e1, the bias-induced residuals; (3) e2, the parameter-estimation residuals; and (4) e3, the estimated model-data fit residuals.

With regard to my first purpose, I found that the MIRT model tends to produce less estimation error than the UIRT model on average (e2 for MIRT is less than e2 for UIRT) and tends to fit the data better on average (e3 for MIRT is less than e3 for UIRT). With regard to my second purpose, my analyses of the correlations of the bias-induced residuals provide evidence of the large impact of the presence of a nuisance dimension, regardless of its magnitude. On average, the residual correlations increase with the presence of at least one nuisance dimension but tend to decrease with high item discriminations.

My findings highlight the need to consider the choice of calibration model, especially when there are both intended and unintended indications of multidimensionality in an assessment. Essentially, I applied a technique based on the ERF-based residuals approach (Luecht & Ackerman, 2017) that permits measurement errors (systematic or random) to be cleanly partitioned, understood, examined, and interpreted, in context and relative to difference-that-matters criteria, regardless of the choice of scaling, calibration models, and estimation methods. I conducted this work in the context of the complex reality of the next generation K-12 assessments and with adherence to established educational measurement standards (American Educational Research Association [AERA], American Psychological Association [APA], and National Council on Measurement in Education [NCME], 1999, 2014; International Test Commission [ITC], 2005a, 2005b, 2013a, 2013b, 2014, 2015).
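The simulation design above generates responses from several correlated primary dimensions plus nuisance dimensions that the calibration models ignore. The sketch below illustrates the general idea with a compensatory multidimensional 2PL data generator; the dimension counts, correlations, and loadings are arbitrary stand-ins, not the dissertation's actual conditions, and the ERF-based residual decomposition itself is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_examinees, n_primary, n_items = 1000, 4, 40

# Correlated primary abilities plus one nuisance dimension (uncorrelated here).
corr = np.full((n_primary, n_primary), 0.6) + 0.4 * np.eye(n_primary)
theta = rng.multivariate_normal(np.zeros(n_primary), corr, size=n_examinees)
nuisance = rng.normal(size=(n_examinees, 1))

# Simple structure: each item loads on one primary dimension; roughly a quarter
# of items also pick up a small loading on the nuisance dimension.
a = np.zeros((n_items, n_primary))
a[np.arange(n_items), np.arange(n_items) % n_primary] = rng.uniform(0.8, 2.0, n_items)
a_nuis = np.where(rng.random(n_items) < 0.25, 0.5, 0.0)
d = rng.normal(0, 1, n_items)  # item intercepts

# Compensatory MIRT response probabilities and dichotomous responses.
logits = theta @ a.T + nuisance * a_nuis + d
p = 1.0 / (1.0 + np.exp(-logits))
responses = (rng.random(p.shape) < p).astype(int)
print(responses.shape, responses.mean())
```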
5

Mitigating the Effects of Test Anxiety through a Relaxation Technique Called Sensory Activation

Abbott, Marylynne 15 January 2017

Test anxiety is a phenomenon that has been researched for decades. Student performance, goal attainment, and personal lives are all negatively affected by the multiple factors of test anxiety. This quantitative study was designed to determine whether a particular relaxation technique, called sensory activation, could mitigate the symptoms and effects of test anxiety. The Test and Examination Anxiety Measure, developed by Brooks, Alshafei, and Taylor (2015), was used to measure test anxiety levels before and after implementation of the sensory activation relaxation technique. Two research questions guided the study, using not only the overall test anxiety score from the instrument but also its five subscale scores. After collection and analysis of data, the results for research question one indicated a statistically significant decrease in mean levels of overall test anxiety. Not only were overall mean test anxiety levels lowered, but findings for research question two also showed significant decreases in the worry and state anxiety subscale scores. Considering that the sensory activation relaxation technique was used during the examination period, it is reasonable to assume its effectiveness would be limited to lowering state anxiety levels rather than trait anxiety levels. Also, results from prompt 10 of the Test and Examination Anxiety Measure (Brooks et al., 2015) indicated the sensory activation relaxation technique could serve as a possible deterrent to the "going blank" problem described anecdotally by students. Instructors could introduce the sensory activation relaxation technique to their students prior to the first testing event in a course, producing the desired outcomes of better test performance and less anxiety.
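The pre/post comparison described above can be illustrated with a paired t test on within-student change in the overall anxiety score; the scores below are hypothetical placeholders, not data from the study.

```python
import numpy as np
from scipy import stats

# Hypothetical pre- and post-intervention test anxiety scores (placeholders).
pre = np.array([42, 38, 51, 47, 33, 45, 40, 36, 49, 44], dtype=float)
post = np.array([35, 36, 44, 40, 30, 41, 37, 35, 42, 39], dtype=float)

# Paired t test on within-student change in overall test anxiety.
t_stat, p_value = stats.ttest_rel(pre, post)
print(f"mean decrease = {(pre - post).mean():.2f}, t = {t_stat:.3f}, p = {p_value:.4f}")
```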
6

Maintenance of vertical scales under conditions of item parameter drift and Rasch model-data misfit

O'Neil, Timothy P 01 January 2010

With scant research to draw upon with respect to the maintenance of vertical scales over time, decisions around the creation and long-term performance of vertical scales necessarily suffer from a lack of information. Undetected item parameter drift (IPD) presents one of the greatest threats to scale maintenance within an item response theory (IRT) framework. There is also an outstanding question as to the utility of the Rasch model as a viable underlying framework for establishing and maintaining vertical scales, even though the model is currently used for scaling many state assessment systems. Most criticisms of the Rasch model in this context have not involved simulation, and most have not acknowledged conditions in which the model may function sufficiently well to justify its use in vertical scaling. To address these questions, vertical scales were created from real data using the Rasch and 3PL models. Ability estimates were then generated to simulate a second (Time 2) administration. These simulated data were placed onto the base vertical scales using a horizontal vertical scaling approach and a mean-mean transformation. To examine the effects of IPD on vertical scale maintenance, several conditions of IPD were simulated within each set of linking items. To evaluate the viability of the Rasch model in a vertical scaling context, data were generated and calibrated at Time 2 within each model (Rasch and 3PL) as well as across models (Rasch data generation/3PL calibration, and vice versa). Results pertaining to the first question, the effect of IPD on vertical scale maintenance, demonstrate that the effect of IPD is directly related to the percentage of drifting linking items, the magnitude of the drift, and its direction. With respect to the viability of the Rasch model in a vertical scaling context, results suggest that the Rasch model is perfectly viable when the model is appropriate for the data. It is also clearly evident that where data involve varying discrimination and guessing, use of the Rasch model is inappropriate.
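A mean-mean transformation under the Rasch model reduces to an additive shift that equates the mean difficulty of the common (linking) items across calibrations; a drifting linking item therefore biases the shift directly, which is why undetected IPD threatens scale maintenance. Below is a minimal sketch with hypothetical difficulty values.

```python
import numpy as np

# Hypothetical Rasch difficulties for the common (linking) items -- placeholders.
b_common_base = np.array([-1.20, -0.40, 0.10, 0.55, 1.30])    # base-scale calibration
b_common_time2 = np.array([-1.05, -0.20, 0.30, 0.70, 1.45])   # Time 2 calibration

# Mean-mean linking under the Rasch model: an additive shift that equates the
# mean difficulty of the common items across the two calibrations.
shift = b_common_base.mean() - b_common_time2.mean()

# Apply the shift to all Time 2 item difficulty estimates.
b_time2_all = np.array([-1.8, -1.05, -0.20, 0.30, 0.70, 1.45, 2.1])
b_time2_on_base = b_time2_all + shift
print(f"shift = {shift:.3f}")
print(np.round(b_time2_on_base, 3))
```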
7

Using a mixture IRT model to understand English learner performance on large-scale assessments

Shea, Christine A 01 January 2013

The purpose of this study was to determine whether an eighth grade state-level math assessment contained items that function differentially for English Learner (EL) students as compared to English Only (EO) students and, if so, what factors might have caused the differential functioning. Differential item functioning (DIF) analysis was employed, and a mixture item response theory model (MIRTM) was then fit to examine why items function differentially for EL examinees. Several additional methods were used to examine which item-level factors may have caused ELs difficulty. An item review by a linguist was conducted to determine what item characteristics may have caused ELs difficulty; multiple linear regression was performed to test whether the identified characteristics predicted an item's chi-squared value; and distractor analysis was conducted to determine whether certain answer choices were more attractive to ELs. Logistic regression was performed for each item to test whether the student background variables of poverty, first language, or EL status predicted item correctness. The DIF analysis using Lord's chi-squared test identified 4 items as having meaningful DIF (>0.2) under the range-null hypothesis. Of those items, the 2 items favoring the EO population were identified as assessing the Data Analysis, Statistics and Probability strand of the state math standards, while the 2 DIF items favoring the EL population were identified as assessing the Number Sense and Operations strand. Item length, as judged in the item review, supported the identification of several of the DIF items. The mixture IRT model was run under 3 conditions; under all three, the overall latent class groupings did not match the manifest groups of EO and EL. To probe further, the student background variables of poverty, language proficiency status, and first language spoken were compared to the latent class groupings. These comparisons did not provide evidence that the student background variables better explain the latent class groupings.
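The per-item logistic regressions mentioned above are closely related to a common logistic-regression DIF screen that conditions on ability (total score) and tests group and group-by-ability terms. The sketch below is a generic illustration of that screen on simulated data, not the dissertation's model or data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 2000

# Simulated data for one item: total test score, EL indicator, and item correctness.
total = rng.normal(0, 1, n)                       # standardized total score
el = rng.integers(0, 2, n)                        # 1 = English Learner, 0 = English Only
logit = 0.2 + 1.1 * total - 0.4 * el              # uniform DIF built in for illustration
correct = rng.random(n) < 1 / (1 + np.exp(-logit))

df = pd.DataFrame({"correct": correct.astype(int), "total": total, "el": el})

# Uniform DIF: the group effect after conditioning on ability (total score).
# Nonuniform DIF: the total-by-group interaction term.
fit = smf.logit("correct ~ total + el + total:el", data=df).fit(disp=False)
print(fit.params)
print(fit.pvalues)
```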
8

Effect of automatic item generation on ability estimates in a multistage test

Colvin, Kimberly F 01 January 2014

In adaptive testing, including multistage adaptive testing (MST), the psychometric properties of the test items are needed to route examinees through the test. However, if testing programs use items that are automatically generated at the time of administration, there is no opportunity to calibrate the items, and their psychometric properties must therefore be predicted. This simulation study evaluates the accuracy with which examinees' abilities can be estimated when automatically generated items, specifically item clones, are used in MSTs. The behavior of the clones in this study was modeled according to the results of Sinharay and Johnson's (2008) investigation of item clones administered in an experimental section of the Graduate Record Examination (GRE). In the current study, as more clones were incorporated, or when the clones varied greatly from the parent items, examinees' abilities were estimated less accurately. However, there were a number of promising conditions; for example, on a 600-point scale, the absolute bias was less than 10 points for most examinees when all items were simulated to be clones with small variation from their parent items, or when all first-stage items were simulated to have moderate variation from their parents and no items in the second stage were cloned.
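When clones are assumed to inherit their parent items' parameters, examinees can still be scored with those parameters, and any clone-parent discrepancy propagates into the ability estimate. The sketch below scores a response pattern by expected a posteriori (EAP) estimation under a 2PL model, once with the "parent" parameters and once with randomly perturbed parameters as a crude stand-in for clone variation; all values are illustrative, not the study's design.

```python
import numpy as np

rng = np.random.default_rng(3)

def eap_2pl(responses, a, b, grid=np.linspace(-4, 4, 81)):
    """EAP ability estimate under a 2PL model with a standard normal prior."""
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (grid[:, None] - b[None, :])))
    like = np.prod(np.where(responses[None, :] == 1, p, 1 - p), axis=1)
    post = like * np.exp(-0.5 * grid ** 2)
    return np.sum(grid * post) / np.sum(post)

# Parent item parameters (illustrative) and a simulated response pattern.
a_parent = rng.uniform(0.8, 2.0, 30)
b_parent = rng.normal(0.0, 1.0, 30)
theta_true = 0.5
p_true = 1 / (1 + np.exp(-a_parent * (theta_true - b_parent)))
resp = (rng.random(30) < p_true).astype(int)

# Score with the parent parameters, then with perturbed ("clone-like") difficulties.
theta_parent = eap_2pl(resp, a_parent, b_parent)
theta_clone = eap_2pl(resp, a_parent, b_parent + rng.normal(0, 0.3, 30))
print(f"EAP with parent params: {theta_parent:.3f}; with perturbed params: {theta_clone:.3f}")
```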
9

A Validity Study of the American School Counselor Association (ASCA) National Model Readiness Self-Assessment Instrument

McGannon, Wendy 01 January 2007

School counseling has great potential to help students achieve to high standards in the academic, career, and personal/social aspects of their lives (House & Martin, 1998). With the advent of No Child Left Behind (NCLB, 2001), the role of the school counselor is beginning to change. In response to the challenges and pressures to implement standards-based educational programs, the American School Counselor Association released “The ASCA National Model: A Framework for School Counseling Programs” (ASCA, 2003). The ASCA National Model was designed with an increased focus on accountability and on the use of data to make decisions and to increase student achievement. It is intended to ensure that all students are served by the school counseling program by using student data to advocate for equity, to facilitate student improvement, and to provide strategies for closing the achievement gap. The purpose of this study was to investigate the psychometric properties of an instrument designed to assess school districts' readiness to implement the ASCA National Model. Data were gathered from 693 respondents to a web-based version of the ASCA National Model Readiness Self-Assessment Instrument. Confirmatory factor analysis did not support the structure of the 7-factor model. Exploratory factor analysis produced a 3-factor model that was supported by confirmatory factor analyses after variable parcels were created within each of the three factors. Based on the item loadings within each factor, factor one was labeled School Counselor Characteristics, factor two was labeled District Conditions, and factor three was labeled School Counseling Program Supports. Cross-validation of this model with an independent sample of 363 respondents to the ASCA Readiness Instrument provided additional evidence to support the three-factor model. The results of these analyses will be used to give school districts more concise score-report information about the changes needed to support implementation of the ASCA National Model, and they provide evidence to support the interpretation of the scores obtained from the ASCA Readiness Instrument.
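An exploratory three-factor solution of the kind described above can be sketched as follows; the simulated item responses and the use of scikit-learn's FactorAnalysis with varimax rotation are illustrative assumptions, not the dissertation's data or software.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(11)

# Simulated responses: 500 respondents x 12 items driven by 3 latent factors
# (purely illustrative; not the ASCA Readiness Instrument data).
n, n_items, n_factors = 500, 12, 3
loadings = np.zeros((n_items, n_factors))
loadings[np.arange(n_items), np.arange(n_items) // 4] = rng.uniform(0.6, 0.9, n_items)
factors = rng.normal(size=(n, n_factors))
items = factors @ loadings.T + rng.normal(scale=0.5, size=(n, n_items))

# Exploratory factor analysis with a 3-factor solution and varimax rotation.
fa = FactorAnalysis(n_components=n_factors, rotation="varimax")
fa.fit(items)
print(np.round(fa.components_.T, 2))   # estimated item loadings (items x factors)
```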
10

Comparison of kernel equating and item response theory equating methods

Meng, Yu 01 January 2012

The kernel method of test equating is a unified approach to test equating with some advantages over traditional equating methods. It is therefore important to evaluate comprehensively the usefulness and appropriateness of the kernel equating (KE) method, as well as its advantages and disadvantages compared with several popular item response theory (IRT) equating techniques. The purpose of this study was to evaluate the accuracy and stability of KE and IRT true-score equating by manipulating several common factors known to influence equating results. Three equating methods (kernel post-stratification equating, Stocking-Lord, and mean/sigma) were compared against an established equating criterion. A wide variety of conditions were simulated to match realistic situations reflecting differences in sample size, anchor test length, and group ability differences. The systematic and random error of equating were summarized with bias statistics and the standard error of equating (SEE) and compared across methods, and the better-performing methods under specific conditions were identified based on the root mean squared error (RMSE). As expected, equating error generally decreased as the number of anchor items and the sample size increased, across all methods. Aside from method effects, group differences in ability produced the greatest impact on equating error in this study. The accuracy and stability of each equating method depended on the portion of the score-scale range where comparisons were made. Overall, kernel equating was more stable in most situations but not as accurate as IRT equating for the conditions studied. The interactions between pairs of factors investigated in this study appeared to be more influential and beneficial for IRT equating than for KE. Practical recommendations were suggested for future study, for example, using alternative methods of data simulation to remove the advantage of the IRT equating methods.
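The accuracy and stability criteria named above (bias, SEE, RMSE) are computed per score point from replicated equatings against the criterion equating; below is a minimal sketch with made-up replication results.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical criterion equating and replicated equated scores (placeholders):
# rows = replications, columns = raw-score points.
n_reps, n_scores = 100, 41
criterion = np.linspace(0, 40, n_scores) + 0.5            # criterion equivalents
equated = criterion + rng.normal(0.2, 0.6, (n_reps, n_scores))

# Bias (systematic error), SEE (random error), and RMSE at each score point.
bias = equated.mean(axis=0) - criterion
see = equated.std(axis=0, ddof=1)
rmse = np.sqrt(((equated - criterion) ** 2).mean(axis=0))
print(np.round(np.c_[bias, see, rmse][:5], 3))   # first five score points
```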
