131.
NCLEX Success First Attempt: An Exploratory Study of PassPoint and Comparative Analysis of Traditional Testing Versus Computerized Adaptive Testing. Singh, Onkar, 06 December 2017.
Schools of nursing around the United States take multiple measures to prepare nursing students for safe practice in today's complex healthcare system. One area in which schools of nursing continue to struggle is the first-attempt pass rate on the NCLEX-RN. Despite various ways of preparing nursing graduates, NCLEX-RN first-attempt pass rates at United States nursing schools remain suboptimal. Because many of the efforts to increase first-attempt NCLEX-RN pass rates have been inadequate and new ways of preparing nursing students remain underexplored, the purpose of this study was to explore a computerized adaptive testing program, PassPoint, and identify any predictors of first-attempt NCLEX-RN success. A further purpose was to compare the computerized adaptive testing program, PassPoint, with a traditional preparatory testing method, Kaplan, in relation to first-attempt NCLEX-RN success in an associate degree nursing program in the midwestern United States. Using a retrospective correlational design, the study identified a number of statistically significant relationships.
132.
Mathematics Formative Assessment System: Testing the theory of action based on the results of a randomized field trial. LaVenia, Mark, 13 October 2016.
The purpose of the current study was to test the theory of action hypothesized for the Mathematics Formative Assessment System (MFAS) based on results from a large-scale randomized field trial. Using a multilevel structural equation modeling approach with multiple latent response variables decomposed across student, teacher, and school levels of clustering, the study found evidence of effects of MFAS consistent with the MFAS theory of action. First, assignment to the treatment condition was associated with higher mean student mathematics performance and a higher prevalence of small-group instruction compared to schools assigned to the control condition, both of which are outcomes hypothesized to result from MFAS use. A positive association between teacher-level mathematics knowledge for teaching and student mathematics performance was also found, consistent with the interrelation of constructs specified in the MFAS theory of action. However, evidence of the particular linkage of MFAS use → teacher knowledge → classroom practice → student mathematics performance, the putative cascade of effects that would substantiate the mechanisms of change posited in the theory of action, was not detected. Thus, positive effects of MFAS on teacher and student outcomes were substantiated; how the effects of MFAS on teachers transfer to improved outcomes for students remains to be empirically demonstrated. Based on my review of these results and the literature on formative assessment as it relates to the design of MFAS tasks and rubrics, I propose a modification to the theory of action that adds a direct path from MFAS use to student mathematics performance alongside the indirect path currently specified.
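To make the untested indirect linkage concrete, here is a minimal single-level sketch of the product-of-coefficients logic for testing an indirect versus a direct path. It is not the study's multilevel SEM; the variable names and simulated data are hypothetical stand-ins.

```python
# Single-level product-of-coefficients sketch of an indirect-effect test;
# the dissertation used a multilevel SEM, which this does not reproduce.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
mfas_use = rng.integers(0, 2, n)                     # treatment indicator (hypothetical)
teacher_knowledge = 0.3 * mfas_use + rng.normal(size=n)
student_score = 0.4 * teacher_knowledge + 0.2 * mfas_use + rng.normal(size=n)

# Path a: MFAS use -> teacher knowledge
a = sm.OLS(teacher_knowledge, sm.add_constant(mfas_use)).fit()
# Path b: teacher knowledge -> student score, controlling for MFAS use
X = sm.add_constant(np.column_stack([teacher_knowledge, mfas_use]))
b = sm.OLS(student_score, X).fit()

indirect = a.params[1] * b.params[1]   # a*b estimate of the mediated path
direct = b.params[2]                   # direct MFAS -> score path
print(f"indirect effect = {indirect:.3f}, direct effect = {direct:.3f}")
```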
133.
Development of a brief rating scale for the formative assessment of positive behaviors. Cressey, James M., 01 January 2010.
In order to provide effective social, emotional, and behavioral supports to all students, there is a need for formative assessment tools that can help determine students' responsiveness to intervention. Schoolwide positive behavior support (SWPBS) is one framework that can provide evidence-based intervention within a three-tiered model to reach students at all levels of risk. This dissertation begins the process of developing a brief, teacher-completed rating scale intended for use with students in grades K-8 for the formative assessment of positive classroom behavior. An item pool of 93 positively worded rating scale items was drawn or adapted from existing rating scales. Teachers (n = 142) rated the importance of each item to their concept of "positive classroom behavior." This survey yielded 30 positively worded items for inclusion on the pilot rating scale. The pilot scale was used by teachers to rate students in two samples drawn from general education K-8 classrooms: a universal-tier group of randomly selected students (n = 80) and a targeted-tier group of students with mild to moderate behavior problems (n = 82). Pilot scale ratings were significantly higher in the universal group than the targeted group by about one standard deviation, with no significant group-by-gender interaction. Strong results were found for the split-half reliability (.94) and the internal consistency (.98) of the pilot scale. All but two items showed medium to large item-total correlations (> .5). Factor analysis indicated a unidimensional factor structure, with 59.87% of the variance accounted for by a single factor and high loadings (> .4) from 26 of the 30 items. The unidimensional factor structure of the rating scale indicates its promise for potential use as a general outcome measure (GOM), with items reflecting a range of social, emotional, and behavioral competencies. Future research is suggested to continue development and revision of the rating scale with a larger, more diverse sample and to begin exploring its suitability for screening and formative assessment purposes.
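For readers unfamiliar with the reliability statistics reported here, the sketch below shows how Cronbach's alpha, odd-even split-half reliability (with Spearman-Brown correction), and corrected item-total correlations are typically computed. The `ratings` matrix is simulated, not the dissertation's data.

```python
# Standard internal-consistency statistics for a respondents x items matrix.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def split_half(items: np.ndarray) -> float:
    """Odd-even split-half correlation with Spearman-Brown correction."""
    odd = items[:, 0::2].sum(axis=1)
    even = items[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)

def item_total(items: np.ndarray) -> np.ndarray:
    """Corrected item-total correlation: each item vs. total minus that item."""
    total = items.sum(axis=1)
    return np.array([np.corrcoef(items[:, j], total - items[:, j])[0, 1]
                     for j in range(items.shape[1])])

rng = np.random.default_rng(1)
true_score = rng.normal(size=(200, 1))
ratings = true_score + rng.normal(scale=0.7, size=(200, 30))  # 30-item pilot scale
print(cronbach_alpha(ratings), split_half(ratings), item_total(ratings).min())
```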
134.
Evaluating the validity of MCAS scores as an indicator of teacher effectiveness. Copella, Jenna M., 01 January 2013.
The Massachusetts Department of Elementary and Secondary Education (DESE) has implemented an Educator Evaluation Framework that requires MCAS scores to be used as a significant indicator of teacher effectiveness when available. This decision has implications for thousands of Massachusetts public school teachers. To date, DESE has not provided evidence to support the validity of using MCAS scores to make interpretations about teacher effectiveness. A review of the literature reveals much variation in the degree to which teachers use state-adopted content standards to plan instruction, warranting investigation into teacher practice among Massachusetts public school teachers. The research questions for this study were: (1) Are there variations in the degree to which Massachusetts public school teachers use the Curriculum Frameworks to plan math instruction? and (2) Is the MCAS as an instrument sensitive enough to reflect variations in teacher practice in students' scores? A survey of Massachusetts public school principals and math teachers, grades three through eight, investigated these questions. Survey results revealed that Massachusetts teachers use the Curriculum Frameworks to plan instruction to varying degrees. Results also suggest a lack of relationship between teacher practice related to use of the Curriculum Frameworks and student MCAS scores. These findings suggest MCAS scores may not be an appropriate indicator of teacher effectiveness; however, limitations of the study require further investigation into these questions.
135.
A procedure for developing a common metric in item response theory when parameter posterior distributions are known. Baldwin, Peter, 01 January 2008.
Because item response theory (IRT) models are arbitrarily identified, independently estimated parameters must be transformed to a common metric before they can be compared. To accomplish this, the transformation constants must be estimated, and because these estimates are imperfect, there is a propagation-of-error effect when transforming parameter estimates. This error propagation is typically ignored, however, and estimates of the transformation constants are treated as true when transforming parameter estimates to a common metric. To address this shortcoming, a procedure is proposed and evaluated that accounts for the uncertainty in the transformation constants when adjusting for differences in metric. This procedure utilizes random draws from model parameter posterior distributions, which are available when IRT models are estimated using Markov chain Monte Carlo (MCMC) methods. Given two test forms with model parameter vectors Λ_Y and Λ_X, the proposed procedure works by sampling the posteriors of Λ_Y and Λ_X, estimating the transformation constants using these two samples, and transforming sample X to the scale of sample Y. This process is repeated N times, where N is the desired number of transformed posterior draws. A simulation study was conducted to evaluate the feasibility and success of the proposed strategy compared to the traditional strategy of treating scaling constant estimates as error-free. Results were evaluated by comparing the observed coverage probabilities of the transformed posteriors to their expectation. The proposed strategy yielded equal or superior coverage probabilities compared to the traditional strategy in 140 of the 144 comparisons made in this study (97%). Conditions included four methods of estimating the scaling constants and three anchor lengths.
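A rough illustration of the proposed draw-by-draw procedure, assuming mean/sigma linking on item difficulties and simulated posterior draws (the study itself compared four methods of estimating the constants):

```python
# Per-draw linking so that uncertainty in the transformation constants
# propagates into the transformed posteriors; the posterior arrays below
# are simulated placeholders (n_draws x n_anchor_items).
import numpy as np

def mean_sigma_constants(bY, bX):
    """X -> Y transformation: A = sd(bY)/sd(bX), B = mean(bY) - A*mean(bX)."""
    A = bY.std(ddof=1) / bX.std(ddof=1)
    B = bY.mean() - A * bX.mean()
    return A, B

rng = np.random.default_rng(2)
n_draws, n_items = 1000, 20
post_bY = rng.normal(0.0, 1.0, (n_draws, n_items))                # scale-Y difficulty draws
post_bX = 0.8 * post_bY + 0.5 + rng.normal(0, 0.05, (n_draws, n_items))
post_aX = rng.lognormal(0.0, 0.3, (n_draws, n_items))             # scale-X discriminations

transformed_b, transformed_a = [], []
for d in range(n_draws):
    # One set of constants per posterior draw, not a single point estimate
    A, B = mean_sigma_constants(post_bY[d], post_bX[d])
    transformed_b.append(A * post_bX[d] + B)   # b* = A*b + B
    transformed_a.append(post_aX[d] / A)       # a* = a / A
print(np.mean(transformed_b), np.mean(transformed_a))
```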
136.
A Bayesian testlet response model with covariates: A simulation study and two applications. Baldwin, Su G., 01 January 2008.
Understanding the relationship between person, item, and testlet covariates and person, item, and testlet parameters may offer considerable benefits to both test development and test validation efforts. The Bayesian testlet response theory (TRT) models proposed by Wainer, Bradlow, and Wang (2007) offer a unified structure within which model parameters may be estimated simultaneously with model parameter covariates. This unified approach represents an important advantage of these models: theoretically correct modeling of the relationship between covariates and their respective model parameters. Analogous analyses can be performed via conventional post-hoc regression methods; however, the fully Bayesian framework offers an important advantage over post-hoc methods by reflecting the uncertainty of the model parameters when estimating their relationship to covariates. The purpose of this study was twofold. The first was to conduct a basic simulation study investigating the accuracy and effectiveness of the Bayesian TRT approach in estimating the relationship of covariates to their respective model parameters; the Bayesian TRT results were also compared to post-hoc regression results, where the dependent variable was the point estimate of the model parameter of interest. The second was an empirical study applying the Bayesian TRT model to two real data sets: the Step 3 component of the United States Medical Licensing Examination (USMLE) and the Posttraumatic Growth Inventory (PTGI) by Tedeschi and Calhoun (1996). The findings of both studies suggest that the Bayesian TRT performs very similarly to the post-hoc approach. Detailed discussion is provided, and potential future studies are suggested in Chapter 5.
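The following is a minimal sketch of a 2PL testlet response model with a covariate on item difficulty, in the spirit of the Wainer, Bradlow, and Wang (2007) models. It is written in PyMC with simulated placeholder data and is not the author's implementation.

```python
# 2PL testlet model: logit P(y=1) = a_j * (theta_i - b_j - gamma_{i,d(j)}),
# with item difficulty regressed on an item covariate inside the model.
import numpy as np
import pymc as pm

rng = np.random.default_rng(3)
n_person, n_item, n_testlet = 200, 30, 6
person = np.repeat(np.arange(n_person), n_item)
item = np.tile(np.arange(n_item), n_person)
testlet = item % n_testlet                       # items nested in testlets
item_cov = rng.normal(size=n_item)               # hypothetical item covariate
resp = rng.integers(0, 2, person.size)           # placeholder 0/1 responses

with pm.Model():
    theta = pm.Normal("theta", 0, 1, shape=n_person)       # person ability
    a = pm.LogNormal("a", 0, 0.5, shape=n_item)            # discrimination
    beta = pm.Normal("beta", 0, 1)                         # covariate slope on b
    b_resid = pm.Normal("b_resid", 0, 1, shape=n_item)
    b = pm.Deterministic("b", beta * item_cov + b_resid)   # difficulty w/ covariate
    sd_g = pm.HalfNormal("sd_g", 1)
    gamma = pm.Normal("gamma", 0, sd_g,                    # person-by-testlet effect
                      shape=(n_person, n_testlet))
    eta = a[item] * (theta[person] - b[item] - gamma[person, testlet[item]])
    pm.Bernoulli("y", logit_p=eta, observed=resp)
    idata = pm.sample(500, tune=500, chains=2)
```

Because `beta` is estimated jointly with the item parameters, its posterior reflects the uncertainty in `b`, which is exactly the advantage over regressing point estimates of `b` on the covariate after the fact.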
137.
The utility of the Individual Reading Evaluation and Diagnostic (iREAD) Inventory, a specific reading skills assessment, for treatment design and implementation. Koerner, Andrew J., 01 January 2008.
This study was conducted to assess the effectiveness of the Individualized Reading Evaluation and Diagnosis (iRead) Inventory for accurately assessing specific decoding sub-skill weaknesses and for informing the development of targeted interventions to improve students' reading abilities. The iRead Inventory is a curriculum-based, specific skills mastery measurement tool for assessing specific decoding weaknesses: students read word lists targeted to specific vowel combinations to determine weaknesses with particular combinations. The study assessed whether the iRead Inventory could distinguish specific decoding sub-skill weaknesses for students and whether it was effective in supporting the development of interventions to improve those decoding weaknesses. Students were screened for dysfluency, and three students identified as having primarily decoding issues were selected for the intervention phase of the study. The intervention phase used a multiple-baseline randomization design, with the three participants receiving interventions beginning at randomly selected times. The iRead Inventory was utilized to identify specific vowel combination difficulties for intervention, and the participants were provided direct, sequential instruction targeted to the identified decoding weaknesses. The participants' reading progress was monitored using Reading-CBM (R-CBM) and Nonsense Word Fluency (NWF) measures; their progress in learning the specific sub-skills was monitored using the iRead Inventory. The iRead Inventory was found to reliably assess specific decoding deficits, and interventions developed using it were shown to improve the decoding abilities of all participants. The two participants who received interventions earlier showed gains in oral reading skills and mastered a number of specific vowel combination decoding skills; the participant who began interventions last showed less gain in both areas. In addition, there appeared to be a learning-curve phenomenon whereby participants did not exhibit gains associated with the interventions until approximately two and one-half weeks after interventions were initiated. Further research could include assessing the reliability of the iRead Inventory, researching its utility for designing interventions for a broader population, and assessing the implications of a potential learning-curve phenomenon for making educational decisions.
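As an illustration of the logic behind a multiple-baseline design with randomly selected start points, the sketch below permutes hypothetical intervention start weeks across participants to form a reference distribution for a mean-shift statistic. This is one form of randomization test used with such designs; all series and start points are simulated, not the study's data.

```python
# Randomization test for a three-participant multiple-baseline design.
import itertools
import numpy as np

def effect(series, start):
    """Mean shift from baseline to intervention phase for one participant."""
    return series[start:].mean() - series[:start].mean()

rng = np.random.default_rng(4)
weeks = 20
possible_starts = [5, 8, 11]                 # staggered candidate start weeks
data = [rng.normal(10, 2, weeks) for _ in range(3)]
for s, series in zip(possible_starts, data):
    series[s:] += 3.0                        # simulated intervention gain

actual = sum(effect(d, s) for d, s in zip(data, possible_starts))

# Reference distribution: every assignment of start weeks to participants
null = [sum(effect(d, s) for d, s in zip(data, perm))
        for perm in itertools.permutations(possible_starts)]
p = np.mean([x >= actual for x in null])
print(f"randomization p-value = {p:.3f}")
```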
138.
Relationship of self-efficacy beliefs of urban public school students to performance on a high-stakes mathematics test. Afolabi, Kolajo A., 01 January 2010.
The purpose of this study was to examine the relationship of self-efficacy for Enlisting Social Resources, Self-Regulatory Efficacy, self-efficacy for Self-Regulated Learning, and self-efficacy for Academic Achievement (Bandura's Children's Self-Efficacy Scale, 2006) among urban public school students to performance on the high-stakes Massachusetts Comprehensive Assessment System (MCAS) math test. A survey questionnaire was administered to eighty-three participants, and the data, analyzed using linear regression, conformed to the assumptions of independence, linearity, normality, and homoscedasticity. Self-Regulatory Efficacy, self-efficacy for Academic Achievement, and socioeconomic status were statistically significant bivariate predictors of performance on the MCAS math test. Self-Regulatory Efficacy was the only consistent statistically significant predictor of MCAS math performance. The interaction of gender with Self-Regulatory Efficacy was statistically significant in isolation but not when other variables were accounted for.
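A brief sketch of this regression-with-diagnostics workflow, using statsmodels and simulated stand-ins for the survey measures and MCAS scores:

```python
# OLS regression with the assumption checks named above: independence
# (Durbin-Watson), normality (Jarque-Bera), homoscedasticity (Breusch-Pagan).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson, jarque_bera

rng = np.random.default_rng(5)
n = 83
self_reg_eff = rng.normal(size=n)          # Self-Regulatory Efficacy (simulated)
acad_eff = rng.normal(size=n)              # self-efficacy for Academic Achievement
ses = rng.normal(size=n)                   # socioeconomic status
mcas = 5 * self_reg_eff + 2 * ses + rng.normal(scale=4, size=n)

X = sm.add_constant(np.column_stack([self_reg_eff, acad_eff, ses]))
fit = sm.OLS(mcas, X).fit()

print(fit.summary())
print("Durbin-Watson (independence):", durbin_watson(fit.resid))
print("Jarque-Bera (normality):", jarque_bera(fit.resid)[:2])
print("Breusch-Pagan (homoscedasticity):", het_breuschpagan(fit.resid, X)[:2])
```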
139.
Measuring teacher effectiveness using student test scores. Soto, Amanda Corby, 01 January 2013.
Comparisons within states of school performance or student growth, as well as teacher effectiveness, have become commonplace. Since the advent of the Growth Model Pilot Program in 2005, many states have adopted growth models for both evaluative purposes (to measure teacher performance or for accountability) and formative purposes (to guide instructional practice and curricular or programmatic choices). Growth model data, as applied to school accountability and teacher evaluation, are generally used to determine whether teachers and schools are functioning to move students toward curricular proficiency and mastery. Teacher evaluation based on growth data is an increasingly popular practice in the states, and the introduction of cross-state assessment consortia in 2014 will produce data that could support this approach to teacher evaluation on a larger scale. For the first time, students in consortium member states will take shared assessments and be held accountable for shared curricular standards, setting the stage to quantify and compare teacher effectiveness based on student test scores across states. States' voluntary adoption of the Common Core State Standards (CCSS) and participation in assessment consortia speak to a new level of support for collaboration in the interest of improved student achievement. The possibility of using these data to build effectiveness and growth models that cross state lines is appealing, as states and schools might be interested in demonstrating their progress toward full student proficiency based on the CCSS. By utilizing consortium assessment data in place of within-state assessment data for teacher evaluation, it would be possible to describe the performance of one state's teachers in reference to the performance of their own students, teachers in other states, and the consortium as a whole. To examine what might happen if states adopt a cross-state evaluation model, the consistency of teacher effectiveness rankings based on the Student Growth Percentile (SGP) model and a value-added model is compared for teachers in two states, Massachusetts and Washington, D.C., both members of the Partnership for Assessment of Readiness for College and Careers (PARCC) assessment consortium. Teachers are first evaluated based on their students within their state, and again when that state is situated within a sample representing students in the other member states. The purpose of the current study is to explore the reliability of teacher effectiveness classifications, as well as the validity of inferences made from student test scores to guide teacher evaluation. The results indicate that two of the models currently in use, SGPs and a covariate-adjusted value-added model, do not provide particularly reliable results in estimating teacher effectiveness, with more than half of the teachers being inconsistently classified in the consortium setting. The validity of the model inferences is also called into question, as neither model demonstrates a strong correlation with student test score change as estimated by a value table. The results are outlined and discussed in relation to each model's reliability and validity, along with implications for the use of these models in making high-stakes decisions about teacher performance.
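To fix ideas, the sketch below contrasts the two kinds of metric on simulated data: a covariate-adjusted value-added estimate (teacher mean residual after conditioning on prior scores) and a coarse, binned approximation to median student growth percentiles (operational SGPs use quantile regression), then checks how often the two agree on quintile classifications. None of this reproduces the study's actual models or data.

```python
# Toy comparison of covariate-adjusted value-added vs. approximate median SGPs.
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
n_teachers, n_per = 100, 25
df = pd.DataFrame({
    "teacher": np.repeat(np.arange(n_teachers), n_per),
    "prior": rng.normal(size=n_teachers * n_per),
})
teach_eff = rng.normal(scale=0.3, size=n_teachers)
df["score"] = (0.7 * df["prior"] + teach_eff[df["teacher"]]
               + rng.normal(scale=0.6, size=len(df)))

# Covariate-adjusted value-added: residualize on prior score, average by teacher
slope, intercept = np.polyfit(df["prior"], df["score"], 1)
df["resid"] = df["score"] - (intercept + slope * df["prior"])
vam = df.groupby("teacher")["resid"].mean()

# Rough SGP: percentile rank of current score within deciles of prior achievement
df["prior_bin"] = pd.qcut(df["prior"], 10, labels=False)
df["sgp"] = df.groupby("prior_bin")["score"].rank(pct=True) * 100
msgp = df.groupby("teacher")["sgp"].median()

# Consistency of quintile classifications across the two metrics
agree = (pd.qcut(vam, 5, labels=False) == pd.qcut(msgp, 5, labels=False)).mean()
print(f"teachers given the same quintile by both metrics: {agree:.0%}")
```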
140.
Evaluating the validity of accommodations for English learners through evidence based on response processes. Crotts, Katrina M., 01 January 2013.
English learners (ELs) represent one of the fastest growing student populations in the United States. Given that language can serve as a barrier to EL performance, test accommodations are provided to help level the playing field and allow ELs to better demonstrate their true performance level. Test accommodations on the computer offer the ability to collect new types of data that are difficult to obtain via paper-and-pencil tests; these data can serve as additional sources of validity evidence when examining test accommodations. To date, limited research has examined computer-based accommodations, thus limiting these additional sources of validity evidence. The purpose of this study was to evaluate the validity of computer-based test accommodations on high school History and Math assessments using evidence based on response processes, specifically accommodation use and response time. The study investigated two direct linguistic accommodations, a non-EL group and two EL groups, and five research questions. Accommodation use results indicated significant differences in use across the three student groups, with ELs using accommodations more frequently than non-ELs; however, high percentages of all three groups did not access any accommodations on individual items. Accommodation use was more common on History than on Math and decreased as the assessment progressed. These results suggest that future research on the effectiveness of accommodations should focus on students who actually use them. Response time results showed ELs taking longer to process test items than non-ELs, regardless of whether they received accommodations. Receiving accommodations significantly impacted processing time for some History items but not Math items; similarly, History showed a relationship between the number of accommodations on test items and response time, but Math did not. These results suggest that Math content knowledge may have played a larger role in response time than the accommodations. Positive relationships between test performance and response time were found in both subject areas. The most common predictors of both accommodation use and response time across both subject areas were sex, Hispanic status, and socioeconomic status. Implications of the results and suggestions for future research are discussed.
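As an illustration of the predictor analysis mentioned in the final results, a logistic regression of accommodation use on the named demographic variables might look like the following sketch; all data are simulated placeholders, not the study's.

```python
# Logistic regression of (simulated) accommodation use on sex, Hispanic
# status, and SES, the most common predictors reported in the abstract.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 1000
sex = rng.integers(0, 2, n)
hispanic = rng.integers(0, 2, n)
ses = rng.normal(size=n)
logit = -0.5 + 0.4 * hispanic - 0.3 * ses + 0.1 * sex   # assumed true model
used_accommodation = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([sex, hispanic, ses]))
fit = sm.Logit(used_accommodation, X).fit()
print(fit.summary())
```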