1 |
Measuring Multidimensional Science Learning: Item Design, Scoring, and Psychometric Considerations
Castle, Courtney January 2018
Thesis advisor: Henry Braun / The Next Generation Science Standards propose a multidimensional model of science learning, comprising Disciplinary Core Ideas, Science and Engineering Practices, and Crosscutting Concepts (NGSS Lead States, 2013). Accordingly, there is a need for student assessment aligned with the new standards. Creating assessments that validly and reliably measure multidimensional science ability is a challenge for the measurement community (Pellegrino et al., 2014). Multidimensional assessment tasks may need to go beyond typical item designs of standalone multiple-choice and short-answer items. Furthermore, scoring and modeling of student performance should account for the multidimensionality of the construct. This research contributes to knowledge about best practices for multidimensional science assessment by exploring three areas of interest: 1) item design, 2) scoring rubrics, and 3) measurement models. This study investigated multidimensional scaffolding and response format by comparing alternative item designs on an elementary assessment of matter. Item variations had a different number of item prompts and/or response formats. Observations about student cognition and performance were collected during cognitive interviews and a pilot test. Items were scored using a holistic rubric and a multidimensional rubric, and interrater agreement was examined. Assessment data were scaled with multidimensional scores and holistic scores, using unidimensional and multidimensional Rasch models, and model-data fit was compared. Results showed that scaffolding is associated with more thorough responses, especially among low-ability students. Students tended to utilize different cognitive processes to respond to selected-response items and constructed-response items, and were more likely to respond to selected-response arguments. Interrater agreement was highest when the structure of the item aligned with the structure of the scoring rubric.
Holistic scores provided reliability and precision similar to those of multidimensional scores, but item and person fit were poorer. Multidimensional subscales had lower reliability and less precise student estimates than the unidimensional model, and interdimensional correlations were high. However, the multidimensional rubric and model provide nuanced information about student performance and better fit to the response data. Recommendations about optimal combinations of scaffolding, rubric, and measurement models are made for teachers, policymakers, and researchers. / Thesis (PhD), Boston College, 2018. / Submitted to: Boston College. Lynch School of Education. / Discipline: Educational Research, Measurement and Evaluation.
|
2 |
Comparison of Energy Usage and Response Time for Web Frameworks
de Mander, Felicia; Gren, Wilhelm January 2023
Background. Environmental sustainability and reducing energy consumption are important and relevant topics today. Energy consumption by data centres is constantly increasing. One factor that could affect this is the choice of web framework. Objectives. We wanted to investigate whether there is a difference in energy consumption depending on the selected web framework for an API web server. An improvement in energy usage should not come with negative side effects that outweigh it. Therefore, energy usage was to be contrasted with response time. In addition, we wanted to see how the choice of response format affects these metrics. If a considerable impact on energy usage was found that did not compromise response time, the goal was to communicate this in order to increase awareness among software developers. Methods. A literature review was done in order to gather existing information on how to conduct an experiment measuring software energy consumption. We evaluated available tools for measuring the energy consumed by an application. An experiment then compared four popular web frameworks with regard to both energy usage and response time. Django, Express, Laravel, and Spring Boot were selected for the experiment. Metrics measured were energy usage and response time. The experiment was executed with three different numbers of concurrent requests (vusers = {10, 100, 250}). Results. The literature review yielded a selection of candidate software tools for measuring software energy consumption. The tool perf was chosen for the experiment. In the experiment, the response format was shown to affect the response time, but not the energy consumption. Increasing the number of concurrent users made for larger differences between frameworks, both regarding energy usage and response time. Express and Spring Boot showed the best performance in both regards at all levels of concurrent requests. Conclusions.
Express and Spring Boot are the clear winners out of the four compared frameworks. They had the best results in terms of both energy usage and response time. Django is not a web framework to recommend if response time is important.
|
3 |
Effects of retrieval and articulation on memory
Larsson Sundqvist, Max January 2017
Many would agree that learning occurs when new information is stored in memory. Therefore, most learning efforts typically focus on encoding processes, such as additional study or other forms of repetition. However, as I will outline in this thesis, there are other means by which to improve memory, such as retrieval practice in the form of tests. Testing memory has a reinforcing effect: it improves retention more than an equal amount of repeated study, a phenomenon referred to as the testing effect, and it has been assumed that retrieval processes drive this effect. Recently, however, this assumption has been called into question because of findings that suggest that articulation, that is, the act of providing an explicit response on a memory test, may play a role in determining the magnitude of the testing effect. Therefore, in three studies, I have examined the effects of retrieval and articulation on later retention, in an attempt to ascertain whether the testing effect is entirely driven by retrieval, or if there are additive effects of articulation. I have also explored possible boundary conditions that may determine when, and if, the effects of retrieval and articulation become selective with respect to memory performance. In all three studies, participants studied paired associates and were tested in a cued recall paradigm after a short (~5 min) and a long (1 week) retention interval, and retrieval was either covert (i.e., responses were retrieved but not articulated) or overt (i.e., responses were retrieved and articulated). In Study I, I demonstrated that uninstructed covert retrieval practice (by means of delayed judgments of learning) produced a testing effect (i.e., improved memory relative to a study-only condition) similar to that of explicit testing, which supports the idea that the testing effect is mainly the result of retrieval processes.
In Study II, I compared memory performance for covert and overt testing, and found partial support for a relative efficacy in favor of overt over covert retrieval, although the effect size was small. In Study III, I further explored the distinction between different response formats (i.e., covert retrieval vs. various forms of overt testing), specifically handwriting and keyboard typing. I also examined the relative efficacy of covert versus overt retrieval as a function of list order (i.e., whether covert and overt retrieval are practiced in blocks or random order) and its manipulation within or between subjects. The results of Study III were inconclusive insofar as a relative efficacy of covert versus overt retrieval, with respect to later retention, could not be demonstrated reliably. The list order manipulations did not appear to affect covert and overt retrieval selectively. More importantly, in cases where a relative efficacy was found, the effect size was again small. Taken together, the three studies of this thesis indicate that the benefit of testing memory appears to be almost entirely the result of retrieval processes, and that articulation alone adds very little – if anything – to the magnitude of the testing effect, at least in cued-recall paradigms. These findings are discussed in terms of their theoretical implications, as well as their importance for the development of optimal teaching and learning practices in educational settings. / At the time of the doctoral defense, the following paper was unpublished and had the following status: Paper 3: Accepted.
|