Global ETD Search

31	序貫方法於電腦化效標參照測驗之應用 / Sequential Methods in Computerized Criterion-referenced Test 李佳紋, Lee, Chia-Wen Unknown Date (has links) 在一場競爭性的考試中，我們如何決定要錄取或是淘汰這個考生？傳統的紙筆測驗方式固定題目總數，考生回答相同的題目，60分以上為及格。隨著電腦科技的快速發展，測驗型式也由紙筆轉換成電腦操作，也就是電腦化測驗。所謂電腦化效標參照測驗（computerized criterion-referenced test）即是把考生能力分成兩個以上的程度區間，藉由考生的答題狀況來判斷考生應歸屬於哪個區間。這種測驗方式與傳統測驗不同的是：電腦化測驗是依據考生的答題表現來給題，考生能力越偏離分段點（thresholds），需要的題數就越少；越接近分段點，需要的題數就越多。在這篇論文中，我們運用兩個參數的羅吉斯模型（two-parameter logistic model）來估計考生之於試題的答對機率。藉由電腦模擬來探討結合貝它保護（beta-protection）方法和適性測驗對平均測驗題數及誤判率（亦即考生真正的能力與電腦判斷的區間不同）的影響。在模擬過程中，我們也介紹了試題參數的選擇情形，估計考生能力的方法以及在貝它保護下，停止選題的規則。根據這些原則，電腦模擬結果證明使用適性測驗加上貝它保護方法能夠有效地控制誤判率在規定的範圍內，程度不同的考生也能控制有不同的測驗題數。 / In a traditional paper-and-pencil (p-and-p) test, all examinees answer the same items and the number of items is fixed. Whether an examinee passes or fails depends on whether his/her test score exceeds a predetermined score, say, 60 out of 100. However, with the rapid advancement of modern computer technology, the test form has shifted from p-and-p to the computer terminal. A computerized criterion-referenced test classifies examinees into two or more categories according to their answers to the items. It differs from the conventional standardized test in that the selection of test items is tailored to each examinee’s ability level. Typically, examinees with high or low ability will have a shorter average test length (ATL) than examinees whose ability is close to the thresholds. In this thesis, we assume that the probability of a correct response to an item follows a two-parameter logistic (2-PL) model. Our goal is to study the ATL and misclassification rate (MR) under the beta-protection method combined with adaptive sequential item selection. For the simulation procedures, we also introduce the selection rule for item parameters, the methods used to estimate an examinee’s ability, and the stopping rule with beta-protection. 
Simulation results show that adaptive testing combined with the beta-protection method can keep the MR within the specified level, and that the number of test items required depends on the examinee’s ability. 電腦化效標參照測驗 試題反應理論 貝它保護 Computerized Criterion-referenced Test Item Response Theory beta-protection
32	Influence of Item Response Theory and Type of Judge on a Standard Set Using the Iterative Angoff Standard Setting Method Hamberlin, Melanie Kidd 08 1900 (has links) The purpose of this investigation was to determine the influence of item response theory and different types of judges on a standard. The iterative Angoff standard setting method was employed by all judges to determine a cut-off score for a public school district-wide criterion-referenced test. The analysis of variance of the effect of judge type and standard setting method on the central tendency of the standard revealed the existence of an ordinal interaction between judge type and method. Without any knowledge of p-values, one judge group set an unrealistic standard. A significant disordinal interaction was found concerning the effect of judge type and standard setting method on the variance of the standard. A positive covariance was detected between judges' minimum pass level estimates and empirical item information. With both p-values and b-values, judge groups had mean minimum pass levels that were positively correlated (ranging from .77 to .86), regardless of the type of information given to the judges. No differences in correlations were detected between different judge types or different methods. The generalizability coefficients and phi indices for 12 judges included in any method or judge type were acceptable (ranging from .77 to .99). The generalizability coefficient and phi index for all 24 judges were quite high (.99 and .96, respectively). Angoff standard setting method item response theory educational tests and measurements Criterion-referenced tests -- Standards. Item response theory.
33	Predicting Sixth Grade Performance on Criterion-Referenced Reading Tests with Third Grade Test Scores Gallacher, Michael Sean 11 July 2008 (has links) (PDF) This study analyzed the correlation between students' third grade reading ability and sixth grade reading ability. The data were collected from an urban school district, and the participants were students whose records contained information from their third grade school year and their sixth grade school year. The Utah English Language Arts Criterion-Referenced Tests (ELA-CRT) administered in third and sixth grade were used to determine reading ability. Additional demographic data, including race, gender, special education identification, free/reduced lunch, and English Language Learner (ELL) status, were assessed and controlled for in the data analysis and provided important information concerning the overall findings. Analysis revealed that third grade reading scores had a strong predictive value for sixth grade reading scores. Certain demographic variables carried statistically significant correlations with sixth grade reading performance, including race, special education identification, free/reduced lunch, and ELL identification. However, when the variables were analyzed together, with each weighed against the others, only third grade reading performance, free/reduced lunch, and ELL identification held significant correlations. reading test scores Utah criterion-referenced tests No Child Left Behind Act third grade reading performance sixth grade reading performance Counseling Psychology Special Education and Teaching
34	Assessing EFL student writing in a Swedish context / Likvärdig bedömning : Bedömning av skrivförmågan hos elever med engelska som främmande språk ur ett svenskt perspektiv Mattsson, Fredrik January 2023 (has links) The purpose of this study is to examine the validity and reliability of summative assessment of EFL student writing in a Swedish context. Three teachers have assessed the same four student essays from the English 6 course in Swedish upper secondary school. In addition to grading each essay, the teachers have indicated the extent of conformity to the grading criteria in terms of flow, structure, cohesion, adaptation to purpose, clarity, and variation. The analyzed data show a variation in the interpretation of assessment criteria, which affects assessment validity and reliability and calls into question the assessment equivalence of the Swedish criterion-referenced grading system. / Syftet med denna studie är att undersöka validiteten och reliabiliteten hos summativa bedömningar av studentuppsatser ur ett svenskt perspektiv. Tre lärare har bedömt samma fyra studentuppsatser från engelska 6. Förutom att betygsätta varje uppsats har lärarna angett graden av överensstämmelse med betygskriterierna: flöde, struktur, sammanhållning, anpassning till syfte, tydlighet och variation. De analyserade data visar en variation i tolkning av betygskriterier, vilket påverkar bedömningens validitet och reliabilitet och ifrågasätter bedömningslikvärdigheten i det svenska mål-relaterade betygssystemet. Assessment summative assessment validity reliability criterion-referenced grading system equivalence assessment criteria Bedömning summativ bedömning validitet reliabilitet mål-relaterat betygsystem likvärdighet bedömningskriterier Educational Sciences Utbildningsvetenskap
35	Systematic criterion-referenced test development in an English-language program Kumazawa, Takaaki January 2011 (has links) Although classroom assessment is one of the most frequent practices carried out by teachers in all educational programs, limited research has been conducted to investigate the dependability and validity of criterion-referenced tests (CRTs). The main purpose of this study is to develop a criterion-referenced test for first-year Japanese university students in a general English program. To this end, four research questions are formulated: (a) To what extent do the criterion-referenced items function effectively?; (b) To what extent do the facets of persons, items, sections, classes, and subtests contribute to the total score variation in two CRT forms?; (c) To what extent are two CRT forms dependable when administered as pretests and posttests?; and (d) To what extent are two CRT forms valid when administered as pretests and posttests? Two CRT forms made up of vocabulary (k = 25), listening (k = 20), and reading (k = 25) subtests were administered to 249 students using a counterbalanced design. Criterion-referenced item analyses showed that most items were working well for criterion-referenced purposes. Both univariate and multivariate generalizability studies indicated that most of the variance was accounted for by the interaction effect, followed by the items effect, and then by the persons effect. FACETS analyses showed the separation for all the facets accounted for in the analyses and showed that item separation was greater than person separation. This indicated that the students' ability estimates were similar due to their having taken a placement test, whose results were used to form proficiency-based classes. Both univariate and multivariate decision studies indicated that the CRT forms were moderately to highly dependable. 
The content validity of the CRT forms was supported because the test content was strongly linked to what was taught in class. The construct validity was supported mainly because a fair amount of score gain was observed. This study elucidates how the statistical analyses used in this study can be applied to CRT development, and how CRT development can be carried out as part of curriculum development. / Educational Administration Educational Tests and Measurements Educational Evaluation Educational Administration Achievement Test Criterion-referenced Test Diagnostic Test Generalizability Theory Many Faceted Rasch Model Test Development
36	Relativa betyg : några empiriska studier och en teoretisk genomgång i ett historiskt perspektiv / Group-referenced marks : some empirical studies and a theoretical survey from a historical point of view Andersson, Håkan January 1991 (has links) Denna avhandling, som i huvudsak är resultatet av ett projekt finansierat av Skolöverstyrelsen, består av fem delstudier (I-V) utförda under åren 1977-1991 samt en sammanfattande analysdel (VI). Avhandlingens syfte är att studera det unika svenska relativa betygssystemet, som infördes på försök i folkskolan i början av 1940-talet för att senare permanentas och även införas i enhets- och grundskolan samt slutligen även i gymnasiet. Det relativa betygssystemet beskrivs enligt följande indelning: framväxt och avveckling (V), funktioner (I), effekter och sidoeffekter (III) samt användning och behov (II och IV). I de empiriska studierna har elever, lärare, arbetsgivare och arbetstagare fått ge sina synpunkter på de relativa betygen. I delstudie V analyseras utvecklingen av det relativa betygssystemet med hjälp av offentliga utredningar, remissvar från elev-, lärar-, arbetsgivar- och arbetstagarorganisationer samt också via riksdagstryck. De relativa betygen beskrivs som starkt knutna till urvalet till högre studier och som ett försök att tillskapa ett urvalsinstrument med möjlighet till jämförbarhet och större rättvisa. Betygen har visat sig spela liten roll vid urvalet till olika arbeten. Betygen fyller en "körkortsfunktion" genom att ange inriktning och linje. I övrigt speglar betygen i huvudsak förmåga att tillgodogöra sig teoretiskt stoff. Vid urvalet till olika arbeten beskrivs en utveckling från formella till informella meriter i form av vissa personlighetsegenskaper, t ex samarbetsförmåga, flexibilitet och utåtriktad läggning. 
Arbetslivserfarenhet, referenser och personlighetsegenskaper betyder mer vid anställningar än skolbetygen. I avhandlingen anges såväl mättekniska, informationstekniska som socialpsykologiska förklaringar till att det relativa betygssystemet är på väg att avvecklas. Betänkligheter riktas mot ett eventuellt införande av målrelaterade betyg p g a styrningsriskerna för elever och lärare, samt också mot riskerna för en ökad kontroll och ett ökat beroende av avnämarna. I avhandlingen noteras skillnader mellan praktiska och teoretiska linjer när det gäller synen på betyg. Som tänkbara förklaringar anges användningen och behovet av betyg liksom också närheten och kopplingen till näringslivet. Betygens officiella funktioner som informations-, motivations- och urvalsinstrument har gradvis minskat. Tidigare har frågor om styrning och kontroll kommit i bakgrunden i förhållande till de officiella funktionerna. Om målrelaterade betyg införs och om betygens roll som urvalsinstrument försvinner, torde betygens styrnings- och kontrollfunktioner behöva diskuteras och motiveringar till att över huvud taget ha kvar betyg i skolan lyftas fram. / This dissertation, which is mainly the result of a project financed by the National Board of Education, consists of five substudies (I-V) carried out between 1977 and 1991, and a summary analysis (VI). The aim of the dissertation is to study group-referenced marks, which are unique to Sweden. In the 1940s they were introduced on a trial basis into elementary school, where they were later made permanent, and they were also introduced into comprehensive school, nine-year compulsory school and finally also into upper secondary school. The description of group-referenced marks is divided into the following substudies: development and phase-out (V), functions (I), effects and side-effects (III), and use and requirements (II and IV). 
In the empirical studies, students, teachers, employers and employees have been asked to give their opinions of group-referenced marks. Substudy V analyses the development of group-referenced marks through official reports, through reactions to these reports from student, teacher, employers' and employees' organizations, and also through official parliamentary publications. Group-referenced marks are described as closely connected with selection for higher education and as an attempt to construct an instrument of selection offering possibilities of comparability and greater justice. Marks have proved to be of little consequence for employment. They function as a "driving licence" by indicating direction and course programme. Marks reflect, above all, the ability to assimilate theoretical subject-matter. Selection for various jobs shows a development from formal to informal merits in the form of certain personal qualities, e.g. the ability to co-operate, flexibility and extrovert behaviour. Work experience, references and personal qualities are more important than marks for employment. The dissertation gives measurement-related, information-related and socio-psychological explanations for the phasing out of group-referenced marks. The dissertation also expresses apprehensions about the potential introduction of criterion-referenced marks owing to the steering effects for pupils and teachers, as well as about the risks of increasing control and dependence on potential employers. Differences between practical and theoretical course programmes regarding attitudes to marks can also be observed. These differences can perhaps be explained by the use of and need for marks as well as by the nearness and connection to industry and commerce. The importance of the official functions of marks as information, motivation and selection instruments has gradually been reduced. Problems of steering and control used to be subordinate to the official functions. 
If criterion-referenced marks are introduced and if marks lose their selection function, the steering and control functions of marks should be discussed and the motives for preserving marks in school should be presented. / digitalisering@umu Group-referenced marks norm-referenced marks criterion-referenced marks marks grades grading selection employment qualifications merits Relativa betyg grupprelaterade betyg normrelaterade betyg målrelaterade betyg kriterierelaterade betyg betyg urval anställningsmeriter meriter
37	Predicting Performance on Criterion-Referenced Reading Tests with Benchmark Assessments Dyson, Kaitlyn Nicole 17 July 2008 (has links) (PDF) The current research study investigates the predictive value of two frequently-used benchmark reading assessments: Developmental Reading Assessment (DRA) and the Dynamic Indicators of Basic Early Literacy Skills (DIBELS). With an increasing emphasis on high-stakes testing to measure reading proficiency, benchmark assessments may assist in predicting end-of-year performance on high-stakes testing. Utah's high-stakes measurement of end-of-year reading achievement is the English Language Arts Criterion-Referenced Test (ELA-CRT). A Utah urban school district provided data for students who completed the DRA, DIBELS, and ELA-CRT in the 2005-2006 school year. The primary purpose of the study was to determine the accuracy with which the Fall administrations of the DRA and the DIBELS predicted performance on the ELA-CRT. Supplementary analysis also included cross-sectional data for the DIBELS. Results indicated that both Fall administrations of the DRA and the DIBELS were statistically significant in predicting performance on the ELA-CRT. Students who were high risk on the benchmark assessments were less likely to score proficient on the ELA-CRT. Also, demographic factors did not appear to affect individual performance on the ELA-CRT. Important implications include the utility of data collected from benchmark assessments to address immediate interventions for students at risk of failing end-of-year, high-stakes testing. criterion referenced testing educational assessment benchmark testing formative assessment high stakes testing NCLB Reading First Early Reading First progress monitoring DIBELS DRA Utah ELA-CRT curriculum-based assessment Counseling Psychology Special Education and Teaching
38	Educators' experience of the implementation of outcomes-based education in grade nine Ghanchi Badasie, Razia Banoo 30 November 2005 (has links) This research focuses on educators' experience of implementing outcomes-based education in grade nine in secondary schools in South Africa. Two schools were chosen as settings for the qualitative research project. Three focus groups with 20 educators, two focus groups with 14 managers and seven personal interviews were conducted. Twelve classrooms were also observed where grade nine learners were being taught. Findings indicated that some educators found the experience of implementing OBE positive in that it improved their repertoire of facilitating and assessing skills. The reasons for citing OBE as a negative experience were given as the following: an increased workload, poor training and lack of follow-up by the Department and the school management team's degree of involvement. Recommendations were made on how to ease the burden on educators implementing OBE in their classrooms and to empower school managers to manage the implementation of OBE within their respective areas of responsibility. / Educational Studies / M. Ed. (Education Management) General Education and Training [GET] Outcomes Outcomes-based assessment [OBA] Learning areas Criterion-referenced Portfolios School management District management Grade nine Outcomes-based education [OBE] 370.110968 Education, Secondary -- South Africa Educational change -- South Africa