1

Beta observed score and true score equating methods

Wang, Shichao 01 August 2019 (has links)
Equating is a statistical process used to adjust scores on test forms so that scores on the forms can be used interchangeably. This dissertation offered an intensive investigation of beta true and observed score methods by comparing them to existing traditional and IRT equating methods under multiple designs and various conditions using real data, pseudo-test data, and simulated data. Weighted and conditional bias, standard error of equating, and root mean squared error were used to evaluate the accuracy of the equating results obtained from the pseudo-data and simulated-data analyses. Single group equipercentile equating based on large sample sizes was used as the criterion equating. Overall, results showed that of the methods examined, the IRT methods performed best, followed by the chained equipercentile methods. Results from the beta methods showed different trends from the traditional and IRT methods under both the random groups and common-item nonequivalent groups designs. Beta true score methods were less sensitive to group differences than traditional methods. The length of the common-item set played an important role in the stability of the beta true score method results.
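The evaluation indices named above (conditional bias, standard error of equating, and root mean squared error against a criterion equating) can be sketched as follows. This is a minimal illustration, not code from the dissertation; the function name and the replications-by-score-points array layout are assumptions.

```python
import numpy as np

def equating_accuracy(estimated, criterion, weights=None):
    """Summarize equating accuracy against a criterion equating.

    estimated: (R, K) array of equated scores at K raw-score points
               across R replications (e.g., resampled pseudo-forms).
    criterion: (K,) criterion equated scores (e.g., large-sample
               single group equipercentile equating).
    weights:   optional (K,) raw-score frequencies for weighted summaries.
    """
    estimated = np.asarray(estimated, dtype=float)
    criterion = np.asarray(criterion, dtype=float)
    mean_eq = estimated.mean(axis=0)
    bias = mean_eq - criterion                # conditional bias at each score
    see = estimated.std(axis=0, ddof=0)       # conditional standard error of equating
    rmse = np.sqrt(bias**2 + see**2)          # conditional RMSE combines both
    if weights is None:
        weights = np.ones_like(criterion)
    w = np.asarray(weights, dtype=float) / np.sum(weights)
    return {
        "weighted_bias": float(np.sum(w * bias)),
        "weighted_see": float(np.sqrt(np.sum(w * see**2))),
        "weighted_rmse": float(np.sqrt(np.sum(w * rmse**2))),
        "conditional_bias": bias,
    }
```

The decomposition RMSE² = bias² + SEE² at each raw-score point is the standard way these three indices relate.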
2

Sensitivity to Growth over Time in Pre-Post Norm-Referenced Tests

Peters, Wole 02 October 2013 (has links)
There is very little in the literature about the sensitivity of norm-referenced tests to the growth of diverse groups of test takers, particularly low-achieving test takers who score at or below the 15th percentile of their peers. To bridge this knowledge gap, this study examined the sensitivity to growth of norm-referenced achievement tests. The purpose of the study was to determine the sensitivity of norm-referenced tests to the growth of low-achieving students in prekindergarten through 12th grade. Four analyses were performed to test eight identified norm-referenced tests for their sensitivity to the growth of students who perform at approximately the 15th percentile or below of their grade peers. Results of the analyses suggested that two of the eight tests are adequate for use with low-achieving students within a norm period. The other six tests lacked precision and appeared unsuitable for measuring the progress of low-achieving students.
3

Simple structure MIRT equating for multidimensional tests

Kim, Stella Yun 01 May 2018 (has links)
Equating is a statistical process used to accomplish score comparability so that scores from different test forms can be used interchangeably. One of the most widely used equating procedures is unidimensional item response theory (UIRT) equating, which requires a set of assumptions about the data structure. In particular, the essence of UIRT rests on the unidimensionality assumption, which requires that a test measure only a single ability. However, this assumption is not likely to be fulfilled for many real data, such as mixed-format tests or tests composed of several content subdomains: failure to satisfy the assumption threatens the accuracy of the estimated equating relationships. The main purpose of this dissertation was to contribute to the literature on multidimensional item response theory (MIRT) equating by developing a theoretical and conceptual framework for true-score equating using a simple-structure MIRT model (SS-MIRT). SS-MIRT has several advantages over more complex MIRT models, such as improved estimation efficiency and straightforward interpretability. In this dissertation, the performance of the SS-MIRT true-score equating procedure (SMT) was examined and evaluated through four studies using different data types: (1) real data, (2) simulated data, (3) pseudo-form data, and (4) intact single-form data with identity equating. Besides SMT, four competitors were included in the analyses in order to assess the relative benefits of SMT over the other procedures: (a) equipercentile equating with presmoothing, (b) UIRT true-score equating, (c) UIRT observed-score equating, and (d) SS-MIRT observed-score equating. In general, the proposed SMT procedure behaved similarly to the existing procedures, and it showed more accurate equating results than traditional UIRT equating. Better performance of SMT over UIRT true-score equating was consistently observed across the three studies that employed different criterion relationships with different datasets, which strongly supports the benefit of a multidimensional approach to equating with multidimensional data.
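As a point of reference for the UIRT true-score equating baseline mentioned above, here is a minimal sketch under an assumed 2PL model (the function names and bisection solver are illustrative, not the dissertation's implementation): find the ability theta at which the Form X test characteristic curve equals the given true score, then evaluate the Form Y curve at that theta.

```python
import numpy as np

def tcc(theta, a, b):
    """2PL test characteristic curve: expected number-correct score at theta.
    a, b are arrays of item discriminations and difficulties."""
    return float(np.sum(1.0 / (1.0 + np.exp(-a * (theta - b)))))

def uirt_true_score_equate(score_x, ax, bx, ay, by):
    """Map a Form X true score to its Form Y equivalent:
    solve TCC_X(theta) = score_x by bisection (the TCC is monotone
    increasing in theta), then return TCC_Y(theta)."""
    lo, hi = -8.0, 8.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if tcc(mid, ax, bx) < score_x:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    return tcc(theta, ay, by)
```

An SS-MIRT true-score analogue replaces the single theta with one dimension per content subdomain, solving on each dimension's subtest characteristic curve.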
4

A framework for psychometric analysis of student performance across time: An illustration with National Educational Longitudinal Study data

Hart, Raymond C., Jr 04 May 2007 (has links)
No description available.
5

A study of the scale scores of the Basic Competence Test for Junior High School Students (國民中學基本學力測驗量尺分數之研究)

Tsai, Ya-Ying (蔡雅瑩) Unknown Date (has links)
The Basic Competence Test for junior high school students has been administered since 2001 and is now in its fifth year. The theoretical basis of its scale scores is Lord's strong true score model: an appropriate model is fitted to examinees' number-correct score data, and the fitted model is used to compute the current scale scores. However, little domestic research has examined the scale scores of this test, so the first part of this study describes the theory of the scale score transformation and details the transformation process and methods. The test development team currently uses two models, an upper-limit model and a lower-limit model; both are estimated from examinees' number-correct data, the model with the lower mean squared error is adopted, and the data transformation then proceeds. However, both models are suited only to unimodal distributions of number-correct scores, while the English test has shown a bimodal distribution in every administration. This study therefore proposes a bimodal mixture model, which describes the distribution of number-correct scores more faithfully than the existing upper-limit and lower-limit models and attains a lower mean squared error. This study also offers a new perspective on the scale score transformation: the data show that, when specifying the scale score mean, a weighted mean works better than the current practice of fixing every subject's scale mean at 30, and yields a smaller scale score standard deviation. Finally, the study synthesizes these methods and conclusions into a new procedure for computing scale scores, offered as a reference for the test development team and for researchers.
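The strong true score approach and the proposed bimodal extension can be illustrated with a beta-binomial model for number-correct scores. This is a hedged sketch: the beta-binomial parameterization and the two-component mixture form are common choices in this literature, not necessarily the study's exact upper-limit, lower-limit, or mixture models.

```python
import math

def beta_binomial_pmf(x, n, alpha, beta):
    """P(X = x) for number-correct score X out of n items when the
    true proportion-correct follows a Beta(alpha, beta) distribution."""
    return (math.comb(n, x)
            * math.gamma(alpha + x) * math.gamma(beta + n - x)
            / math.gamma(alpha + beta + n)
            * math.gamma(alpha + beta)
            / (math.gamma(alpha) * math.gamma(beta)))

def mixture_pmf(x, n, w, a1, b1, a2, b2):
    """Two-component beta-binomial mixture: one component per mode,
    for bimodal number-correct distributions such as the English test."""
    return (w * beta_binomial_pmf(x, n, a1, b1)
            + (1.0 - w) * beta_binomial_pmf(x, n, a2, b2))
```

Fitted frequencies from either model can be compared to the observed score distribution, with the lower mean squared error deciding which model drives the scale score transformation.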
6

Evaluating the error of measurement due to categorical scaling with a measurement invariance approach to confirmatory factor analysis

Olson, Brent 05 1900 (has links)
It has previously been determined that using 3 or 4 points on a categorized response scale fails to produce a continuous distribution of scores. However, there is as yet no evidence establishing the number of scale points that yields an approximately or sufficiently continuous distribution. This study provides evidence on the level of categorization at which discrete scales become directly comparable to continuous scales in their measurement properties. To do this, we first introduced a novel procedure for simulating discretely scaled data that was both informed and validated by the principles of the classical true score model. Second, we employed a measurement invariance (MI) approach to confirmatory factor analysis (CFA) to directly compare the measurement quality of continuously scaled factor models with that of discretely scaled models. The simulated design conditions varied with respect to item-specific variance (low, moderate, high), random error variance (none, moderate, high), and discrete scale categorization (3 to 101 scale points). A population analogue approach was taken with respect to sample size (N = 10,000). We concluded that there are conditions under which response scales with 11 to 15 points can reproduce the measurement properties of a continuous scale, and that using more than 15 points is, for the most part, unnecessary. Scales with 3 to 10 points introduce a significant level of measurement error, and caution should be taken when employing them. The implications of this research and future directions are discussed.
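The simulation idea described above, generating continuous classical-true-score data X = T + E and then discretizing it onto a k-point scale, can be sketched as follows. The fidelity index here is a simple continuous-vs-discretized correlation for illustration, not the study's MI/CFA criterion, and the normal distributions and equal-width cut points are assumptions.

```python
import numpy as np

def simulate_discretized(n=10000, k=11, error_sd=0.5, seed=0):
    """Simulate observed scores X = T + E under the classical true score
    model, discretize X onto a k-point scale with equal-width category
    boundaries, and return the correlation between the continuous and
    discretized scores as a rough fidelity index."""
    rng = np.random.default_rng(seed)
    true = rng.normal(0.0, 1.0, n)            # true scores T
    x = true + rng.normal(0.0, error_sd, n)   # observed scores X = T + E
    edges = np.linspace(x.min(), x.max(), k + 1)
    cat = np.digitize(x, edges[1:-1])         # category indices 0 .. k-1
    return float(np.corrcoef(x, cat)[0, 1])
```

Coarser scales (small k) lose more of the continuous score's variance, which is the degradation the study quantified across its 3-to-101-point conditions.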
