11 |
THE EFFECT OF ITEM DIFFICULTY DISTRIBUTION SHAPE ON THE PRECISION OF MEASUREMENT AT A PASSING SCORE (RASCH MODEL, TARGETING, ITEM RESPONSE THEORY, CLASSIFICATION ERRORS, MASTERY TESTING) / Unknown Date
This study sought to determine the loss in measurement precision at a passing ability and at other abilities of interest resulting from the use of nonoptimal, but reasonable and relevant, item difficulty distribution shapes. / Five alternatives to the optimal peaked distribution shape were compared on the basis of their precision relative to the optimal peaked distribution. The distribution shapes were represented by ten tests built in two ways. One set of five tests was constructed using item difficulties from an existing minimum competency mastery test. These are referred to as real tests. The other five tests were generated from a simulated, infinitely large item bank. / The relative precision curves produced by the different alternative item combinations were compared to determine which distribution shape generated the greatest precision in the region of the passing ability. As an empirical approach to the question, actual person-item responses were used to estimate abilities and mastery level on each of the five real tests. Mastery classifications by the original long test were used to identify the misclassifications made by each of the real tests. / The distributions centered on the passing score yielded similar error rates, but they differed in the pattern of classification errors made. The numbers of false passes and false fails were related to whether the test's area of maximum precision was above or below the passing ability. This implies that the types of classification errors made, as well as their number, may to some extent be controlled by the builder of a test. / Source: Dissertation Abstracts International, Volume: 47-01, Section: A, page: 0159. / Thesis (Ph.D.)--The Florida State University, 1985.
|
12 |
THE EFFECTS OF VARIABLE ENTRY ON BIAS AND INFORMATION OF THE BAYESIAN ADAPTIVE TESTING PROCEDURE / Unknown Date
This study investigated the effects of a fixed and variable entry procedure on bias and information of a Bayesian adaptive test. It was found that neither the fixed nor the variable entry procedure produced biased ability estimates on the average. Both procedures did produce, however, biased ability estimates at the extremes of the ability distribution. Both procedures produced peaked and asymmetric information curves, rather than ideal flat curves. Relative efficiency curves indicated that at no point along the ability continuum was one procedure more efficient than the other. The two procedures chose different item subsets for administration. In almost half the cases, the variable entry procedure required more items to reach termination. / Source: Dissertation Abstracts International, Volume: 47-08, Section: A, page: 3013. / Thesis (Ph.D.)--The Florida State University, 1986.
|
13 |
THE EFFECT OF SAMPLE SIZE ON ERROR PRODUCED BY TUCKER AND RASCH EQUATING METHODS UNDER COMMON ITEMS NONRANDOM GROUPS DESIGN / Unknown Date
The purpose of the present study was twofold: (1) to determine the relationship between sample size and equating error produced by the Tucker and Rasch methods; and, (2) to compare the efficiency of the two methods when utilizing small sample sizes. The aim was to examine equating error at selected points on the raw score scale corresponding to the 20th, 40th, 60th, and 80th percentiles, as well as the average error over all examinees and all score points, using five sample sizes of 25, 50, 75, 100, and 500. / The results of the study indicated that the relationship between equating error and sample size was approximately linear and negative. The Rasch method generally produced slightly more error and bias than the Tucker method when using small sample sizes. For the data used in the study, the expected value of equating error for the Rasch method is reduced with higher selected scores, whereas for the Tucker method, it increases as the selected scores deviate from the average score. The minimum number of examinees for equating with the two methods as well as further investigations were suggested. / Source: Dissertation Abstracts International, Volume: 47-12, Section: A, page: 4369. / Thesis (Ph.D.)--The Florida State University, 1986.
|
14 |
THE DEVELOPMENT OF AN INSTRUMENT TO MEASURE TRADITIONAL AND CONTEMPORARY WORK VALUES AND THEIR RELATIONSHIP TO JOB SATISFACTION IN UNIVERSITY FACULTY AND ADMINISTRATORS / Unknown Date
The primary purpose of this investigation was the development and testing of a measurement instrument for the assessment of traditional and contemporary work-related values in university faculty and administrators. Additionally, an analysis of the Traditional and Contemporary subscales from the Worrell Contemporary Work Values Inventory (WCWVI) and the Intrinsic and Extrinsic subscales from the Minnesota Satisfaction Questionnaire (MSQ) was completed from both a correlational and causal perspective. Subjects for the study were faculty and administrators (N = 243) representing five universities in Eastern Canada. Each subject completed a socio-demographic Data Sheet, the Minnesota Satisfaction Questionnaire, and the recently designed Worrell Contemporary Work Values Inventory. The dependent variables in the study were the subscales from the WCWVI and the MSQ. The independent variables were categorical in nature and consisted of administrative responsibilities (two levels) and number of years of work experience (four levels). / To determine the reliability of the items on the WCWVI, factor analysis, item analysis and test-retest correlations were carried out. Coefficient alphas on the final version of the WCWVI were .79 (Total), .73 (Traditional), and .77 (Contemporary). Test-retest reliability coefficients over three months were .76 (Total), .73 (Traditional), and .72 (Contemporary). The construct and content validities of the WCWVI Traditional and Contemporary subscales were questionable because of the significant positive correlation (.39) between the subscales. The investigator had designed the WCWVI subscales in an inverse pattern. High average means on one scale and low average means on the other scale would have helped to support the construct validity of the WCWVI. However, both scales reported high grand means, Traditional (29.70) and Contemporary (37.70).
/ Hypotheses relative to the investigation were analyzed using Pearson Product Moment correlations and two-factor ANOVA. A significant positive correlation (.62) was found between perceived Traditional work-related values and perceived Intrinsic job satisfaction. A 2 x 4 ANOVA revealed that administrators perceived higher average Traditional work-related values than non-administrators with 11-15 years of work experience. Both administrators and non-administrators reported perceived higher average Intrinsic job satisfaction with more years of work experience. Conclusions based upon the research findings of this study were that the WCWVI did not measure Traditional and Contemporary work-related values as predicted. As well, there was no consistent relationship between the WCWVI subscales and MSQ subscales. / Source: Dissertation Abstracts International, Volume: 48-02, Section: A, page: 0374. / Thesis (Ph.D.)--The Florida State University, 1987.
|
15 |
An investigation of the impact of category collapsing on convergence and the information function in polychotomous item response theory / Unknown Date
This study investigated how the collapsing of categories impacted the information function and the convergence of the iterative item calibration in polychotomous Item Response Theory. The scores of 1000 examinees on twenty-four twelve-item performance assessment test batteries were simulated. The experimental factors were direction of category collapsing (upward and downward), three levels of item difficulty (-1.0, 0.0, and 1.0), three levels of item discrimination (0.4, 0.9, and 1.6), and two levels of inter-rater reliability (.90 and .95). PARSCALE was used to calibrate the tests and provide the information data. Factorial repeated measures analysis was used on the three experimental designs for maximum item information, total item information, and EM-cycles. / The results demonstrated that (1) combining raters' evaluations reduced the effective item discrimination of an item and increased the range of step difficulties, but left item difficulty essentially unchanged; (2) overall, the collapsing of categories increased item discrimination, reduced the number of EM-cycles and total information, and left item difficulty and maximum information essentially unchanged; and (3) within the limited range of inter-rater reliability studied, the "high" level of inter-rater reliability did not provide more information than the "low" level. / All three designs were interactive, with significant two-way interactions in the information designs and a three-way interaction in the EM-cycles design. Within the experimental criteria of statistical significance and practical importance, it was demonstrated that the "high" levels of the experimental factors contributed to the few significant reductions in information and the number of EM-cycles that were detected. The direction of collapsing only had a significant effect in the EM-cycles design.
/ Because performance assessment batteries typically are composed of items which span the range of the item characteristics used as experimental factors in this study, it was concluded that the overall effects are more germane to practical applications than the few significant effects. Based on these results, practitioners who find it useful to collapse categories under the conditions considered here may do so without any expected adverse effects on maximum and total information. / Source: Dissertation Abstracts International, Volume: 55-03, Section: A, page: 0541. / Director: Richard Tate. / Thesis (Ph.D.)--The Florida State University, 1994.
|
16 |
THE EFFECTS OF TEST SPEEDEDNESS AND CONTEXT ON RASCH MODEL PARAMETERS / Unknown Date
This study investigated the effects of test speededness and context of items on Rasch model parameters. Two questions were formulated for this investigation: "To what extent are the Rasch model parameters affected by partially speeded test administration?" and "To what extent are the Rasch model item difficulty parameters affected by changes in item position and experimental calibration situation?" / The investigations were conducted using data from a mandatory statewide basic skills examination of eighth graders. The effects of test speededness were examined by comparing the Rasch model parameters of samples which were specifically constructed to exhibit various levels of test speededness. Context effects were investigated by longitudinal comparisons of item calibration values obtained in different item positions and experimental versus regular situations. / This research has revealed that the item parameters of a partially speeded test were significantly changed at different levels of speededness. No demonstration of the effects on ability estimates was found. The analyses of context effects showed that item difficulty estimates were affected by changes in item position and calibration situation. / Source: Dissertation Abstracts International, Volume: 43-12, Section: A, page: 3885. / Thesis (Ph.D.)--The Florida State University, 1982.
|
17 |
THE EFFECT OF IMMEDIATE ITEM FEEDBACK ON THE RELIABILITY AND VALIDITY OF VERBAL ABILITY TEST SCORES / Unknown Date
Source: Dissertation Abstracts International, Volume: 40-10, Section: A, page: 5410. / Thesis (Ph.D.)--The Florida State University, 1979.
|
18 |
FACTORS AFFECTING THE COMMUNITY PLACEMENT OF MENTALLY RETARDED PEOPLE: A TEST OF THE DEVELOPMENTAL MODEL / Unknown Date
The purpose of this study was to examine the effect of the Developmental and Deviance Models on services to mentally retarded individuals by addressing the following questions: Do client characteristics such as level of intelligence or skill levels affect the placement of mentally retarded individuals in group homes in the community? How completely are their identified service needs met in community settings? What types of variables affect the extent to which services, documented as needed, are provided? What types of variables affect the extent to which provided services are congruent with the goals in the individual habilitation plans (IHPs)? How often, once placed, do group home residents move to residential settings of less or greater restrictiveness? / A total of 477 group home residents were randomly selected from group homes stratified by size (the number of residents: 4-6, 7-10, 11-20, and more than 20 residents). For each client, an instrument was completed which dealt with client characteristics, the adequacy of the community service delivery system and the rate and type of movement experienced by these clients once placed in a group home. / The results indicate that 41% of the sample are severely and profoundly retarded and that these individuals possess minimal, if any, ambulation or communication abilities or self-care skills (bathing, toileting, self-feeding, drinking and dressing). / More services were provided to these individuals than were documented to meet treatment goals of the IHP. Eighty-seven percent of the 1,512 services documented as needed to meet treatment goals in the IHP were provided. A total of 1,945 services were actually received by group home residents, many not documented to meet IHP treatment goals. The service district and the group home size in which the individual lived were the only variables to significantly explain the variance in the amount of IHP-related services which were provided.
The service district, level of intelligence and age were significant in explaining the percentage of received services which were congruent with IHP treatment goals. / Client movement was minimal; moves from group homes to institutions decreased as the admissions law made institutional avoidance possible. Moves to settings less restrictive than a group home increased as funding for community support services increased. Too few moves were made to settings more restrictive than group homes to make further analysis possible. / Although this study was only a preliminary examination of the effect of the developmental and deviance models on services affecting the mentally retarded, the developmental model is enjoying at least moderate influence. Services, based on identified need, are provided to an almost complete extent; only administrative differences in service districts and home sizes affect the provision of services. Many unskilled individuals, formerly confined to institutions, are living in the community. In all, this study reflects, to a surprising degree, the influence of a progressive perspective on the services of mentally retarded residents of group homes. / Further study is needed to examine the habilitation planning process to determine whether the treatment goals are adequately developed for mentally retarded individuals. Additional investigations should also be undertaken to determine what the service district variable is measuring; it is likely that this variable could be measuring resources, resource utilization, and the differing management structures in the field. / Source: Dissertation Abstracts International, Volume: 41-10, Section: A, page: 4372. / Thesis (Ph.D.)--The Florida State University, 1980.
|
19 |
THE DIAGNOSTIC STUDY AS A TOOL FOR IMPROVING SUPERVISORY PRACTICES BY THE INTERNAL CONSULTANT / Unknown Date
In this study the technique of using the diagnostic study as a tool for improving supervisory effectiveness was investigated. A specially designed diagnostic study was administered to a group of industrial supervisors to determine what, if any, effects the knowledge of perceived supervisory behavior by the subordinates had in changing the behavior of the supervisor as measured by a second administration of the same diagnostic study. / A group of 672 industrial supervisors with different levels of responsibilities constituted the population used in the study. This group was first asked to rate the behavior of their supervisors, who were then given the results of the subordinates' ratings. With the exception of an interpretive session explaining the meanings of the ratings, no formal treatment was made to bring about any change in the behavior of the supervisors. Three months later the diagnostic study was repeated to determine if any change in behavior, as perceived by the subordinates, had taken place on the part of the supervisors. / Using a test-retest design, change in behavior of the supervisors as perceived by subordinates was measured. Changes in behavior are attributed to the knowledge gained by the supervisors as a result of the initial diagnostic study. Within the limitations of the study the following conclusions seem justified. (1) The diagnostic study provided useful criteria as base measures for evaluating supervisory behavior. (2) The diagnostic study provided a measurement of behavior in relation to these criteria. (3) The record of variance of actual performance from standards identified areas of supervisory behavior needing change. (4) Once areas of supervisory behavior that varied from standards as established by superintendents were identified by the supervisor, behavior patterns changed. / Source: Dissertation Abstracts International, Volume: 41-10, Section: A, page: 4372. / Thesis (Educat.D.)--The Florida State University, 1980.
|
20 |
AN APPLICATION OF THE RASCH MODEL TO INVESTIGATE ITEM BIAS IN THE TESTS FROM THE JOINT HIGHER EDUCATION ENTRANCE EXAMINATION IN THAILAND / Unknown Date
The primary purpose of this study was to apply the Rasch measurement model to investigate sex-item bias in Chemistry, Biology and English tests from the Joint Higher Education Entrance Examination (JHEEE) in Thailand. This research proposed to examine three questions. (1) Does each test measure the same things for both males and females? (2) Are test items within each test potentially biased against either males or females? (3) If potential sex-biased items exist, what are the content characteristics of the items identified by the Rasch model as being potentially biased? / To answer these questions, male and female students were randomly selected from the population of examinees who took the tests in the science area in the 1979 JHEEE. The test data obtained from the selected samples were analyzed by the BICAL program. / The definition of item bias in this study was based on item difficulty and item fit statistics derived from calibrations done by the Rasch model. The criteria for identifying discrepant items were two standard deviations of the differences of item difficulties and of fit statistics which were computed from the random halves of male and female samples. Each criterion was used to establish a band on the 45-degree line of the scattergrams for difficulty values and fit mean squares of the total male and female samples. Items which fell outside the band were declared as potentially biased items and their content characteristics were examined. / The major findings from this study indicated that: (1) Each test measured approximately the same things for both males and females. The empirical evidence was supported by the patterns of the scattergrams and high correlation coefficients of the item difficulties. (2) There were some items within each test identified as potentially biased against males or females. The English test produced the largest number of potentially biased items, followed by the Biology and Chemistry tests.
(3) The content analyses of potentially biased items were not absolutely conclusive. However, the results indicated close agreement between the content characteristics of those items identified as potentially biased by the item difficulty criterion and the judges' ratings relative to sex bias. / The researcher suggested that the potentially biased items which were identified by the item difficulty criterion and also confirmed by the judges' evaluation should be removed or rewritten so as to eliminate the bias. Also, the researcher recommended the use of the item difficulty criterion to explore the potentially biased items in other entrance tests using other samples having different characteristics. / Source: Dissertation Abstracts International, Volume: 42-03, Section: A, page: 1112. / Thesis (Ph.D.)--The Florida State University, 1981.
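The band criterion described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's BICAL-based procedure: it assumes item difficulties are already calibrated on a common scale, that the band half-width is two sample standard deviations of the random-halves differences, and that distance from the 45-degree line is measured as the plain difference between male and female difficulties. The function names and all data values are hypothetical.

```python
import statistics

def band_half_width(half_a, half_b, n_sd=2.0):
    """Half-width of the band: n_sd standard deviations of the item-difficulty
    differences between two random halves of a sample (assumed criterion)."""
    diffs = [a - b for a, b in zip(half_a, half_b)]
    return n_sd * statistics.stdev(diffs)

def flag_items(male, female, half_width):
    """Indices of items whose male/female difficulty difference falls
    outside the band around the 45-degree (identity) line."""
    return [
        i for i, (m, f) in enumerate(zip(male, female))
        if abs(m - f) > half_width
    ]

# Hypothetical difficulties (logits) from two random halves of one sample;
# their differences reflect sampling noise only.
half_a = [-1.0, -0.5, 0.0, 0.5, 1.0]
half_b = [-0.9, -0.6, 0.1, 0.4, 1.0]

# Hypothetical difficulties from the total male and total female samples.
male = [-1.0, -0.5, 0.0, 0.5, 1.0]
female = [-0.9, -0.55, 0.5, 0.45, 0.5]

half_width = band_half_width(half_a, half_b)
flagged = flag_items(male, female, half_width)
print("band half-width:", round(half_width, 3))
print("potentially biased items (0-based index):", flagged)
```

Using the random-halves differences to set the band ties the threshold to the amount of variation expected from sampling alone, so an item is flagged only when the male/female discrepancy exceeds that noise level. The same two functions could be applied to fit mean squares instead of difficulties.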
|