101 |
Resource allocation and student achievement: A microlevel impact study of differential resource inputs on student achievement outcomes. Hurley, Noel P. January 1995
This study examined the relationships between resource allocation and student achievement using a modified version of a conceptual model designed by Bulcock (1989) within a general model proposed by Guthrie (1988). Five research questions were developed from a review of literature to investigate the relationship between microlevel student input variables and student output variables--both cognitive and affective. The mediating effects of the student perceptions of the quality of school life on student achievement outcomes were also examined. Multiple regression analyses were utilized and data were analyzed at both the individual and school levels. Models were used to investigate the indirect effects of the quality of school life on student achievement outcomes. Substantively meaningful relationships were identified between linguistic resources, language usage and reading outcomes; socioeconomic level, gender, linguistic resources, language usage, and mathematics achievement; gender, student attitudes, and student well-being. All grade eight Newfoundland students (10,146) were the subjects of the study. Participants in the study completed the Canadian Test of Basic Skills (CTBS) and the Bulcock Attitudinal Inventory (BAI). Females scored higher than males on every test of the CTBS and also had more favourable attitudes towards school as measured using the BAI. Urban students outperformed rural students by the equivalent of nearly one year on the CTBS scores. A variable was constructed to test Bernstein's (1961) theory of language discontinuity. Bernstein contended that the further an individual's language code departed from the standard language code in use in that society, the greater the difficulty that person would have in learning. The language code variable was constructed using the language usage score from the CTBS to create a continuous variable. 
This language code variable proved to be highly explanatory in that it explained a large percentage of the variance in reading achievement outcomes and in mathematics achievement outcomes. The measure of students' perceptions of their schooling experiences explained a large percentage of the variance in student well-being. Two other noteworthy findings in the present study arose from relationships identified between mathematics achievement and independent variables. A strong relationship was identified between mathematics achievement and socioeconomic level. In general, the higher a student's socioeconomic level, the higher the outcome measures in mathematics achievement. Indirect effects analyses produced a significant relationship between gender and mathematics achievement that favoured girls. The educational production function constructed in the present study proved to be an accurate model. The present study contributed to research in several ways. It is one of the first studies to employ Quality of School Life indicators, as developed in the BAI, in an educational production function model. A second contribution was the inclusion of microlevel student linguistic resources as predictors of cognitive achievement outcomes. The third contribution of the present study was the high percentage of variance in cognitive achievement outcomes explained by the modified Bulcock model.
|
102 |
The effect of nonnormal ability distributions on IRT parameter estimation using full-information and limited-information methods. Boulet, John R. January 1996
The relationship between nonlinear factor analysis (FA) models and Item Response Theory (IRT) models has been well established. Furthermore, in terms of modern measurement theory, the use of nonlinear FA models to describe item-trait relationships is currently becoming more popular and may offer some statistical and/or computational advantages in the analysis of item response data. Both limited-information (LI) and full-information (FI) nonlinear FA models can be used to derive the familiar IRT parameter estimates. In general, the two approaches (LI and FI) are distinguished simply by the extent to which they use information in the data matrix of examinee (subject) responses. The focus of this study was to compare the accuracy and efficiency of IRT parameter estimates (i.e., item difficulty, item discrimination) using both LI and FI nonlinear FA models. A Monte Carlo study was employed to investigate the precision and stability of parameter estimates in situations where (a) the manifest variables (test items) are binary and there is a single underlying normally distributed latent variable and (b) the manifest variables are binary and there is a single underlying latent variable that is not normally distributed. In addition, parameter recovery was explored under various simulated test lengths (number of items) and sample sizes (number of examinees). The results of the study suggest that, for conditions involving a normally distributed latent variable, the limited-information approach incorporated in the NOHARM computer program generally provides more accurate and stable parameter estimates than the theoretically preferred FI estimator incorporated in the TESTFACT computer program. For situations involving a nonnormal distribution of the latent trait, or ability, FI estimation provided a marginally better calibration of the 2-parameter logistic response model. 
Both estimators were, however, prone to producing item values that were outside of feasible ranges, resulting in poor goodness-of-fit of the estimates. Furthermore, based on the conditions modelled in the study, neither the sample size, the test length, nor the sample size/test length ratio was important in explaining between-program differences in the recovery of the item parameters.
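The data-generation step of a Monte Carlo study like the one described above can be sketched as follows. This is a minimal illustration, not the procedure used in the thesis: the function names, the seed handling, and the choice of a standardized exponential distribution as the nonnormal (skewed) ability case are all assumptions made for the example.

```python
import math
import random


def p_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def draw_ability(rng, normal=True):
    """Draw one latent ability with mean 0 and variance 1.

    The skewed case (an assumption for this sketch) is an exponential(1)
    variate shifted to mean 0; its variance is already 1.
    """
    if normal:
        return rng.gauss(0.0, 1.0)
    return rng.expovariate(1.0) - 1.0


def simulate_responses(n_examinees, items, normal=True, seed=0):
    """Return an n_examinees x n_items matrix of 0/1 responses.

    `items` is a list of (a, b) pairs: discrimination and difficulty.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_examinees):
        theta = draw_ability(rng, normal)
        row = [1 if rng.random() < p_2pl(theta, a, b) else 0 for a, b in items]
        data.append(row)
    return data


# Hypothetical three-item test, one normal and one skewed ability condition.
example_items = [(0.8, -1.0), (1.0, 0.0), (1.2, 1.0)]
normal_data = simulate_responses(200, example_items, normal=True, seed=1)
skewed_data = simulate_responses(200, example_items, normal=False, seed=1)
```

Response matrices generated this way would then be passed to each estimation program, and the recovered item parameters compared with the generating values in `example_items`.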
|
103 |
Adult learners' perspectives on screening reading ability for patient teaching. Brez, Sharon. January 1995
The expectation of greater individual responsibility for health promotion practices and decision making in hospitals is dependent upon knowledgeable consumers. The heavy reliance on printed material for both gathering and disseminating information in hospitals has led to recommendations that literacy screening tests be considered to enhance the efficacy of patient teaching interventions for the significant number of adults with low literacy skills. A qualitative case study design was used to investigate the response of adults with low literacy skills to literacy screening. Data were collected through in-depth interviews, including an experience using the Rapid Estimate of Adult Literacy in Medicine (REALM) word recognition tool. Analysis was achieved using a constant comparison technique. A conceptual model of response to screening was developed and compared to the Health Belief Model and Knox's Proficiency Theory of adult learning. While all participants supported the principle of screening in the context of the hospital, response to the REALM experience was variable. Factors found to influence responses to screening included perceived risks of illiteracy exposure, perceived risks of non-disclosure during hospitalization, and the attribution of characteristics to the hospital leading to its designation as a "special" place. Specific responses to the REALM were found to be further influenced by a set of individual historic factors. The results have led to several recommendations for health care professionals considering utilization of literacy screening instruments.
|
104 |
A comparison of item parameter estimates and ICCs produced with TESTGRAF and BILOG under different test lengths and sample sizes. Patsula, Liane. January 1995
There are many procedures used to estimate IRT parameters; however, among the most popular techniques are those used in the LOGIST and BILOG computer programs. LOGIST requires large numbers of examinees and items (on the order of 1000 or more examinees and 40 or more items) for stable 3PL model parameter estimates. BILOG is a more recent estimation program and, in general, requires smaller numbers of examinees and items than LOGIST for stable 3PL model parameter estimates. It also has been found that, regardless of sample size and test length, BILOG estimates tend to be more accurate than, or at least as accurate as, LOGIST estimates. For this reason, BILOG is now used as the standard to which new estimation programs are compared. The purpose of this study was to examine the effects of varying sample size (N = 100, 250, 500, and 1000) and test length (20- and 40-item tests) on the accuracy and consistency of 3PL model item parameter estimates and ICCs obtained from TESTGRAF and BILOG. Overall, TESTGRAF seemed to perform better than or just as well as BILOG. Where large bias effect sizes existed, in all but one case, TESTGRAF was more accurate than BILOG. TESTGRAF was slightly less accurate than BILOG in estimating the $P(\theta)$ values at high ability levels. Where large efficiency effect sizes existed, in all but two cases, TESTGRAF was more consistent than BILOG. TESTGRAF was slightly less consistent than BILOG in estimating the $a$ parameter with a sample size of 1000 and in estimating the $c$ parameter at all sample sizes. (Abstract shortened by UMI.)
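For readers unfamiliar with the quantities compared above, the 3PL item characteristic curve (ICC) gives $P(\theta) = c + (1 - c) / (1 + e^{-a(\theta - b)})$. The sketch below evaluates that curve on an ability grid and computes a root-mean-square difference between two curves. It is an illustration only: the function names, the optional `scale` constant, the grid, and the RMSE summary are assumptions for the example and are not taken from TESTGRAF, BILOG, or the thesis.

```python
import math


def p_3pl(theta, a, b, c, scale=1.0):
    """3PL probability of a correct response.

    a: discrimination, b: difficulty, c: pseudo-guessing (lower asymptote).
    `scale` is an optional constant multiplying a; some treatments use 1.7.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-scale * a * (theta - b)))


def icc(a, b, c, thetas):
    """Evaluate the item characteristic curve over a list of ability values."""
    return [p_3pl(t, a, b, c) for t in thetas]


def rmse(curve_1, curve_2):
    """Root-mean-square difference between two curves on the same grid."""
    total = sum((p - q) ** 2 for p, q in zip(curve_1, curve_2))
    return math.sqrt(total / len(curve_1))


# Ability grid from -3 to 3 in steps of 0.5 (an arbitrary choice).
grid = [-3.0 + 0.5 * i for i in range(13)]

# Hypothetical generating parameters and a hypothetical estimate of them.
true_curve = icc(1.0, 0.0, 0.20, grid)
estimated_curve = icc(1.1, 0.1, 0.18, grid)
curve_error = rmse(true_curve, estimated_curve)
```

Comparing whole curves rather than individual parameters is useful because different combinations of `a`, `b`, and `c` can produce nearly the same $P(\theta)$ over the ability range that matters.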
|
105 |
A presentation and comparison of some new statistical techniques in the analysis of polytomous differential item functioning: A Monte Carlo investigation. Mâsse, Louise C. January 1994
There is a need to develop and investigate methods which can assess the Item Response Differences (IRD) found in all the options of an item. In this study, such an investigation was referred to as Polytomous Differential Item Functioning (PDIF). The purpose of this study was to present and investigate the performance of four new approaches in the assessment of PDIF. The four approaches are a MANOVA (MCO) and a MANCOVA (MCA) approach applied to categorical dependent variables, a Polytomous Logistic Regression (PLR) approach, and an ANOVA analysis based on the item responses quantified by Dual Scaling (DS). In this study the effectiveness of these approaches (MCA, MCO, PLR, and DS) as well as the Log-Linear (LOG) approach of Mellenbergh (1982) were assessed under various conditions of test length, sample size, item difficulty, and the amount and location of PDIF. A two-parameter polytomous logistic regression model was used to generate the data. In this study, only uniform PDIF was introduced in the alternatives of the item. The type of PDIF simulated (e.g. uniform) in this study did not allow for a direct comparison of the nonuniform test of hypothesis between the Logistic (LOG and PLR) approaches and the MAN(C)OVA (MCA and MCO) approaches because the Logistic approaches test for a difference in logits while the MAN(C)OVA approaches test for a difference in proportions. It was shown in this study that varying the probability of choosing the alternatives resulted in uniform logit differences which did not only translate into uniform differences in proportions but also translated into nonuniform differences in proportions. These differences affected the interpretation of the PDIF results because the test of nonuniform PDIF for the Logistic procedures corresponded to a valid test of the null hypothesis while the MAN(C)OVA results for nonuniform PDIF had to be adjusted in order to yield a test which approximated a true test of the null hypothesis. 
The results of this study lend some optimism to the employment of the MCA and PLR approaches. (Abstract shortened by UMI.)
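The point that a uniform difference in logits need not translate into a uniform difference in proportions can be seen with a small numeric sketch. It uses a simple binary logistic function as a stand-in, which is a deliberate simplification of the polytomous model in the thesis; the function names and the shift of 0.5 logits are assumptions for illustration.

```python
import math


def logistic(x):
    """Map a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def proportion_gap(base_logit, logit_shift):
    """Difference in choice probability produced by a constant logit shift."""
    return logistic(base_logit + logit_shift) - logistic(base_logit)


# The same 0.5-logit group difference applied at three baseline logits.
gaps = [(base, proportion_gap(base, 0.5)) for base in (-2.0, 0.0, 2.0)]
for base, gap in gaps:
    print(f"baseline logit {base:+.1f}: difference in proportions {gap:.3f}")
```

The gap in proportions is largest near a baseline logit of zero and shrinks toward the extremes, so a test built on proportions can see a nonuniform pattern where a test built on logits sees a uniform one.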
|
106 |
The relationship between teacher training in measurement and classroom assessment procedures in Kenya's secondary schools. Chirchir, Andrew K. January 1995
The purpose of this study was to determine teacher use of measurement principles and the factors influencing this use in the assessment of student achievement in the Rift Valley Province (Kenya). Given that most of the assessment in the classroom consists of instruments developed by teachers, a first step in exploring the utility of measurement principles is to investigate the use of these principles in specific assessment areas. This could lead to the determination and the improvement of the fit between measurement training and teacher classroom assessment practices. The study was designed to provide information on teacher use of measurement principles by considering whether teachers had received training in educational measurement principles, how important they perceived these principles to be, and how often they used the principles in the assessment of student achievement. The study was also designed to determine factors influencing the use of measurement principles in schools. The results show that teachers have been trained in the principles of educational measurement. However, there is some indication that measurement training did not effectively address the assessment concerns of many classroom teachers. Teachers do not feel adequately prepared in test construction, marking, and the reporting of student assessment results. The results on the importance of measurement principles provide some clear indication that teachers attach much importance to the principles for test construction, test administration, marking, and the reporting of student assessment results. Teacher interviews revealed that teachers are overwhelmed by the demands associated with Kenya's 8-4-4 system of education. On the basis of the study findings, suggestions were made for improving teacher training in measurement and for further research. (Abstract shortened by UMI.)
|
107 |
The performance of the Mantel-Haenszel and logistic regression dif detection procedures across sample size and effect size: A Monte Carlo study. Hadley, Patrick. January 1995
In recent years, public attention has become focused on the issue of test and item bias in standardized tests. Since the 1980s, the Mantel-Haenszel (Holland & Thayer, 1986) and Logistic Regression (Swaminathan & Rogers, 1990) procedures have been developed to detect item bias, or differential item functioning (DIF). In this study the effectiveness of the MH and LR procedures was compared under a variety of conditions, using simulated data. The ability of the MH and LR to detect DIF was tested at sample sizes of 100/100, 200/200, 400/400, 600/600, and 800/800. The simulated test had 66 items, the first 33 items with item discrimination ("a") set at 0.80, the second 33 items with "a" set at 1.20. The pseudo-guessing parameter ("c") was 0.15 for all items. The item difficulty ("b") parameter ranged from -2.00 to 2.00 in increments of 0.125 for the first 33 items, and again for the second 33 items. Both the MH and LRU detected DIF with a high degree of success whenever sample size was large (600 or more), especially when effect size, no matter how measured, was also large. The LRU outperformed the MH marginally under almost every condition of the study. However, the LRU also had a higher false-positive rate than the MH, a finding consistent with previous studies (Pang et al., 1994; Tian et al., 1994a, 1994b). Since the "a" and "b" parameters which underlie the computation of the three measures of effect size used in the study are not always determinable in data derived from real world test administrations, it may be that the $\Delta_{\rm MH}$ is the best available measure of effect size in real world test items. (Abstract shortened by UMI.)
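The simulated test design described above can be written out directly. The sketch below builds the 66 item parameter triples from the values stated in the abstract; the variable names and the tuple layout `(a, b, c)` are assumptions for illustration, and nothing here reproduces how DIF itself was introduced in the study.

```python
# Difficulty grid: -2.00 to 2.00 in increments of 0.125 gives 33 values.
b_values = [-2.0 + 0.125 * i for i in range(33)]

# Pseudo-guessing parameter shared by all items.
C_PARAM = 0.15

# First 33 items use a = 0.80, second 33 items use a = 1.20.
items = [(0.80, b, C_PARAM) for b in b_values] + [
    (1.20, b, C_PARAM) for b in b_values
]
```

Crossing each discrimination level with the same difficulty grid means every difficulty value appears once at low and once at high discrimination.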
|
108 |
The performance of the Mantel-Haenszel and logistic regression DIF identification procedures with real data. Tian, Fang. January 1994
Numerous statistical methods have been proposed for detecting differential item functioning (DIF). Among them, methods based on item response theory (IRT) are theoretically preferred but very complicated and expensive to implement. As an alternative, the Mantel-Haenszel (MH) procedure has emerged as one of the most popular procedures because of its ease of implementation, relatively small sample size requirement, and associated test of significance. In addition, it provides a measure of the amount and direction of DIF. However, the MH procedure is not designed for and therefore not very effective in detecting nonuniform DIF. As an extension of the MH procedure, a more general DIF detection method, a logistic regression procedure (LR) has been shown to be powerful in detecting both uniform and nonuniform DIF. The purpose of this study is to examine the consistency of the MH and LR procedures and their agreement in the identification of DIF across sample size and criterion when using real examinee data. (Abstract shortened by UMI.)
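The measure of the amount and direction of DIF that the MH procedure provides is usually reported as a common odds ratio across matched score levels, often re-expressed on the delta scale as $\Delta_{\rm MH} = -2.35 \ln(\alpha_{\rm MH})$. The sketch below computes both for one item. It is a minimal illustration, not the analysis carried out in the thesis: the function names, the table layout, and the example counts are assumptions, and the associated significance test is omitted.

```python
import math


def mantel_haenszel_alpha(strata):
    """Mantel-Haenszel common odds ratio for one item.

    `strata` holds one 2x2 table per matched total-score level, as
    (ref_correct, ref_wrong, focal_correct, focal_wrong) counts.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, foc_right, foc_wrong in strata:
        n = ref_right + ref_wrong + foc_right + foc_wrong
        if n == 0:
            continue  # an empty score level contributes nothing
        numerator += ref_right * foc_wrong / n
        denominator += ref_wrong * foc_right / n
    return numerator / denominator


def mh_delta(alpha):
    """Re-express the odds ratio on the delta scale; 0 means no DIF."""
    return -2.35 * math.log(alpha)


# Hypothetical counts at three score levels for a single item.
example_strata = [(30, 20, 20, 30), (40, 10, 30, 20), (45, 5, 40, 10)]
alpha = mantel_haenszel_alpha(example_strata)
delta = mh_delta(alpha)
```

An odds ratio above 1 (negative delta) indicates that, among examinees matched on total score, the reference group has higher odds of answering correctly than the focal group. Because the statistic pools a single odds ratio over score levels, differences that change sign across levels can cancel, which is why the procedure is less suited to nonuniform DIF.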
|
109 |
Étude des critères d'évaluation d'un stage d'enseignement [A study of the evaluation criteria for a teaching practicum]. Charland, Julie Marie Lise. January 1996
This thesis examines the criteria used to evaluate teaching practicums in French-speaking Ontario. The literature review covers research on certain aspects of this evaluation as well as analyses of practicum evaluation forms used by various universities. The review shows that there is no consensus on the criteria for a successful teaching practicum. Consequently, the present study poses the following research questions: (1) Which evaluation criteria does the panel of experts consider essential to a successful teaching practicum? (2) Which evaluation criteria does the panel of experts consider complementary to a successful teaching practicum? To answer these questions, a Delphi technique was used. The objective was to reach a consensus among experts from four groups: associate teachers, student teachers, university supervisors, and administrators. The first two rounds allowed the participants in each group to suggest criteria and to judge their importance within their own group, while in the third and final round the criteria suggested by all groups were presented to all participants. At the end of the exercise, there was consensus on 32 criteria judged essential and four judged complementary. These criteria fall under four themes: teaching approach, classroom management, professional responsibilities, and interpersonal relations. Fourteen of the criteria identified by the experts had not been noted in previous research or in the forms consulted.
|
110 |
Evaluation of a mental skills training program implemented by an elementary classroom teacher. Bonadie, Jenelle N. January 1995
The purpose of this study was to implement Orlick's (1993) mental skills/life skills training program and assess the extent to which children (1) learned to relax themselves at will, (2) successfully implemented stress control strategies, and (3) improved the frequency of their highlights (any simple pleasure, joy, or other positive experience that improves the quality of one's day). Two intact classes of grade 2 children took part in the study. One class served as the experimental group, while the other class served as the control group. The usual classroom teacher delivered the program 4 to 5 times per week, for 9 consecutive weeks. Each intervention session was 10 to 15 min in length. Significant positive effects were found in the experimental group with respect to the children's abilities to lower their heart rates at will and successfully implement relaxation and stress control strategies in their daily lives. Children in the experimental group also experienced a significant increase in the frequency of their highlights over the course of the intervention period. The results suggested that children in grade 2 can (1) learn to relax themselves at will as measured by heart rate, (2) successfully implement stress control strategies, and (3) improve the frequency of their highlight experiences when the usual classroom teacher delivers Orlick's (1993) mental skills training program.
|