71

Statistical Inference for Diagnostic Classification Models

Xu, Gongjun January 2013
Diagnostic classification models (DCMs) are an important recent development in educational and psychological testing. Instead of an overall test score, a diagnostic test provides each subject with a profile detailing the concepts and skills (often called "attributes") that he/she has mastered. Central to many DCMs is the so-called Q-matrix, an incidence matrix specifying the item-attribute relationship. It is common practice for the Q-matrix to be specified by experts when items are written, rather than through data-driven calibration. Such a non-empirical approach may lead to misspecification of the Q-matrix and a substantial lack of model fit, resulting in erroneous interpretation of testing results. This motivates our study, in which we consider the identifiability, estimation, and hypothesis testing of the Q-matrix. In addition, we study the identifiability of diagnostic model parameters under a known Q-matrix. The first part of this thesis concerns estimation of the Q-matrix. In particular, we present definitive answers on the learnability of the Q-matrix for one of the most commonly used models, the DINA model, by specifying a set of sufficient conditions under which the Q-matrix is identifiable up to an explicitly defined equivalence class. We also present the corresponding data-driven construction of the Q-matrix. The results and analysis strategies are general in the sense that they can be extended to other diagnostic models. The second part of the thesis focuses on statistical validation of the Q-matrix. The purpose of this study is to provide a statistical procedure to help decide whether to accept the Q-matrix provided by the experts. Statistically, this problem can be formulated as a pure significance testing problem with null hypothesis H0 : Q = Q0, where Q0 is the candidate Q-matrix. We propose a test statistic that measures the consistency of the observed data with the proposed Q-matrix, and we study its theoretical properties. In addition, we conduct simulation studies to demonstrate the performance of the proposed procedure. The third part of this thesis concerns the identifiability of the diagnostic model parameters when the Q-matrix is correctly specified. Identifiability is a prerequisite for statistical inference, such as parameter estimation and hypothesis testing. We present necessary and sufficient conditions under which the model parameters are identifiable from the response data.
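To make the role of the Q-matrix concrete, the following minimal sketch (illustrative, not from the thesis) shows how a hypothetical Q-matrix and attribute profile determine response probabilities under the DINA model, using the standard slip/guess parameterization:

```python
import numpy as np

# Hypothetical Q-matrix: 4 items by 3 attributes; q[j, k] = 1 means item j
# requires attribute k. In practice Q is written by content experts.
Q = np.array([[1, 0, 0],
              [0, 1, 0],
              [1, 1, 0],
              [0, 1, 1]])

alpha = np.array([1, 1, 0])  # one examinee's profile (attribute 3 not mastered)

s = np.array([0.10, 0.10, 0.20, 0.15])  # slip: P(incorrect | all required attributes mastered)
g = np.array([0.20, 0.20, 0.10, 0.25])  # guess: P(correct | some required attribute missing)

# DINA ideal response: eta[j] = 1 iff the examinee masters every attribute
# that item j requires (the conjunctive "and" gate).
eta = np.all(alpha >= Q, axis=1).astype(int)

# P(X_j = 1) = (1 - s_j)^eta_j * g_j^(1 - eta_j)
p_correct = np.where(eta == 1, 1 - s, g)
print(eta)        # [1 1 1 0]
print(p_correct)  # [0.9  0.9  0.8  0.25]
```

A misspecified row of Q changes eta, and hence the fitted probabilities, for every examinee, which is why Q-matrix validation matters.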
72

Dealing with Sparse Rater Scoring of Constructed Responses within a Framework of a Latent Class Signal Detection Model

Kim, Sunhee January 2013
In many assessment situations that use a constructed-response (CR) item, an examinee's response is evaluated by only one rater, which is called a single-rater design. For example, in classroom assessment practice, only one teacher grades each student's performance. While single-rater designs are the most cost-effective of all rater designs, the lack of a second rater causes difficulties with respect to how the scores should be used and evaluated; for example, one cannot assess rater reliability or rater effects when there is only one rater. The present study explores possible solutions for the issues that arise in sparse rater designs within the context of a latent class version of signal detection theory (LC-SDT) that has previously been used for rater scoring. This approach provides a model for rater cognition in CR scoring (DeCarlo, 2005; 2008; 2010) and offers measures of rater reliability and various rater effects. The following potential solutions to rater sparseness were examined: 1) the use of parameter restrictions to yield an identified model, 2) the use of informative priors in a Bayesian approach, and 3) the use of back readings (i.e., partially available second-rater observations), which are available in some large-scale assessments. Simulations and analyses of real-world data were conducted to examine the performance of these approaches. Simulation results showed that using parameter constraints allows one to detect various rater effects that are of concern in practice. The Bayesian approach also gave useful results, although estimation of some of the parameters was poor and the standard deviations of the parameter posteriors were large, except when the sample size was large. Using back-reading scores gave an identified model, and simulations showed that the results were generally acceptable in terms of parameter estimation, except for small sample sizes. The study also examines the utility of the approaches as applied to the PIRLS USA reliability data. The results show some similarities and differences between parameter estimates obtained with posterior mode estimation and with Bayesian estimation. Sensitivity analyses revealed that rater parameter estimates are sensitive to the specification of the priors, as was also found in the simulation results with smaller sample sizes.
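As a rough illustration of the signal detection view of rater scoring that underlies the LC-SDT approach, the sketch below simulates one rater, assuming a normal perceptual distribution and illustrative discrimination and criterion values (all numbers are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

eta = np.array([0, 1, 2, 3, 2])  # latent (true) score classes for five essays, 0-3 scale

d = 2.0                               # rater discrimination: spacing of perceptual distributions
criteria = np.array([1.0, 3.0, 5.0])  # the rater's cutpoints on the latent perception scale

# The rater's perception of each essay is noisy: here a unit-variance normal
# centered at d * eta (a logistic form is another common choice).
perception = d * eta + rng.standard_normal(eta.size)

# Observed score = number of criteria the perception exceeds (0 to 3).
observed = (perception[:, None] > criteria[None, :]).sum(axis=1)
print(observed)
```

In this framing, d captures rater reliability and the criteria capture effects such as leniency or severity; with a single rater, some of these parameters are not identified without the restrictions, priors, or back readings studied here.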
73

An Item Response Theory Approach to Causal Inference in the Presence of a Pre-intervention Assessment

Marini, Jessica January 2013
This research develops a form of causal inference based on Item Response Theory (IRT) to combat the bias that occurs when existing causal inference methods are used under certain scenarios. When a pre-test is administered prior to a treatment decision, causal inferences about the decision's effect on the outcome can be biased. The new IRT-based method uses item-level information, treatment placement, and the outcome to produce estimates of each subject's ability in the chosen domain. Framing the causal inference question within an IRT model provides a model-based way to match subjects on estimates of their true ability, which allows inferences to be made about a subject's performance as if he or she had been in the opposite treatment group. The IRT method is designed to address shortcomings of existing methods, such as their reliance on conditional independence between pre-test scores and outcomes. Using simulation, the IRT method is compared to existing methods under two different model scenarios in terms of Type I and Type II errors. The method's parameter recovery is then analyzed, followed by the accuracy of its treatment effect evaluation. The IRT method is shown to outperform existing methods in an ability-based scenario. Finally, the IRT method is applied to real data assessing the impact of advanced STEM coursework in high school on a student's choice of major, and compared to existing alternative approaches.
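The ability estimates used for matching can be illustrated with a standard two-parameter logistic (2PL) IRT model; the sketch below (with hypothetical item parameters, not the dissertation's actual model) estimates one examinee's ability by maximum likelihood:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical 2PL item parameters: a = discrimination, b = difficulty.
a = np.array([1.2, 0.8, 1.5, 1.0])
b = np.array([-0.5, 0.0, 0.5, 1.0])

def neg_log_lik(theta, x):
    """Negative 2PL log-likelihood for one examinee's 0/1 response vector x."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

x = np.array([1, 1, 0, 0])  # hypothetical pre-test response pattern
fit = minimize_scalar(neg_log_lik, args=(x,), bounds=(-4, 4), method="bounded")
theta_hat = fit.x  # subjects with similar theta_hat across treatment groups
                   # can then be matched for the causal comparison
print(round(theta_hat, 3))
```

Matching on theta_hat rather than on raw pre-test total scores is what lets the method exploit item-level information.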
74

Examining the Impact of Examinee-Selected Constructed Response Items in the Context of a Hierarchical Rater Signal Detection Model

Patterson, Brian Francis January 2013
Research into the rarely used examinee-selected item assessment design has revealed certain challenges. This study aims to re-examine the key issues around examinee-selected items more comprehensively under a modern model for constructed-response scoring. Specifically, data were simulated under the hierarchical rater model with signal detection theory rater components (HRM-SDT; DeCarlo, Kim, & Johnson, 2011) and a variety of examinee-item selection mechanisms were considered. These conditions varied from the hypothetical baseline condition--where examinees choose randomly and with equal frequency from a pair of item prompts--to the perhaps more realistic and certainly more troublesome condition where examinees select items based on the very subject-area proficiency that the instrument intends to measure. While good examinee, item, and rater parameter recovery was apparent in the former condition for the HRM-SDT, serious issues with item and rater parameter estimation were apparent in the latter. Additional conditions were considered, as well as competing psychometric models for the estimation of examinee proficiency. Finally, practical implications of using examinee-selected item designs are given, as well as future directions for research.
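The contrast between the two selection mechanisms can be illustrated with a small simulation; in the sketch below (all parameter values hypothetical), item choice is either random or depends on proficiency through a logistic function, and the proficiency-dependent condition produces prompt groups that differ systematically in ability:

```python
import numpy as np

rng = np.random.default_rng(2)
theta = rng.standard_normal(10_000)  # examinee proficiencies

# Baseline condition: prompt choice is random and independent of proficiency.
choice_random = rng.integers(0, 2, theta.size)

# Troublesome condition: the probability of choosing prompt 1 increases with
# the proficiency the instrument is meant to measure (slope 1.5 is illustrative).
p_choose_1 = 1.0 / (1.0 + np.exp(-1.5 * theta))
choice_dependent = rng.binomial(1, p_choose_1)

# Mean proficiency by chosen prompt: nearly equal in the baseline condition,
# systematically separated in the proficiency-dependent condition.
print(theta[choice_random == 1].mean(), theta[choice_random == 0].mean())
print(theta[choice_dependent == 1].mean(), theta[choice_dependent == 0].mean())
```

The separation in the second condition is the nonignorable selection that undermines item and rater parameter estimation.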
75

Analyzing Hierarchical Data with the DINA-HC Approach

Zhang, Jianzhou January 2015
Cognitive Diagnostic Models (CDMs) are a class of models developed to diagnose the cognitive attributes of examinees. They have received increasing attention in recent years because of the need for more specific attribute- and item-related information. A particular cognitive diagnostic model, namely the hierarchical deterministic, input, noisy ‘and’ gate model with convergent attribute hierarchy (DINA-HC), is proposed to handle situations in which the attributes have a convergent hierarchy. Su (2013) first introduced the model as the deterministic, input, noisy ‘and’ gate with hierarchy (DINA-H) and retrofitted data from the Trends in International Mathematics and Science Study (TIMSS) using this model with linear and unstructured hierarchies. Leighton, Gierl, and Hunka (1999) and Kuhn (2001) introduced four forms of hierarchical structures (linear, convergent, divergent, and unstructured) based on the interrelated competencies of the cognitive skills. Specifically, the convergent hierarchy is one of these four hierarchies (Leighton, Gierl & Hunka, 2004) and is used to describe attributes that have a convergent structure. One of the features of this model is that it can incorporate the hierarchical structure of the cognitive skills in the model estimation process (Su, 2013). The advantage of the DINA-HC over the deterministic, input, noisy ‘and’ gate (DINA) model (Junker & Sijtsma, 2001) is that it reduces the number of parameters as well as the number of latent classes by imposing the particular attribute hierarchy. The model follows the specification of the DINA except that it pre-specifies the attribute profiles by utilizing the convergent attribute hierarchy, so that only certain attribute patterns are allowed under the particular convergent hierarchy. Properties of the DINA-HC and DINA are examined and compared through a simulation study and an empirical study. Specifically, attribute profile classification accuracy and model and item fit are compared between the DINA-HC and the DINA under different conditions when the attributes have convergent hierarchies. This study indicates that the DINA-HC provides better model fit, less biased parameter estimates, and higher attribute profile classification accuracy than the DINA when the attributes have a convergent hierarchy. The sample size, the number of attributes, and the test length were shown to have an effect on the parameter estimates. The DINA model has better model fit than the DINA-HC when the attributes are not dependent on each other.
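As an illustration of how a convergent hierarchy reduces the number of latent classes, the sketch below (a hypothetical four-attribute hierarchy, not the one studied in the thesis) enumerates the permissible attribute profiles:

```python
from itertools import product

# Hypothetical convergent hierarchy on four attributes: A1 is prerequisite to
# A2 and A3, which both converge as prerequisites of A4.
prereqs = {0: [], 1: [0], 2: [0], 3: [1, 2]}

def respects_hierarchy(alpha):
    """True if every mastered attribute has all of its prerequisites mastered."""
    return all(all(alpha[p] for p in prereqs[k])
               for k in range(len(alpha)) if alpha[k])

all_profiles = list(product([0, 1], repeat=4))                  # 2^4 = 16 classes
permissible = [a for a in all_profiles if respects_hierarchy(a)]
print(len(permissible))  # 6 -- the hierarchy removes 10 of the 16 latent classes
print(permissible)
```

Fitting the model over only the permissible profiles is what yields the reduction in parameters and latent classes relative to the unrestricted DINA.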
76

Posterior Predictive Model Checks in Cognitive Diagnostic Models

Park, Jung Yeon January 2015
Cognitive diagnostic models (CDMs; DiBello, Roussos, & Stout, 2007) have received increasing attention in educational measurement for the purpose of diagnosing the strengths and weaknesses of examinees’ latent attributes. Yet, despite the current popularity of a number of diagnostic models, research seeking to assess model-data fit has been limited. The current study applied a Bayesian model-checking method, the posterior predictive model check (PPMC; Rubin, 1984), to the investigation of model misfit. We employed the technique to assess model-data misfit for various diagnostic models, using real data and two simulation studies. An important issue in applying PPMC is the choice of discrepancy measure. This study examines the adequacy of three discrepancy measures for assessing different aspects of model misfit: the observed total-score distribution, the association of item pairs, and the correlation between attribute pairs.
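The PPMC logic can be summarized in a few lines; the sketch below is a generic skeleton, with `posterior_draws` and `simulate_data` as placeholders for a real fitted diagnostic model, and the total-score variance standing in as one example discrepancy measure:

```python
import numpy as np

def discrepancy(data):
    """One example measure: the variance of the observed total-score distribution."""
    return data.sum(axis=1).var()

def ppmc_p_value(observed, posterior_draws, simulate_data):
    """Generic PPMC: compare the realized discrepancy with its posterior
    predictive distribution. posterior_draws is an iterable of parameter
    draws from the fitted model; simulate_data(params) generates one
    replicated response matrix. Both are placeholders for a fitted CDM."""
    realized = discrepancy(observed)
    replicated = np.array([discrepancy(simulate_data(p)) for p in posterior_draws])
    # Posterior predictive p-values near 0 or 1 signal misfit in the aspect
    # of the data that the discrepancy measure targets.
    return (replicated >= realized).mean()
```

Each of the three measures studied would slot into `discrepancy`, targeting a different aspect of misfit.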
77

A longitudinal study to determine the stanine stability of a group's test-score performance in the elementary school

Corcoran, John E January 1958
Thesis (Ed.D.)--Boston University
78

Impact of Violations of Measurement Invariance in Longitudinal Mediation Modeling

Unknown Date
Research has shown that cross-sectional mediation analysis cannot accurately reflect a true longitudinal mediated effect. To investigate longitudinal mediated effects, different longitudinal mediation models have been proposed, each focusing on different research questions related to longitudinal mediation. When fitting mediation models to longitudinal data, the assumption of longitudinal measurement invariance is usually made. However, the consequences of violating this assumption have not been thoroughly studied in mediation analysis, and no studies have examined issues of measurement non-invariance in a latent cross-lagged panel mediation (LCPM) model with three or more measurement occasions. The goal of the current study is to investigate the impact of violations of measurement invariance on longitudinal mediation analysis. The focal model in the study is the LCPM model suggested by Cole and Maxwell (2003). This model can be used to examine mediated effects among the latent predictor, mediator, and outcome variables across time; in addition, it can account for measurement error and allows for the evaluation of longitudinal measurement invariance. Simulation methods were used, and the investigation was performed using population covariance matrices and sample data generated under various conditions. Eight design factors were considered for data generation: sample size, proportion of non-invariant items, position of latent factors with non-invariant items, type of non-invariant parameters, magnitude of non-invariance, pattern of non-invariance, size of the direct effect, and size of the mediated effect. Results from the population investigation were evaluated based on overall model fit and the calculated direct and mediated effects; results from the finite sample analysis were evaluated in terms of convergence and inadmissible solutions, overall model fit, bias/relative bias, coverage rates, and statistical power/Type I error rates. In general, results obtained from the finite sample analysis were consistent with those from the population investigation, with respect to both model fit and parameter estimation. The Type I error rate of the mediated effects was inflated under the non-invariant conditions with a small sample size (200); power for the direct and mediated effects was excellent (1.0 or close to 1.0) across all investigated conditions. Type I error rates based on the chi-square test statistic were seriously inflated under the invariant conditions, especially when the sample size was relatively small. Power for detecting model misspecifications due to longitudinal non-invariance was excellent across all investigated conditions. Fit indices (CFI, TLI, RMSEA, and SRMR) were not sensitive in detecting misspecifications caused by violations of measurement invariance in the investigated LCPM model. Study results also showed that as the magnitude of non-invariance, the proportion of non-invariant items, and the number of positions of latent variables with non-invariant items increased, estimation of the direct and mediated effects tended to be less accurate. The decreasing pattern of change in item parameters over measurement occasions resulted in the least accurate estimates of the direct and mediated effects. Parameter estimates were fairly accurate under the conditions of the decreasing-then-increasing pattern and the mixed pattern of change in item parameters.
Findings from this study can help empirical researchers better understand the potential impact of violating measurement invariance on longitudinal mediation analysis using the LCPM model. / A Dissertation submitted to the Department of Educational Psychology and Learning Systems in partial fulfillment of the requirements for the degree of Doctor of Philosophy. / Spring Semester 2019. / March 6, 2019. / invariance, longitudinal, measurement, modeling, statistics / Includes bibliographical references. / Yanyun Yang, Professor Co-Directing Dissertation; Qian Zhang, Professor Co-Directing Dissertation; Fred W. Huffer, University Representative; Betsy J. Becker, Committee Member.
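In the Cole and Maxwell (2003) framing, a longitudinal mediated effect is a product of cross-lagged path coefficients; a minimal sketch with hypothetical estimates (illustrative values only, not results from this dissertation) computes the indirect effect and its delta-method (Sobel) standard error:

```python
# Hypothetical cross-lagged path estimates:
a_hat, se_a = 0.40, 0.08   # X at time t    -> M at time t+1
b_hat, se_b = 0.30, 0.07   # M at time t+1  -> Y at time t+2

indirect = a_hat * b_hat   # longitudinal mediated effect as a product of paths

# Sobel (first-order delta-method) standard error of the product:
se_indirect = (b_hat**2 * se_a**2 + a_hat**2 * se_b**2) ** 0.5
z = indirect / se_indirect
print(indirect, round(se_indirect, 4), round(z, 2))  # 0.12 0.0369 3.25
```

Non-invariant item parameters bias the estimates of a and b, which is how violations of measurement invariance propagate into the mediated effect.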
79

The Classification of Students with Respect to Achievement, with Implications for State-Wide Assessment

Unknown Date
Source: Dissertation Abstracts International, Volume: 38-05, Section: A, page: 2727. / Thesis (Ph.D.)--The Florida State University, 1977.
80

The effects of alternate testing strategies on student achievement

Unknown Date
The present study compared the effects of three classroom testing strategies on student achievement. The strategies varied with respect to both the detail in feedback provided students after each unit test and in the availability of a retest. Within one strategy, students were informed only of their total test score and had no opportunity to take a retest. Within a second strategy, students were provided scores on each skill assessed by the test and allowed to take a retest one week later. Within the third strategy, students were provided detailed feedback concerning the nature of problems they had experienced with each skill, in addition to the scores on each skill and the option of taking a retest. / The study was conducted in the context of an introductory graduate statistics course. Students were randomly assigned to one of the three testing strategies for the duration of the term. In contrast to previous research within mastery learning, the curriculum and delivery of instruction were held constant across treatment conditions. / The achievement of students in the respective groups was contrasted on two summative exams. The first exam measured the exact skills assessed by the unit test and retests. The second exam measured a more generic set of skills and was designed to test students' ability to generalize their knowledge. No significant differences in achievement on either exam were observed between treatment conditions. The findings of this study suggest that when instructional time and objectives are held constant, simply providing students with detailed feedback regarding their performance on the test and the opportunity to take a retest does not represent sufficient action to improve student achievement. / Source: Dissertation Abstracts International, Volume: 52-11, Section: A, page: 3898. / Major Professor: Albert C. Oosterhof. / Thesis (Ph.D.)--The Florida State University, 1991.
