281 |
Microarrays experiments and Item Response Theory
Neves, Carlos Eduardo 25 February 2010 (has links)
Recently developed, the biotechnology known as microarrays permits the simultaneous monitoring of the gene expression values of hundreds of thousands of genes, a fact that brings a new interpretation to results obtained in research across many areas of knowledge, including, for example, Pharmacology and Medicine, since the results are interpreted at the molecular level. However, despite the technology behind the microarray technique, its application still raises complications arising, for example, from the countless sources of variation, from the scale of the responses, or from the natural difficulty of analyzing a large number of genetic fragments measured on few experimental units. Facing such complications, many statistical methodologies have been proposed to reduce or eliminate the problems inherent to the microarray technique and to foster the extraction of more reliable results from gene expression values, yet many challenges persist.
Under this perspective, the present work explored two statistical methodologies that are alternatives with respect to their concepts, although both were contextualized to the microarray problem and applied with the same objective: enabling the identification of genes differentially expressed under distinct experimental conditions. The first methodology consisted of the application of fixed-effects Analysis of Variance models with modified test statistics, correction methodologies for multiple testing, and the construction of volcano plots. The second methodology consisted of the contextualization and application of Item Response Theory (IRT) to microarray experiments, an approach little explored in the analysis of this kind of data but one that enables the selection of differentially expressed genes from a latent measure estimated for each gene and the construction of a scale for the gene expression response categories. The motivation for the present work came from a microarray experiment with congenic rats made available by the Cardiology and Molecular Genetics Laboratory of the Heart Institute (InCor-USP), whose objective is to identify genes associated with hypertension.
|
282 |
An Examination of the Psychometric Properties of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors: An Item Response Theory Approach
Moulton, Sara E. 01 December 2016 (has links)
This research study examined the psychometric properties of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors (SRSS-IE) using Item Response Theory (IRT) methods among a sample of 2,122 middle school students. The SRSS-IE is a recently revised screening instrument aimed at identifying students who are potentially at risk for emotional and behavioral disorders (EBD). There are two studies included in this research. Study 1 utilized the Nominal Response and Generalized Partial Credit models of IRT to evaluate items from the SRSS-IE in terms of the degree to which the response options for each item functioned as intended by the scale developers and how well those response options discriminated among students who exhibited varying levels of EBD risk. Results from this first study indicated that the four response option configurations of the items on the SRSS-IE may not adequately discriminate among the frequency of externalizing and internalizing behaviors demonstrated by middle school students. Recommendations for item response option revisions or scale scoring revisions are discussed in this study. In study 2, differential item functioning (DIF) and differential step functioning (DSF) methods were used to examine differences in item and response option functioning according to student gender variables. Additionally, test information functions (TIFs) were used to determine whether preliminary recommendations for cut scores differ by gender. Results of this second study indicate that two of the items on the SRSS-IE systematically favor males over females and one item systematically favors females over males. Additionally, examination of TIFs demonstrated different degrees of measurement precision at various levels of theta for males and females on both the externalizing and internalizing constructs. Implications of these results are discussed in relation to possible revisions of the SRSS-IE items, cut scores, or scale scoring procedures.
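Study 1's evaluation of response-option functioning rests on polytomous IRT models such as the Generalized Partial Credit Model. As a point of reference, here is a minimal Python sketch of GPCM category probabilities; the discrimination and step parameters are hypothetical, not estimates from the SRSS-IE.

```python
import numpy as np

def gpcm_probs(theta, a, b_steps):
    """Generalized Partial Credit Model: probabilities of categories 0..m
    for an examinee at trait level theta, given discrimination a and
    step parameters b_1..b_m."""
    steps = a * (theta - np.asarray(b_steps, dtype=float))
    # Category 0 has an empty cumulative sum, hence the leading 0.0.
    numerators = np.exp(np.concatenate(([0.0], np.cumsum(steps))))
    return numerators / numerators.sum()

# A four-option item (hypothetical parameters): well-separated steps mean
# each response option is the most likely one somewhere along theta.
print(gpcm_probs(theta=0.5, a=1.2, b_steps=[-1.0, 0.0, 1.5]))
```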
|
283 |
Assessing the Model Fit of Multidimensional Item Response Theory Models with Polytomous Responses Using Limited-Information Statistics
Li, Caihong Rosina 01 January 2019 (has links)
Under item response theory, three types of limited-information goodness-of-fit test statistics – M2, Mord, and C2 – have been proposed to assess model-data fit when data are sparse. However, evaluation of the performance of these GOF statistics under multidimensional item response theory (MIRT) models with polytomous data has been limited. The current study showed that M2 and C2 were well-calibrated under true model conditions and were powerful under misspecified model conditions. Mord was not well-calibrated when the number of response categories was more than three. RMSEA2 and RMSEAC2 proved to be good tools for evaluating approximate fit.
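For readers unfamiliar with how RMSEA2 and RMSEAC2 are obtained, they follow from the fit statistic, its degrees of freedom, and the sample size. A minimal sketch, with illustrative numbers rather than values from the study:

```python
import math

def rmsea_from_limited_info_stat(stat, df, n):
    """RMSEA derived from a limited-information statistic such as M2 or C2."""
    return math.sqrt(max(stat - df, 0.0) / (df * (n - 1)))

# Hypothetical values: statistic 350.0 on 300 degrees of freedom, N = 1000.
print(rmsea_from_limited_info_stat(350.0, 300, 1000))  # about 0.013
```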
The second study aimed to evaluate the psychometric properties of the Religious Commitment Inventory-10 (RCI-10; Worthington et al., 2003) within the IRT framework and to estimate C2 and its RMSEA to assess global model fit. Results showed that the RCI-10 was best represented by a bifactor model and could be scored as unidimensional notwithstanding the presence of multidimensionality; the two-factor correlational solution should not be used. Study two also showed that religious commitment was a risk factor for intimate partner violence, whereas spirituality was a protective factor against it, and that greater alcohol use was related to more abusive behaviors. Implications of the two studies are discussed.
|
284 |
Evaluation of Post-Deployment PTSD Screening of Marines Returning From a Combat Deployment
Hall, Erika L. 01 January 2015 (has links)
The purpose of this quantitative study was to examine whether the post-deployment screening instrument currently used to assess active-duty Marines for symptoms of PTSD upon return from a combat deployment can be relied upon by itself to assess PTSD accurately. Additionally, this study compared the number of Marines who sought trauma-related mental health treatment based on their answers on the Post-Deployment Health Assessment (PDHA) with the number who sought such treatment based on their answers on the PTSD Checklist - Military Version (PCL-M). The participants were a sample of active-duty Marines who had recently returned from a combat deployment. A quantitative secondary data analysis used Item Response Theory (IRT) to examine the answers provided by the participants on both the PDHA and the PCL-M. Both instruments proved effective in assessing symptoms of PTSD, and participants identified as having symptoms were referred for mental health services as required. According to the results, more Marines were identified as having symptoms of PTSD using both instruments together (PDHA and PCL-M) than using the PDHA alone. The result was a better understanding of predictors of which Marines may later develop PTSD. These results can also assist the Marine Corps in its post-deployment screening for symptoms of PTSD, which in turn can lead to appropriate mental health referrals for Marines.
|
285 |
Sequential Methods in Computerized Criterion-referenced Test
李佳紋, Lee, Chia-Wen Unknown Date (has links)
In a traditional paper-and-pencil (p-and-p) test, all examinees answer the same items and the number of items is fixed; an examinee passes or fails according to whether his or her test score exceeds a predetermined cutoff, say, 60 out of 100. With the rapid advancement of modern computer technology, however, the test form has been converted from paper and pencil to the computer terminal. A computerized criterion-referenced test classifies examinees into two or more categories according to their answers to the items. It differs from the conventional standardized test in that the selection of test items is tailored to each examinee's ability level. Typically, examinees with high or low ability will have a shorter average test length (ATL) than examinees with ability close to the thresholds.
In this thesis, we assume that the probability of a correct response to an item follows a two-parameter logistic (2-PL) model. Our goal is to study the performance, in terms of ATL and misclassification rate (MR), of the beta-protection method combined with adaptive sequential item selection. For the simulation procedures, we also introduce the selection rule for item parameters, the methods used to estimate an examinee's ability, and the stopping rule with beta-protection. Simulation results show that using an adaptive test with the beta-protection method can control the MR within a specified level, and that the number of test items required depends on the examinee's ability.
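As a concrete reference for the model assumed above, the following minimal sketch computes 2-PL response probabilities; the parameter values are illustrative, and the beta-protection stopping rule itself is not reproduced here.

```python
import math

def p_correct_2pl(theta, a, b):
    """2-PL model: probability that an examinee with ability theta answers
    an item with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Examinees far from a threshold at 0 respond almost deterministically,
# so a sequential test can classify them after fewer items (shorter ATL).
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(p_correct_2pl(theta, a=1.5, b=0.0), 3))
```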
|
286 |
Observed score equating with covariates
Bränberg, Kenny January 2010 (has links)
In test score equating the focus is on the problem of finding the relationship between the scales of different test forms. This can be done only if data are collected in such a way that the effect of differences in ability between groups taking different test forms can be separated from the effect of differences in test form difficulty. In standard equating procedures this problem has been solved by using common examinees or common items. With common examinees, as in the equivalent groups design, the single group design, and the counterbalanced design, the examinees taking the test forms are either exactly the same, i.e., each examinee takes both test forms, or random samples from the same population. Common items (anchor items) are usually used when the samples taking the different test forms are assumed to come from different populations.

The thesis consists of four papers, and the main theme in three of them is the use of covariates, i.e., background variables correlated with the test scores, in observed score equating. We show how covariates can be used to adjust for systematic differences between samples in a non-equivalent groups design when there are no anchor items. We also show how covariates can be used to decrease the equating error in an equivalent groups design or in a non-equivalent groups design.

The first paper, Paper I, is the only paper where the focus is on something other than the incorporation of covariates in equating. It is an introduction to test score equating and presents the author's thoughts on its foundation. There are a number of different definitions of test score equating in the literature; some of these are presented, and the similarities and differences between them are discussed. An attempt is also made to clarify the connection between the definitions and the most commonly used equating functions.

In Paper II a model is proposed for observed score linear equating with background variables. The idea is to adjust for systematic differences in ability between groups in a non-equivalent groups design by using information from background variables correlated with the observed test scores. It is assumed that, conditional on the background variables, the two samples can be seen as random samples from the same population; the background variables are used to explain the systematic differences in ability between the populations. The proposed model consists of a linear regression model connecting the observed scores with the background variables and a linear equating function connecting observed scores on one test form to observed scores on the other. Maximum likelihood estimators of the model parameters are derived under an assumption of normally distributed test scores, and data from two administrations of the Swedish Scholastic Assessment Test are used to illustrate the use of the model.

In Paper III we use the model presented in Paper II with two different data collection designs: the non-equivalent groups design (with and without anchor items) and the equivalent groups design. Simulated data are used to examine the effect on the estimators, in terms of bias, variance, and mean squared error, of including covariates. With the equivalent groups design the results show that using covariates can increase the accuracy of the equating. With the non-equivalent groups design the results show that using an anchor test together with covariates is the most efficient way of reducing the mean squared error of the estimators. Furthermore, with no anchor test, the background variables can be used to adjust for the systematic differences between the populations and produce unbiased estimators of the equating relationship, provided that the “right” variables are used, i.e., the variables explaining those differences.

In Paper IV we explore the idea of using covariates as a substitute for an anchor test with a non-equivalent groups design in the framework of Kernel Equating. Kernel Equating can be seen as a method comprising five steps: presmoothing, estimation of score probabilities, continuization, equating, and calculating the standard error of equating. For each of these steps we give the theoretical results when observations on covariates are used as a substitute for scores on an anchor test. It is shown that the method developed for Post-Stratification Equating in the non-equivalent groups with anchor test design can be used with observations on the covariates instead of scores on an anchor test. The method is illustrated using data from the Swedish Scholastic Assessment Test.
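To make the central idea of Papers II and III concrete, here is a deliberately simplified sketch of linear observed-score equating in which only the form means are covariate-adjusted via a per-form regression on one background variable. All data and names are hypothetical, and the thesis's maximum likelihood estimators under normality are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)

# Non-equivalent groups, no anchor: the group taking form Y is abler on
# average, and a covariate z (e.g., prior grades) is correlated with scores.
z_x = rng.normal(0.0, 1.0, 500)
z_y = rng.normal(0.4, 1.0, 500)
score_x = 50 + 8 * z_x + rng.normal(0, 5, 500)
score_y = 48 + 8 * z_y + rng.normal(0, 5, 500)

# Regress each form's scores on the covariate, then evaluate both fits at a
# common covariate value to strip out the group ability difference.
fit_x = np.polyfit(z_x, score_x, 1)
fit_y = np.polyfit(z_y, score_y, 1)
z_common = np.concatenate([z_x, z_y]).mean()
mu_x_adj = np.polyval(fit_x, z_common)
mu_y_adj = np.polyval(fit_y, z_common)
sigma_x, sigma_y = score_x.std(ddof=1), score_y.std(ddof=1)

def linear_equate(x):
    """Map a form-X score onto the form-Y scale via covariate-adjusted means."""
    return mu_y_adj + (sigma_y / sigma_x) * (x - mu_x_adj)

print(round(linear_equate(55.0), 2))
```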
|
287 |
Detection and Classification of DIF Types Using Parametric and Nonparametric Methods: A Comparison of the IRT-Likelihood Ratio Test, Crossing-SIBTEST, and Logistic Regression Procedures
Lopez, Gabriel E. 01 January 2012 (has links)
The purpose of this investigation was to compare the efficacy of three methods for detecting differential item functioning (DIF). The performance of the crossing simultaneous item bias test (CSIBTEST), the item response theory likelihood ratio test (IRT-LR), and logistic regression (LOGREG) was examined across a range of experimental conditions, including different test lengths, sample sizes, DIF and differential test functioning (DTF) magnitudes, and mean differences in the underlying trait distributions of the comparison groups, herein referred to as the reference and focal groups. In addition, each procedure was implemented using both an all-other anchor approach, in which the IRT-LR baseline model, CSIBTEST matching subtest, and LOGREG trait estimate were based on all test items except the one under study, and a constant anchor approach, in which the baseline model, matching subtest, and trait estimate were based on a predefined subset of DIF-free items. Response data for the reference and focal groups were generated using known item parameters based on the three-parameter logistic item response theory model (3-PLM). Various types of DIF were simulated by shifting the generating item parameters of select items to achieve desired DIF and DTF magnitudes based on the area between the groups' item response functions. Power, Type I error, and Type III error rates were computed for each experimental condition based on 100 replications, and effects were analyzed via ANOVA. Results indicated that the procedures varied in efficacy, with LOGREG implemented using an all-other approach providing the best balance of power and Type I error rate. However, none of the procedures were effective at identifying the type of DIF that was simulated.
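In outline, the LOGREG procedure is a nested-model comparison in the Swaminathan-Rogers style: an ability-only logistic model is tested against one that adds group and ability-by-group terms. A minimal sketch on simulated data follows; it uses an observed ability proxy where the study used a trait estimate, and all values are illustrative.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 1000
g = rng.integers(0, 2, n)          # 0 = reference group, 1 = focal group
x = rng.normal(0.0, 1.0, n)        # ability proxy (e.g., rest-score)
# Simulate uniform DIF: the studied item is harder for the focal group.
p = 1.0 / (1.0 + np.exp(-(1.2 * x - 0.5 * g)))
u = rng.binomial(1, p)             # item responses

def loglik(design):
    """Log-likelihood of a logistic model for u on the given predictors."""
    return sm.Logit(u, sm.add_constant(design)).fit(disp=0).llf

ll_base = loglik(np.column_stack([x]))             # ability only
ll_full = loglik(np.column_stack([x, g, x * g]))   # + uniform and nonuniform DIF terms
lr = 2.0 * (ll_full - ll_base)
print(f"LR = {lr:.2f}; compare to a chi-square with 2 df")
```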
|
288 |
A Mixture Items-and-Examinees Model Analysis on Differential Item Functioning
黃馨瑩, Huang, Hsin Ying Unknown Date (has links)
Drawing upon the framework of the multilevel mixture item response theory model and the random item mixture model, this study proposes a model called the mixture items-and-examinees (MIE) model. The purpose of the study was to assess the model's performance under different sample sizes and different numbers of differential item functioning (DIF) items; in particular, the study assessed the model's detection of DIF items and the accuracy of its parameter recovery. The results revealed that with large sample sizes and more DIF items, the MIE model showed good parameter recovery, accurate detection of DIF items, good estimates of item difficulty, and accurate classification of the sub-samples. These performances were similar to those of the random item mixture model. The findings suggest that future studies apply the MIE model to analyses of large-scale education databases and extend it with additional variables.
|
289 |
The Differential Item Functioning (DIF) Analysis of Mathematics Items in the International Assessment Programs
Yildirim, Huseyin Husnu 01 April 2006 (has links)
Cross-cultural studies, like TIMSS and PISA 2003, have been conducted since the 1960s with the idea that these assessments can provide a broad perspective for evaluating and improving education. In addition, countries can assess their relative positions in mathematics achievement among their competitors in the global world. However, because of the different cultural and language settings of different countries, these international tests may not function as expected across all countries. Thus, the tests may not be equivalent, or fair, linguistically and culturally across the participating countries. In this context, the present study aimed at assessing the equivalence of the mathematics items of TIMSS 1999 and PISA 2003 across cultures and languages, to find out whether mathematics achievement possesses any culture-specific aspects. For this purpose, the present study assessed the Turkish and English versions of the TIMSS 1999 and PISA 2003 mathematics items with respect to (a) the psychometric characteristics of the items, and (b) possible sources of Differential Item Functioning (DIF) between the two versions. The study used Restricted Factor Analysis, Mantel-Haenszel statistics, and Item Response Theory Likelihood Ratio methodologies to determine DIF items. The results revealed that there were adaptation problems in both the TIMSS and PISA studies. However, it was still possible to determine a subtest of items functioning fairly between cultures, to form a basis for a cross-cultural comparison. In PISA, there was a high rate of agreement among the DIF methodologies used. However, in TIMSS, the agreement rate decreased considerably, possibly because the rate of differentially functioning items within TIMSS was higher, and differential guessing and differential discrimination were also issues in the test. The study also revealed that items requiring the reproduction of practiced knowledge, knowledge of facts, performance of routine procedures, and application of technical skills were less likely to be biased against Turkish students relative to American students at the same ability level. On the other hand, items requiring students to communicate mathematically, items where various results must be compared, and items with a real-world context were less likely to be in favor of Turkish students.
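Of the three methodologies named above, the Mantel-Haenszel statistic is the most compact to illustrate: examinees are stratified by matching score, and a common odds ratio is pooled over strata. A minimal sketch with hypothetical counts; the -2.35 rescaling is the conventional ETS delta metric.

```python
import math

def mantel_haenszel_delta(tables):
    """Pooled Mantel-Haenszel odds ratio and ETS delta for a studied item.

    tables: one (A, B, C, D) tuple per matching-score level, where A/B are
    the reference group's correct/incorrect counts and C/D the focal group's."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    alpha = num / den                      # > 1 favors the reference group
    return alpha, -2.35 * math.log(alpha)  # |delta| >= 1.5 is conventionally large DIF

alpha, delta = mantel_haenszel_delta([(40, 10, 30, 20), (60, 25, 50, 35), (80, 45, 70, 55)])
print(f"alpha = {alpha:.2f}, ETS delta = {delta:.2f}")
```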
|
290 |
Study on scale construction based on Item Response Theory: assessment of proficiency in basic mathematical contents
Fujii, Tânia Robaskiewicz Coneglian 07 May 2018 (has links)
In this work, a study was carried out on the construction of scales based on Item Response Theory (IRT), resulting in the construction and pedagogical interpretation of a knowledge scale to measure proficiency in the mathematical contents needed to follow Calculus and similar subjects by students entering courses in the exact sciences. The mathematical model adopted in this research was the one-dimensional three-parameter logistic model. The item parameters and respondent proficiencies were estimated under a Bayesian approach using the Gibbs sampler, a Markov chain Monte Carlo (MCMC) algorithm, implemented in the OpenBUGS software (Bayesian inference Using Gibbs Sampling), which is directed at the Bayesian analysis of complex models. The BILOG-MG software was also used to compare results. The measurement instrument consisted of a test composed of thirty-six multiple-choice items, each with five alternatives, only one of which is correct.
The items were elaborated based on a reference matrix constructed for this purpose, divided into three themes: “space and form”, “quantities and measures”, and “numbers and operations/algebra and functions”. Each theme is composed of competencies, and each competency describes a skill to be measured. For the proposed scale, a mean of 250 and a standard deviation of 50 were adopted, and levels were selected for interpretation over a range of 75 to 425. To interpret the proposed scale, several methods for positioning anchor items at the selected levels were compared. To interpret the scale over its full range, hierarchical cluster analysis was used to segment the scale into groups, that is, into proficiency bands. The scale was divided into five groups, each characterized on the basis of the items positioned as anchors, using their probabilities of a correct response and their values of the discrimination parameter. Although the results are consistent, they point to the need for an ongoing process of improving the item bank and the proficiency scale.
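A minimal sketch of the two ingredients described above: the one-dimensional three-parameter logistic response probability and the linear transformation onto the reporting scale (mean 250, SD 50), whose span of plus or minus 3.5 standard deviations matches the interpreted range of 75 to 425. The item parameters shown are hypothetical, not estimates from the study.

```python
import numpy as np

def p_correct_3pl(theta, a, b, c):
    """3-PL model: the guessing parameter c sets the lower asymptote."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def to_reporting_scale(theta, mean=250.0, sd=50.0):
    """Map N(0,1) proficiencies onto the (250, 50) reporting scale."""
    return mean + sd * np.asarray(theta, dtype=float)

print(round(p_correct_3pl(0.0, a=1.3, b=-0.2, c=0.2), 3))  # about 0.652
print(to_reporting_scale([-3.5, 0.0, 3.5]))                # [ 75. 250. 425.]
```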
|