71

Developing a validation process for an adaptive computer-based spoken English language test

Underhill, Nic January 2000 (has links)
This thesis explores the implications for language test validation of developments in language teaching and testing methodology, test validity and computer-based delivery. It identifies a range of features that tests may now exhibit in novel combinations, and concludes that these combinations of features favour a continuing process of validation for such tests. It proposes a validation model designed around a series of cycles that draw on diverse sources of data. The research uses the Five Star test, a private commercial test designed for use in a specific cultural context, as an exemplar of a larger class of tests exhibiting some or all of these features. A range of validation activities on the Five Star test is reported and analysed from two quite different sources: an independent expert panel that scrutinised the test task by task, and an analysis of 460 test results using item response theory (IRT). The validation activities are critically evaluated for the purposes of the model, which is then applied to the Five Star test. A historical overview of language teaching and testing methodology reveals the communicative approach to be the dominant paradigm, but suggests that there is no clear consensus about the key features of this approach or how they combine. The approach has been applied incompletely to language testing, and important aspects of it are identified which remain problematic, especially for the assessment of spoken language. These include the constructs of authenticity, interaction and topicality, whose status in the literature is reviewed and whose determinability in test events is discussed. The evolution of validity in the broader field of educational and psychological testing informs the development of validation in language testing, and a transition is identified away from validity as a one-time activity attached to the test instrument towards validation as a continuing process that informs the interpretation of test results.
In test delivery, this research reports on the validation issues raised by computer-based adaptive testing, particularly with respect to test instruments such as the Five Star test that combine direct face-to-face interaction with computer-based delivery. In the light of the theoretical issues raised and the application of the model to the Five Star test, some implications of the model for use in other test environments are presented critically and recommendations made for its development.
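The IRT analysis of test results that this kind of continuing validation relies on typically starts from a simple dichotomous model such as the Rasch (one-parameter logistic) model. As an illustration only (the thesis does not publish its analysis code), a minimal sketch of the Rasch response probability and a Newton-Raphson ability estimate might look like this; all item difficulties and responses are hypothetical, and the estimator assumes a mixed (not all-correct or all-wrong) response pattern:

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mle_ability(responses, difficulties, iters=50):
    """Newton-Raphson maximum-likelihood estimate of ability theta
    from a 0/1 response vector and known item difficulties."""
    theta = 0.0
    for _ in range(iters):
        ps = [rasch_prob(theta, b) for b in difficulties]
        grad = sum(x - p for x, p in zip(responses, ps))   # score function
        info = sum(p * (1 - p) for p in ps)                # test information
        theta += grad / info
    return theta

# Hypothetical examinee: 3 correct out of 4 items of spread difficulty
theta_hat = mle_ability([1, 1, 1, 0], [-1.0, 0.0, 0.5, 1.5])
```

At the estimate, the expected number correct matches the observed number correct, which is the defining property of the Rasch MLE.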
72

Análise de questionários com itens constrangedores / Analysis of questionnaires with embarrassing items

Mariana Cúri 11 August 2006 (has links)
Psychiatric research often evaluates subjective characteristics of individuals such as depression, anxiety and phobias. Data are collected through questionnaires whose items try to identify the presence or absence of certain symptoms associated with the psychiatric morbidity of interest. Some of these items, however, may embarrass some respondents because they address socially questionable or even illegal characteristics or behaviours. An item response theory model is proposed in this work to differentiate the relationship between the probability of symptom presence and the severity of the morbidity for embarrassed and non-embarrassed individuals. Items that require this differentiation are called items with differential functioning. Additionally, the model allows for the assumption that individuals embarrassed by an item may lie in their answers, omitting the presence of a symptom. Applications of the proposed model to data simulated for 20-item questionnaires showed that the parameter estimates are close to their true values. Estimation quality deteriorates as the sample of individuals shrinks, as the number of items with differential functioning grows, and especially as the number of such items susceptible to lying grows. An application of the model to real data, collected to evaluate depression in adolescents, illustrates the difference in the response pattern of the item "crying spells" between men and women.
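The abstract describes linking symptom-endorsement probability to morbidity severity, with a separate curve for embarrassed respondents who may additionally deny a present symptom. A minimal sketch of that idea using a 2PL item curve, not the thesis's actual model or parameterisation; all parameter values (discriminations, thresholds, the lying probability) are hypothetical:

```python
import math

def two_pl(theta, a, b):
    """2PL probability that a respondent with severity theta endorses a symptom."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical group-specific parameters: embarrassed respondents get a
# higher threshold b, so the same severity yields a lower endorsement rate.
params = {
    "not_embarrassed": (1.2, 0.0),
    "embarrassed":     (1.2, 1.0),
}

def dif_item_prob(theta, group):
    """Group-specific item curve (differential item functioning)."""
    a, b = params[group]
    return two_pl(theta, a, b)

def observed_prob(theta, group, lie_prob=0.3):
    """Embarrassed respondents may deny a present symptom with prob lie_prob."""
    p = dif_item_prob(theta, group)
    if group == "embarrassed":
        p *= (1.0 - lie_prob)
    return p

p_plain = dif_item_prob(0.5, "not_embarrassed")
p_shy = observed_prob(0.5, "embarrassed")
```

The lying adjustment only attenuates endorsement for the embarrassed group, which mirrors the abstract's one-directional omission of symptoms.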
73

The development and evaluation of Africanised items for multicultural cognitive assessment

Bekwa, Nomvuyo Nomfusi 01 1900 (has links)
"Nothing in life is to be feared, it is only to be understood. Now is the time to understand more, so that we may fear less." (Marie Curie) Debates about how best to test people from different contexts and backgrounds continue to dominate testing and assessment. In an effort to contribute to these debates, the purpose of the study was to develop and evaluate the viability and utility of nonverbal figural reasoning ability items inspired by African cultural artefacts such as African material prints, art, decorations, beadwork and paintings. The research was conducted in two phases: phase 1 focused on the development of the new items, and phase 2 on their evaluation. The aims of the study were to develop items inspired by African art and cultural artefacts in order to measure general nonverbal figural reasoning ability; to evaluate the viability of the items in terms of their appropriateness in representing the African art and cultural artefacts, specifically to determine the face and content validity of the items from a cultural perspective; and to evaluate the utility of the items in terms of their psychometric properties. These elements were investigated using an exploratory sequential mixed-method research design with a quantitative component embedded in phase 2. For sampling purposes, a sequential mixed-method sampling design and non-probability sampling strategies were used, specifically purposive and convenience sampling. Data collection methods included interviews with a cultural expert and a colour-blind person, open-ended questionnaires completed by school learners, and test administration to a group of 946 participants undergoing a sponsored basic career-related training and guidance programme. Content analysis was used for the qualitative data, while statistical analysis mainly based on the Rasch model was used for the quantitative data.
The results of phase 1 were positive and provided support for further development of the new items; based on this feedback, 200 new items were developed. This final pool of items was then used for phase 2, the evaluation of the new items. The statistical analysis of the new items indicated acceptable psychometric properties of the general reasoning ("g" or fluid ability) construct. The item difficulty values (p-values) for the new items were determined using classical test theory (CTT) analysis and ranged from 0.06 (most difficult item) to 0.91 (easiest item). Rasch analysis showed that the new items were unidimensional and adequately targeted to the level of ability of the participants, although some elements would need to be improved. The reliability of the new items was determined using the Cronbach alpha reliability coefficient (α) and the person separation index (PSI), and both methods indicated similar indices of internal consistency (α = 0.97; PSI = 0.96). Gender-related differential item functioning (DIF) was investigated, and the majority of the new items did not indicate any significant differences between the gender groups. Construct validity was determined from the relationship between the new items and the Learning Potential Computerised Adaptive Test (LPCAT), which uses traditional item formats to measure fluid ability. The correlations between the total score of the new items and the pre- and post-tests were 0.616 and 0.712 respectively. The new items were thus confirmed to be measuring fluid ability using nonverbal figural reasoning ability items. Overall, the results were satisfactory in indicating the viability and utility of the new items. The main limitation of the research was that, because the sample was not representative of the South African population, there was limited scope for generalisation.
This led to a further limitation, namely that it was not possible to conduct important DIF analyses for various other subgroups. Further research has been recommended to build on this initiative. / Industrial and Organisational Psychology
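The abstract above reports CTT item difficulties (p-values) and Cronbach's alpha, both of which are standard, easily reproduced computations. A self-contained sketch on a toy persons-by-items score matrix (the data are hypothetical, not the study's):

```python
def item_pvalues(scores):
    """Classical difficulty: proportion answering each item correctly.
    scores is a persons-by-items matrix of 0/1 entries."""
    n, k = len(scores), len(scores[0])
    return [sum(row[j] for row in scores) / n for j in range(k)]

def cronbach_alpha(scores):
    """Cronbach's alpha for internal consistency of a 0/1 score matrix."""
    n, k = len(scores), len(scores[0])

    def var(xs):
        # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Toy data: 4 persons x 3 items (hypothetical responses)
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
pvals = item_pvalues(scores)     # [0.75, 0.5, 0.25]
alpha = cronbach_alpha(scores)   # 0.75 on this toy matrix
```

A higher p-value means an easier item, which is why the abstract's 0.06 item is the hardest and the 0.91 item the easiest.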
74

The State of our Toolbox: A Meta-analysis of Reliability Measurement Precision

Duniewicz, Krzysztof 20 November 2012 (has links)
My study investigated internal consistency estimates of psychometric surveys as an operationalization of the state of measurement precision of constructs in industrial and organizational (I/O) psychology. Analyses were conducted on samples used in research articles published in the Journal of Applied Psychology between 1975 and 2010 at five-year intervals (K = 934 samples from 480 articles), yielding 1,427 coefficients. Articles and their respective samples were coded for test-taker characteristics (e.g., age, gender, and ethnicity), research settings (e.g., lab and field studies), and test characteristics (e.g., number of items and scale anchor points). A depository of reliabilities and inter-item correlations was developed for I/O variables and construct groups. Personality measures had significantly lower inter-item correlations than other construct groups. Internal consistency estimates and reporting practices were also evaluated over time, demonstrating improvements in measurement precision and in the reporting of missing data.
75

Teoria e a prática de um teste adaptativo informatizado / Theory and practice of computerized adaptive testing

Gilberto Pereira Sassi 10 April 2012 (has links)
The aim of this work is to present the concepts related to Computerized Adaptive Testing (CAT) for the unidimensional three-parameter logistic model of Item Response Theory. A Bayesian approach is used to estimate the parameter of interest, called the latent trait or ability. The main item selection algorithms for CAT are presented, and simulation studies compare their performance in terms of numerical approximations to the mean squared error and bias of the trait estimates, the average time for item selection, and the average test length. Furthermore, we show how to install and use the CAT implementation developed in this project, called TAI2U, which was built in Microsoft Excel VBA using an interface with the statistical package R.
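Item selection algorithms for CAT with the 3PL model commonly include the maximum Fisher information criterion: administer the unused item that is most informative at the current ability estimate. A minimal sketch of that criterion, not the TAI2U implementation itself; the item bank values (a, b, c) are hypothetical:

```python
import math

def p3pl(theta, a, b, c):
    """3PL probability of a correct response (c is the guessing parameter)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_info(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = p3pl(theta, a, b, c)
    q = 1.0 - p
    return (a ** 2) * (q / p) * ((p - c) / (1.0 - c)) ** 2

def next_item(theta, bank, administered):
    """Pick the unused item with maximum information at the current estimate."""
    return max((i for i in range(len(bank)) if i not in administered),
               key=lambda i: item_info(theta, *bank[i]))

# Hypothetical three-item bank of (a, b, c) triples
bank = [(1.0, -1.0, 0.2), (1.5, 0.0, 0.2), (0.8, 1.0, 0.2)]
choice = next_item(0.0, bank, administered=set())   # picks the b = 0 item
```

With the provisional estimate at 0, the criterion selects the highly discriminating item whose difficulty sits at the estimate, which is the intended behaviour of information-based selection.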
76

Critical Thinking & Test-Item Writing

Merriman, Carolyn S. 01 April 2008 (has links)
No description available.
77

3PL and 4PL Multiprocess Models

Derickson, Ryan 24 May 2022 (has links)
No description available.
78

Linking IRTree Estimates of Within-Person Variability in Personality to Job Performance

Stevenor, Brent A. 11 August 2023 (has links)
No description available.
79

Examination of the Application of Item Response Theory to the Angoff Standard Setting Procedure

Clauser, Jerome Cody 01 September 2013 (has links)
Establishing valid and reliable passing scores is a vital activity for any examination used to make classification decisions. Although there are many different approaches to setting passing scores, this thesis focuses specifically on the Angoff standard setting method. The Angoff method is a test-centric, classical test theory based approach to estimating performance standards. In the Angoff method each judge estimates the proportion of minimally competent examinees who will answer each item correctly. These values are summed across items and averaged across judges to arrive at a recommended passing score. Unfortunately, research has shown that the Angoff method has a number of limitations which have the potential to undermine both the validity and reliability of the resulting standard. Many of these limitations can be linked to its grounding in classical test theory. The purpose of this study is to determine whether the limitations of the Angoff method could be mitigated by a transition to an item response theory (IRT) framework. Item response theory is a modern measurement model that relates examinees' latent ability to their observed test performance. Theoretically, the transition to an IRT-based Angoff method could result in more accurate, stable, and efficient passing scores. The methodology was divided into three studies designed to assess the potential advantages of an IRT-based Angoff method. Study one examined the effect of allowing judges to skip unfamiliar items during the rating process; its goal was to detect whether passing scores are artificially biased by deficits in the content experts' item-level content knowledge. Study two explored the potential benefit of setting passing scores on an adaptively selected subset of test items, attempting to leverage IRT's score invariance property to estimate passing scores more efficiently.
Finally, study three compared IRT-based standards to traditional Angoff standards using a simulation study, to determine whether passing scores set using the IRT Angoff method had greater stability and accuracy than those set using the common True Score Angoff method. Together, these three studies examined the potential advantages of an IRT-based approach to setting passing scores. The results indicate that the IRT Angoff method does not produce more reliable passing scores than the common Angoff method. The transition to the IRT-based approach does, however, effectively ameliorate two sources of systematic error in the common Angoff method: the first is introduced by requiring that all judges rate all items, and the second during the transition from test scores to scaled passing scores. By eliminating these sources of error the IRT-based method allows for accurate and unbiased estimation of the judges' true opinion of the ability of the minimally capable examinee. Although not all of the theoretical benefits of the IRT Angoff method could be demonstrated empirically, the results of this thesis are extremely encouraging: the IRT Angoff method was shown to eliminate two sources of systematic error, resulting in more accurate passing scores. In addition, this thesis provides a strong foundation for a variety of studies with the potential to aid in the selection, training, and evaluation of content experts. Overall, the findings suggest that applying IRT to the Angoff standard setting method has the potential to offer significantly more valid passing scores.
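The common (True Score) Angoff computation that the thesis takes as its baseline, summing each judge's item ratings and averaging across judges, can be sketched in a few lines; the ratings below are hypothetical, purely to show the arithmetic:

```python
def angoff_cut_score(ratings):
    """Traditional Angoff cut score.
    ratings[j][i] is judge j's estimated probability that a minimally
    competent examinee answers item i correctly. Each judge's ratings are
    summed across items, then the sums are averaged across judges."""
    judge_sums = [sum(judge) for judge in ratings]
    return sum(judge_sums) / len(judge_sums)

# Hypothetical panel: 2 judges rating a 3-item test
ratings = [
    [0.6, 0.7, 0.8],   # judge 1
    [0.5, 0.6, 0.9],   # judge 2
]
cut = angoff_cut_score(ratings)   # (2.1 + 2.0) / 2 = 2.05 out of 3
```

Note how requiring every judge to rate every item (the first source of systematic error identified above) is baked into this arithmetic: a skipped item would leave a judge's sum on a different scale from the others.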
80

Applying Longitudinal IRT Models to Small Samples for Scale Evaluation

Keum, EunHee 09 August 2016 (has links)
No description available.
