Global ETD Search

11	Prompting for progression : How well can GenAI create a sense of progression in a set of multiple-choice questions? / Prompt för progression : Hur bra kan GenAI skapa progression i en uppsättning flervalsfrågor?
Jönsson, August, January 2024

Programming education is on the rise, increasing the need for learning resources at universities and in online courses. Questions are crucial for promoting good learning and for providing students with ample practice opportunities. Learning a subject relies heavily on a structured progression of topics and complexity. Yet creating numerous questions has proven to be a time-consuming task. Recently, the technology world has been introduced to Generative AI (GenAI) systems that use Large Language Models (LLMs) capable of generating large amounts of text and performing other text-related tasks. How can GenAI be used to solve problems related to creating learning materials while ensuring good quality? This study investigates how well GenAI can create a sense of progression in a set of programming questions based on different prompt strategies. The method involves three question-generation cases using the Chat-GPT API, followed by a qualitative evaluation of the questions' complexity, order, and quality. The first case is intended to be the simplest way of asking Chat-GPT to generate 10 multiple-choice questions (MCQs) about a specific topic. The second case introduces defined complexity levels and requests a logical order and a progression in complexity. The final case uses a more advanced prompt, which builds on the second case and adds a skill map as inspiration for the LLM. The skill map is a structured outline that highlights key points when learning a topic. According to the results, providing more instructions together with a skill map improved the progression of the generated questions compared to a simpler prompt. The first-case prompt still resulted in questions in a good order, but without increasing complexity. The results indicate that while GenAI is capable of creating questions with a good progression that could be used in a real teaching context, it still requires quality control of the content to find outliers. Further research should investigate optimal prompts and what constitutes a good skill map.

/ Programming education is becoming increasingly common, which increases the need for learning resources at universities and in online courses. Questions are essential for promoting good learning and giving students opportunities to practise. Learning a subject depends heavily on a structured progression of topics and complexity. However, creating many questions has proven to be a time-consuming task. Recently, the technology world has been introduced to Generative AI (GenAI) systems that use Large Language Models (LLMs), which can generate large amounts of text and perform other text-related tasks. How can GenAI be used to solve problems related to creating teaching materials while ensuring good quality? This study aims to investigate how well GenAI can create a sense of progression in a set of programming questions based on different prompt strategies. The method uses three different ways of generating questions with the Chat-GPT API. A qualitative evaluation of the questions' complexity, order, and quality is then carried out. The first way is intended to be the simplest way of asking Chat-GPT to generate 10 multiple-choice questions about a specific topic. The second case introduces defined complexity levels and requests for logical order and progression in complexity. The last case is the more advanced prompt, which builds on the second case together with a skill map as inspiration. The skill map is a structured outline of a subject that highlights key points when learning it. The results showed that providing more instructions together with a skill map had a better effect on the progression of the generated questions than the first approach. The first prompt still resulted in questions in a good order but lacking increasing complexity. The results indicate that although GenAI can create questions with good progression that could be used in a real teaching context, quality control of the content is still needed to find errors. Further research should investigate optimal prompts and what a good skill map should look like.

Generative AI; LLM; GPT; QBL; Question generation; Computer Sciences
12	Extracting relevant answer phrases from text : For usage in reading comprehension question generation / Extrahering av relevanta svarsfraser från text : För användning vid generering av läsförståelsefrågor
Kärrfelt, Filippa, January 2022

This report presents a method for extracting answer phrases, suitable as answers to reading comprehension questions, from Swedish text. All code used to produce the results is available on GitHub. The method is developed using a Swedish BERT, a pre-trained language model based on neural networks. The BERT model is fine-tuned for three different tasks: two variations of token classification for answer extraction, and one of sentence classification with the goal of identifying relevant sentences. The dataset used for fine-tuning consists of 1814 question-answer pairs posed on 598 different texts, partitioned into a training, a validation, and a test set. The models are assessed individually and are furthermore combined, using a method based on roundtrip consistency, into a system for filtering extracted answer phrases. The results for each of the models, and for the system combining them, are evaluated on both quantitative measures (precision, recall, and Jaccard index) and qualitative measures. In the qualitative evaluation, we both examine results produced by the models and conduct a structured human evaluation with the help of four external evaluators. The final answer extraction model achieves a precision of 0.02 and a recall of 0.95, with an average Jaccard index of 0.55 between the extracted answer phrases and the targets. When the filtering system is applied, the precision is 0.03, the recall 0.50, and the Jaccard index 0.62 on a subset of the test data. The answer extraction model matches the baseline on precision, outperforms it on recall by a large margin, and performs worse than the baseline on Jaccard index. The method applying filtering, which is evaluated on a subset of the test set, has worse precision than the baseline but outperforms it on both recall and Jaccard index. In the qualitative evaluation, we detect some flaws in the grammatical correctness of the extracted answers, as over 50% of them are classified as not grammatically correct. The joint result of the two evaluators on suitability shows that 32% of the grammatically correct answers are suitable as answer phrases.

/ The report presents a method for extracting answer phrases suitable as answers to reading comprehension questions on Swedish text. All code used to produce the results is available on GitHub. The method is based on a Swedish BERT, a trained language model based on neural networks. The BERT model is fine-tuned for three different tasks: two variants of token classification for extracting answer phrases, and one of sentence classification with the goal of identifying relevant sentences. The dataset used for fine-tuning contains 1814 question-answer pairs based on 598 texts, divided into a training, a validation, and a test set. The results are evaluated separately for each model and also for a combined system of the three models. In the combined system, one model extracts potential answer phrases while the other two act as a filter, based on a variant of roundtrip consistency. The results for each model and for the filtering system are evaluated both quantitatively (on precision, recall, and Jaccard index) and qualitatively. Four external evaluators were recruited to evaluate the results on qualitative grounds. The best-performing model reaches a precision of 0.02 and a recall of 0.95, with a mean Jaccard index of 0.55 between the extracted and the correct answer phrases. With the filtering system applied, the results are a precision of 0.03, a recall of 0.50, and a Jaccard index of 0.62 on a subset of the test data. The BERT-based answer phrase extraction model reaches the same result as the baseline on precision, a better result on recall, and a worse result on Jaccard index. The method with filtering, which is evaluated on a subset of the test data, has worse results than the baseline on precision but better results on recall and Jaccard index. In the qualitative evaluation, we find shortcomings in the grammatical correctness of the extracted answer phrases, as more than 50% of them are classified as grammatically incorrect. The combined results of the evaluation of the answer phrases' suitability show that 32% of the grammatically correct answer phrases are suitable as answer phrases.

Answer phrase extraction; Question generation; BERT; Reading comprehension; Neural networks; Computer Sciences
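
The quantitative measures named in this record (precision, recall, and Jaccard index) can be illustrated with a minimal sketch. The abstract does not state how phrases are tokenized or how an extracted phrase is matched to a target, so the whitespace tokenization and exact-string matching below are assumptions, and the function names are hypothetical rather than taken from the thesis code.

```python
def jaccard_index(extracted: str, target: str) -> float:
    """Token-level Jaccard index: |A ∩ B| / |A ∪ B|.

    Assumption: phrases are lowercased and split on whitespace.
    """
    a = set(extracted.lower().split())
    b = set(target.lower().split())
    if not a and not b:
        return 1.0  # two empty phrases are treated as identical
    return len(a & b) / len(a | b)


def precision_recall(extracted: list[str], targets: list[str]) -> tuple[float, float]:
    """Precision and recall over phrase sets.

    Assumption: an extracted phrase counts as correct only if it
    matches a target phrase exactly.
    """
    extracted_set, target_set = set(extracted), set(targets)
    true_positives = len(extracted_set & target_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(target_set) if target_set else 0.0
    return precision, recall


if __name__ == "__main__":
    print(jaccard_index("the red house", "red house"))
    print(precision_recall(["a", "b", "c", "d"], ["a", "e"]))
```

Under these assumptions, a model that extracts many candidate phrases per text can reach high recall with very low precision, which is the pattern the reported figures show.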
13	Reading with Robots: A Platform to Promote Cognitive Exercise through Identification and Discussion of Creative Metaphor in Books
Parde, Natalie, 08 1900

Maintaining cognitive health is often a pressing concern for aging adults, and given the world's shifting age demographics, it is impractical to assume that older adults will be able to rely on individualized human support for doing so. Recently, interest has turned toward technology as an alternative. Companion robots offer an attractive vehicle for facilitating cognitive exercise, but the language technologies guiding their interactions are still nascent; in elder-focused human-robot systems proposed to date, interactions have been limited to motion or buttons and canned speech. The incapacity of these systems to autonomously participate in conversational discourse limits their ability to engage users at a cognitively meaningful level. I addressed this limitation by developing a platform for human-robot book discussions, designed to promote cognitive exercise by encouraging users to consider the authors' underlying intentions in employing creative metaphors. The choice of book discussions as the backdrop for these conversations has an empirical basis in neuro- and social science research that has found that reading often, even in late adulthood, is correlated with a decreased likelihood of exhibiting symptoms of cognitive decline. The more targeted focus on novel metaphors within those conversations stems from prior work showing that processing novel metaphors is a cognitively challenging task for young adults, and even more so for older adults with and without dementia. A central contribution arising from the work was the creation of the first computational method for modelling metaphor novelty in word pairs. I show that the method outperforms baseline strategies as well as a standard metaphor detection approach, and additionally discover that incorporating a sentence-based classifier as a preliminary filtering step when applying the model to new books results in a better final set of scored word pairs. I trained and evaluated my methods using new, large corpora from two sources, and release those corpora to the research community. In developing the corpora, an additional contribution was the discovery that training a supervised regression model to automatically aggregate the crowdsourced annotations outperformed existing label aggregation strategies. Finally, I show that automatically generated questions adhering to the Questioning the Author strategy are comparable to human-generated questions in terms of naturalness, sensibility, and question depth; the automatically generated questions score slightly higher than human-generated questions in terms of clarity. I close by presenting findings from a usability evaluation in which users engaged in thirty-minute book discussions with a robot using the platform, showing that users find the platform to be likeable and engaging.

natural language processing; metaphor; question generation; dialogue systems; corpora; human-robot systems; social robotics; artificial intelligence; cognitive exercise; Computer Science; Human-robot interaction; Evolutionary robotics; Cognition -- Age factors
14	Génération de données synthétiques pour l'adaptation hors-domaine non-supervisée en réponse aux questions : méthodes basées sur des règles contre réseaux de neurones
Duran, Juan Felipe, 02 1900

Question answering models have shown impressive results on several question answering datasets and tasks. However, when they are tested on out-of-domain datasets, performance drops. To avoid manually annotating training data for the new domain, question-answer pairs can be generated synthetically from unannotated data. In this work, we focus on synthetic data generation and test different natural language processing methods for the two steps of dataset creation: question generation and answer generation. We use the generated datasets to train the UnifiedQA and Bert-QA models and test them on SCIQ, an out-of-domain dataset on physics, chemistry, and biology for the multiple-choice question answering task, as well as on HotpotQA, TriviaQA, NatQ, and SearchQA, four out-of-domain datasets for the question answering task. This procedure allows us to evaluate and compare rule-based methods with neural network methods. We show that rule-based methods produce superior results for the multiple-choice question answering task, but that neural network methods generally produce better results for the question answering task. On the other hand, we also observe that, occasionally, rule-based methods can complement neural network methods and produce competitive results when Bert-QA is trained with the synthetic datasets from both methods.

/ Question answering models have shown impressive results on several question answering datasets and tasks. However, when tested on out-of-domain datasets, their performance decreases. To circumvent manually annotating training data from the new domain, question-answer pairs can be generated synthetically from unannotated data. In this work, we are interested in the generation of synthetic data, and we test different Natural Language Processing methods for the two steps of dataset creation: question generation and answer generation. We use the generated datasets to train the QA models UnifiedQA and Bert-QA, and we test them on SCIQ, an out-of-domain dataset about physics, chemistry, and biology for multiple-choice question answering (MCQA), and on HotpotQA, TriviaQA, NatQ, and SearchQA, four out-of-domain datasets for question answering (QA). This procedure allows us to evaluate and compare rule-based methods with neural network methods. We show that rule-based methods yield superior results for the multiple-choice question-answering task, but neural network methods generally produce better results for the question-answering task. However, we also observe that, occasionally, rule-based methods can complement neural network methods and produce competitive results when training Bert-QA with synthetic datasets derived from both methods.

Automatic question generation; Automatic answer generation; Methods based on neural networks; Rule-based methods; Deep learning; Unsupervised learning; Domain adaptation; NLP (Natural Language Processing); Artificial intelligence
