  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
181

Explainable Neural Claim Verification Using Rationalization

Gurrapu, Sai Charan 15 June 2022
Dependence on Natural Language Processing (NLP) systems has grown significantly in the last decade. Recent advances in deep learning have enabled language models to generate high-quality text on par with human-written text. If this growth continues, it could lead to increased misinformation, which is a significant challenge. Although claim verification techniques exist, they lack proper explainability. Numerical scores such as attention weights and LIME, and visualization techniques such as saliency heat maps, are insufficient because they require specialized knowledge. Black-box NLP systems are therefore inaccessible and hard for the nonexpert to understand. We propose a novel approach called ExClaim for explainable claim verification using NLP rationalization. We demonstrate that our approach can not only predict a verdict for a claim but also justify and rationalize its output as a natural language explanation (NLE). We extensively evaluate the system using statistical and Explainable AI (XAI) metrics to ensure the outcomes are valid, verified, and trustworthy, helping reinforce human-AI trust. We propose a new subfield of XAI called Rational AI (RAI) to advance research on rationalization and NLE-based explainability techniques. Ensuring that claim verification systems are assured and explainable is a step towards trustworthy AI systems and ultimately helps mitigate misinformation. / Master of Science / Dependence on Natural Language Processing (NLP) systems has grown significantly in the last decade. Recent advances in deep learning have enabled text generation models to produce high-quality text on par with human-written text. If this growth continues, it could lead to increased misinformation, which is a major societal challenge. Although claim verification techniques exist, they lack proper explainability, so it is difficult for the average user to understand the model's decision-making process. Numerical scores and visualization techniques exist to provide explainability, but they are insufficient because they require specialized domain knowledge. This makes black-box NLP systems inaccessible and hard for the nonexpert to understand. We propose a novel approach called ExClaim for explainable claim verification using NLP rationalization. We demonstrate that our approach can not only predict a verdict for a claim but also justify and rationalize its output as a natural language explanation (NLE). We extensively evaluate the system using statistical and Explainable AI (XAI) metrics to ensure the outcomes are valid, verified, and trustworthy, helping reinforce human-AI trust. We propose a new subfield of XAI called Rational AI (RAI) to advance research on rationalization and NLE-based explainability techniques. Ensuring that claim verification systems are assured and explainable is a step towards trustworthy AI systems and ultimately helps mitigate misinformation.
182

Propagation of online consumer-perceived negativity: Quantifying the effect of supply chain underperformance on passenger car sales

Singh, A., Jenamani, M., Thakker, J.J., Rana, Nripendra P. 10 April 2021
The paper presents a text analytics framework that analyses online reviews to explore how consumer-perceived negativity about the supply chain propagates over time and how it affects car sales. In particular, the framework integrates aspect-level sentiment analysis using SentiWordNet, time-series decomposition, and the bias-corrected least squares dummy variable (LSDVc) estimator, a panel data estimator. The framework serves the business community by providing a list of consumers' contemporary interests in the form of frequently discussed product attributes; by quantifying the consumer-perceived performance of supply chain (SC) partners and comparing competitors; and by offering a model that assesses various firms' sales performance. The proposed framework is demonstrated on the automobile supply chain using a review dataset obtained from a renowned car portal in India. Our findings suggest that consumer-voiced negativity is highest for dealers and lowest for manufacturing- and assembly-related features. Firm age, GDP, and review volume significantly influence car sales, whereas the sentiments corresponding to SC partners do not. The proposed research framework can help manufacturers inspect their SC partners, identify the critical car sales influencers cited by consumers, and accurately predict sales, which in turn can help them with production planning, supply chain management, marketing, and consumer relationships.
183

Chatting Over Course Material : The Role of Retrieval Augmented Generation Systems in Enhancing Academic Chatbots.

Monteiro, Hélder January 2024
Large Language Models (LLMs) have the potential to enhance student learning. They can be used in chatbot systems that let students ask questions about course material, in particular when combined with so-called Retrieval Augmented Generation systems (RAGs). RAGs give LLMs access to external knowledge, which helps a chatbot produce responses tailored to that material. This thesis studies different RAGs through an experimental approach in which each RAG is built with a different set of parameters and tools, including small and large language models. We suggest which of the RAGs is best suited to high school courses in Physics and undergraduate courses in Mathematics, such that the retrieval systems together with the LLMs return the most relevant answers from the provided course material. We conclude with two RAG-powered LLMs with different configurations that achieve over 64% accuracy in physics and 66% in mathematics.
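The retrieval step that the abstract above describes can be illustrated with a minimal sketch. This is not the thesis's system: the bag-of-words cosine scoring, the function names (`retrieve`, `build_prompt`), and the sample passages are all assumptions chosen only to show the general idea of selecting course passages and handing them to a language model as context.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words counts (a deliberately simple stand-in for embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the question."""
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: cosine(q, tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: list[str], k: int = 2) -> str:
    """Assemble a prompt that places retrieved passages before the question."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Answer using only this course material:\n{context}\n\nQuestion: {question}"


# Hypothetical course snippets, for illustration only.
course_material = [
    "Newton's second law states that force equals mass times acceleration.",
    "The derivative of x squared is two x.",
    "Photosynthesis converts light into chemical energy.",
]
prompt = build_prompt("What is the derivative of x squared?", course_material, k=1)
```

The resulting `prompt` string would then be sent to whichever LLM the chatbot uses; a real system would typically replace the word-count vectors with learned embeddings and a vector index.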
184

Investigating the impact of Generative AI on newcomers' understanding of Software Projects

Larsen, Knud Ronau, Edvall, Magnus January 2024
Context: In both commercial and open-source software development, newcomers often join the development process in the advanced stages of the software development lifecycle. Newcomers frequently face barriers impeding their ability to make early contributions, often caused by a lack of understanding. For this purpose, we have developed an LLM-based tool called SPAC-B that facilitates project-specific question-answering to aid newcomers' understanding of software projects. Objective: Investigate the LLM-based tool's ability to assist newcomers in understanding software projects by measuring its accuracy and conducting an experiment. Method: In this study, a case study is conducted to investigate the accuracy of the tool, measured in relevance, completeness, and correctness. Furthermore, an experiment is performed among software developers to test the tool's ability to help newcomers formulate better plans for open-source issues. Results: SPAC-B achieved an accuracy of 4.60 in relevance, 4.30 in completeness, and 4.28 in correctness on a scale from 1 to 5. It improved the combined mean score of the plans of the 10 participants in our experiments from 1.90 to 2.70, and 8 out of 10 participants found the tool helpful. Conclusions: SPAC-B has demonstrated high accuracy and helpfulness, but further research is needed to confirm if these results can be generalized to a larger population and other contexts of use.
185

TEXT ANNOTATION IN PARLIAMENTARY RECORDS USING BERT MODELS

Eriksson, Fabian January 2024
This thesis investigates whether a transformer-based language model can be improved in three ways: by training the model on context sequences, which are input sequences with a larger window of text; by combining a transformer model with a neural network for non-text features; or by domain-adaptive pre-training. Two types of context input sequences are tested: left context and full context. The three modifications are explored by applying BERT models to the Swedish Parliamentary Corpus to classify whether a text sequence is a heading. A standard BERT model is trained for sequence classification alongside a position model, which adds a feedforward neural network to the model. Each model is trained with and without context sequences as well as with and without domain-adaptive pre-training. A standard implementation of the BERT model with domain adaptation achieves an F1 score of 0.9358 and an accuracy of 0.9940 on the test set. The best performing standard BERT model with a context input sequence achieves an F1 of 0.9636 and an accuracy of 0.9966, while the best performing position model achieves an F1 of 0.9550 and an accuracy of 0.9957. The best performing model that combines context input sequences with the position model achieves an F1 of 0.9908 and an accuracy of 0.9991 on the test set. Analysis of misclassified sequences suggests that the models with context input sequences and positional features are less likely to misclassify sequences that can appear both as a heading and as a non-heading in the corpus. However, McNemar's exact test indicates that only a position model with left context input sequences differs significantly from its standard BERT counterpart in the number of differing misclassifications at a 5% significance level. Furthermore, there is no experimental evidence that domain-adaptive pre-training improves classification performance on this specific sequence classification task.
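For readers unfamiliar with the significance test mentioned in the abstract above, the following is a minimal sketch of a two-sided McNemar's exact test for comparing two classifiers on the same test set. It is an illustration under stated assumptions, not the thesis's code: the function names and the toy labels are hypothetical, and the p-value is computed from the binomial distribution over the discordant pairs only.

```python
from math import comb


def discordant_counts(y_true, pred_a, pred_b):
    """Count test items where exactly one of the two models is correct.

    Returns (b, c): b = model A right and model B wrong,
                    c = model A wrong and model B right.
    """
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact p-value: under H0 each discordant pair is a fair coin flip."""
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical heading / non-heading labels (1 = heading), for illustration only.
y_true = [1, 1, 0, 0]
pred_a = [1, 0, 0, 1]
pred_b = [1, 1, 1, 1]
b, c = discordant_counts(y_true, pred_a, pred_b)
p_value = mcnemar_exact(b, c)
```

A p-value below 0.05 would correspond to a significant difference at the 5% level used in the abstract. Items that both models classify correctly, or both misclassify, do not enter the test.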
186

Deep Learning for Uncertainty Measurement

Kim, Alisa 12 February 2021
This thesis focuses on the problem of uncertainty measurement and its impact on business decisions while pursuing two goals. The first is to develop and validate accurate and robust models for uncertainty quantification, employing both well-established statistical models and newly developed machine learning tools, with a particular focus on deep learning. The second goal concerns the industrial application of the proposed models, applying them to real-world cases in which measuring volatility or making a risky decision entails a direct and substantial gain or loss. The thesis starts with the exploration of implied volatility (IV) as a proxy for investors' perception of uncertainty for a new class of assets, cryptocurrencies. The second paper focuses on methods to identify risk-loving traders and employs the DNN infrastructure to investigate further the risk-taking behavior of market actors, which both stems from and perpetuates uncertainty. The third paper addresses the challenging endeavor of fraud detection and offers a decision support model that allows a more accurate and interpretable evaluation of financial reports submitted for audit. Given the importance of risk assessment and agents' expectations in economic development, and building on the existing work of Baker (2016) and their economic policy uncertainty (EPU) index, the fourth paper offers a novel DL-NLP-based method for quantifying economic policy uncertainty. In summary, this thesis offers insights that are highly relevant to both researchers and practitioners. The new deep learning-based solutions exhibit superior performance to existing approaches for quantifying and explaining economic uncertainty, allowing for more accurate forecasting, enhanced planning capacities, and mitigated risks. The offered use cases provide a road map for further development of DL tools in practice and constitute a platform for further research.
187

Ancient Geography goes digital: Representation of Spatial Orientation in Ancient Texts

Thiering, Martin, Goerz, Guenther, Ilysushechkina, Ekaterina 19 March 2018
No description available.
188

Dependency Syntax in the Automatic Detection of Irony and Stance

Cignarella, Alessandra Teresa 29 November 2021
[EN] The present thesis is part of the broad panorama of studies in Natural Language Processing (NLP). In particular, it is a work of Computational Linguistics (CL) designed to study in depth the contribution of syntax to the field of sentiment analysis and, therefore, to study texts extracted from social media or, more generally, online content. Furthermore, given the recent interest of the scientific community in the Universal Dependencies (UD) project, which proposes a morphosyntactic annotation format aimed at creating a "universal" representation of morphology and syntax across many languages, this work uses that format with a multilingual perspective in mind (Italian, English, French and Spanish). The thesis provides an exhaustive presentation of the UD morphosyntactic annotation format, underlining in particular the most relevant issues regarding its application to user-generated content (UGC). The final aim is to analyze and verify whether these morphosyntactic annotations yield useful information for models that detect irony and stance. Two tasks are presented and used as case studies to test the research hypotheses: the first is in the field of automatic Irony Detection and the second in the area of Stance Detection. In both cases, historical notes are provided to give the reader context, the problems faced are introduced, and the activities proposed in the computational linguistics community are described. Particular attention is paid to the resources currently available as well as to those developed specifically for the study of these phenomena. Finally, through the description of a series of experiments, both within evaluation campaigns and within independent studies, I try to describe the contribution that syntax can make to solving such tasks.
This thesis is a revised collection of the work of my PhD and sits within the growing trend of studies devoted to making Artificial Intelligence results more explainable, going beyond the achievement of the highest scores on tasks and instead making their motivations understandable for experts in the domain. The novel contribution of this work mainly consists in the exploitation of features based on morphology and dependency syntax, which were used to create vector representations of social media texts in various languages and for two different tasks. These features were then paired with a variety of machine learning classifiers, with some neural networks, and with the language model BERT. Results suggest that fine-grained dependency-based syntactic information is highly informative for the detection of irony, and less informative for stance detection. Nonetheless, dependency syntax might still prove useful for stance detection if irony detection is first applied as a preprocessing step. I also believe that the dependency syntax approach that I propose could shed some light on the explainability of a difficult pragmatic phenomenon such as irony. / Cignarella, AT. (2021). Dependency Syntax in the Automatic Detection of Irony and Stance [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/177639 / TESIS
189

Evaluation of Approaches for Representation and Sentiment of Customer Reviews / Utvärdering av tillvägagångssätt för representation och uppfattning om kundrecensioner

Giorgis, Stavros January 2021
Sentiment classification of customer reviews is a real-world application for many companies that offer text analytics and opinion extraction on customer reviews in domains such as consumer electronics, hotels, restaurants, and car rental agencies. Recent progress in Natural Language Processing has produced many new state-of-the-art approaches for representing the meaning of sentences, phrases, and words in text using vector space models, so-called embeddings. In this thesis, we evaluate the most current and most popular text representation techniques against traditional methods as a baseline. The evaluation compares automatic analyses with human judgments on a dataset of customer reviews of varying length from different domains, provided by a text analysis company. Through an exploration of training datasets, we evaluate which datasets are the most suitable for this specific task. Furthermore, we explore different techniques that could be used to alter a language model's decisions without retraining it. Finally, all the methods are evaluated on their time performance and resource requirements to present an overall experimental assessment that could help the company decide which technique is the most appropriate to replace its system in a production environment.
190

A Prompting Framework for Natural Language Processing in the Medical Field : Assessing the Potential of Large Language Models for Swedish Healthcare / Ett ramverk för behandling av naturliga språkmodeller inom hälso- och sjukvården : Bedömningen av potentialen hos stora språkmodeller inom svensk sjukvård

Mondal, Anim January 2023
The increasing digitisation of healthcare through the use of technology and artificial intelligence has affected the medical field in a multitude of ways. Generative Pre-trained Transformers (GPTs) are a collection of language models that have been trained on an extensive data set to generate human-like text and have been shown to achieve a strong understanding of natural language. This thesis investigates whether GPT-SW3, a large language model for the Swedish language, is capable of responding to healthcare tasks accurately given prompts and context. To reach this goal, a framework was created to evaluate GPT-SW3's abilities in three areas: general medical questions, medical reasoning, and conversations between a doctor and a patient. Each component has a ground truth that is used when evaluating the responses. Based on the results, GPT-SW3 is capable of dealing with specific medical tasks and shows, in particular instances, signs of understanding. In more basic tasks, GPT-SW3 manages to provide adequate answers to some questions. In more advanced scenarios, such as conversation and reasoning, GPT-SW3 struggles to provide coherent answers reminiscent of a human doctor's conversation. While there have been great advancements in natural language processing, further work on a Swedish model will have to be conducted to create a model that is useful for healthcare. Whether that work lies in fine-tuning the weights of the models or in retraining the models with domain-specific data is left for subsequent work.
