Global ETD Search

71	Low-resource suicide ideation and depression detection with multitask learning and large language models Breau, Pierre-William 08 1900 Nous évaluons des méthodes de traitement automatique du langage naturel (TALN) pour la détection d’idées suicidaires, de la dépression et de l’anxiété à partir de publications sur les médias sociaux. Comme les ensembles de données relatifs à la santé mentale sont rares et généralement de petite taille, les méthodes classiques d’apprentissage automatique ont traditionnellement été utilisées dans ce domaine. Nous évaluons l’effet de l’apprentissage multi-tâche sur la détection d’idées suicidaires en utilisant comme tâches auxiliaires des ensembles de données disponibles publiquement pour la détection de la dépression et de l’anxiété, ainsi que la classification d’émotions et du stress. Nous constatons une hausse de la performance de classification pour les tâches de détection d’idées suicidaires, de la dépression et de l’anxiété lorsqu’elles sont entraînées ensemble en raison de similitudes entre les troubles de santé mentale à l’étude. Nous observons que l’utilisation d’ensembles de données publiquement accessibles pour des tâches connexes peut bénéficier à la détection de problèmes de santé mentale. Nous évaluons enfin la performance des modèles ChatGPT et GPT-4 dans des scénarios d’apprentissage zero-shot et few-shot. GPT-4 surpasse toutes les autres méthodes testées pour la détection d’idées suicidaires. De plus, nous observons que ChatGPT bénéficie davantage de l’apprentissage few-shot, car le modèle fournit un haut taux de réponses non concluantes si aucun exemple n’est présenté. Enfin, une analyse des faux négatifs produits par GPT-4 pour la détection d’idées suicidaires conclut qu’ils sont dus à des erreurs d’étiquetage plutôt qu’à des lacunes du modèle. / In this work we explore natural language processing (NLP) methods for suicide ideation, depression, and anxiety detection in social media posts.
Since annotated mental health data is scarce and difficult to come by, classical machine learning methods have traditionally been employed on this type of task due to the small size of the datasets. We evaluate the effect of multi-task learning on suicide ideation detection using publicly-available datasets for depression, anxiety, emotion and stress classification as auxiliary tasks. We find that classification performance of suicide ideation, depression, and anxiety is improved when trained together because of the proximity between the mental disorders. We observe that publicly-available datasets for closely-related tasks can benefit the detection of certain mental health conditions. We then perform classification experiments using ChatGPT and GPT-4 using zero-shot and few-shot learning, and find that GPT-4 obtains the best performance of all methods tested for suicide ideation detection. We further observe that ChatGPT benefits the most from few-shot learning as it struggles to give conclusive answers when no examples are provided. Finally, an analysis of false negative results for suicide ideation output by GPT-4 concludes that they are due to labeling errors rather than mistakes from the model. Modèles de langage Idées suicidaires Classification de textes Apprentissage multitâche Language models Suicide ideation Text classification Multitask learning
72	Repairing Swedish Automatic Speech Recognition / Korrigering av Automatisk Taligenkänning för Svenska Rehn, Karla January 2021 The quality of automatic speech recognition has increased dramatically in the last few years, but the performance for low- and middle-resource languages such as Swedish is still far from optimal. In this project, KB-BERT, a language model trained on large written corpora, is utilized to improve the quality of transcriptions for Swedish. The large language model is inserted as a repairing module after the automatic speech recognition, aiming to repair the original output into a transcription more closely resembling the ground truth by using a sequence-to-sequence translation approach. Two automatic speech recognition models are used to transcribe the speech: one is developed in this project using the Kaldi framework, and the other is Microsoft’s Azure Speech to text platform. The performance of the translator is evaluated with four different datasets, three consisting of read speech and one of spontaneous speech. The spontaneous speech and one of the read datasets include both native and non-native speakers. The performance is measured by three different metrics: word error rate, a weighted word error rate, and a semantic similarity measure. The repairs improve the transcriptions of two of the read speech datasets significantly, decreasing the word error rate from 13.69% to 3.05% and from 36.23% to 21.17%. The repairs improve the word error rate from 44.38% to 44.06% on the data with spontaneous speech, and fail on the last read dataset, instead increasing the word error rate. The lower performance on the latter is likely due to lack of data. / Automatisk taligenkänning har förbättrats de senaste åren, men för små språk såsom svenska är prestandan fortfarande långt ifrån optimal.
Det här projektet använder KB-BERT, en neural språkmodell tränad på stora mängder skriven text, för att förbättra kvalitén på transkriptioner av svenskt tal. Transkriptionerna kommer från två olika taligenkänningsmodeller, dels en utvecklad i det här projektet med hjälp av mjukvarubiblioteket Kaldi, dels Microsoft Azures plattform för tal till text. Transkriptionerna repareras med hjälp av en sequence-to-sequence översättningsmodell, och KB-BERT används för att initiera modellen. Översättningen sker från den urpsrungliga transkriptionen från en av tal-till-text-modellerna till en transkription som är mer lik den korrekta, faktiska transkriptionen. Kvalitéen på reparationerna evalueras med tre olika metriker, på fyra olika dataset. Tre av dataseten är läst tal och det fjärde spontant, och det spontana talet samt ett av de lästa dataseten kommer både från talare som har svenska som modersmål, och talare som har det som andraspråk. De tre metrikerna är word error rate, en viktad word error rate, samt ett mått på semantisk likhet. Reparationerna förbättrar transkriptionerna från två av de lästa dataseten markant, och sänker word error rate från 13.69% till 3.05% och från 36.23% till 21.17%. På det spontana talet sänks word error rate från 44.38% till 44.06%. Reparationerna misslyckas på det fjärde datasetet, troligen på grund av dess lilla storlek. Automatic speech recognition Dialogue systems Language models ASR Repair Automatisk taligenkänning Dialogsystem Språkmodeller Reparation av taligenkänning Computer and Information Sciences Data- och informationsvetenskap
73	Active Learning for Named Entity Recognition with Swedish Language Models / Aktiv Inlärning för Namnigenkänning med Svenska Språkmodeller Öhman, Joey January 2021 The recent advancements in Natural Language Processing have cleared the path for many new applications. This is primarily a consequence of the transformer model and the transfer-learning capabilities provided by models like BERT. However, task-specific labeled data is required to fine-tune these models. To alleviate the expensive process of labeling data, Active Learning (AL) aims to maximize the information gained from each label. By including a model in the annotation process, the informativeness of each unlabeled sample can be estimated, allowing human annotators to focus on vital samples and avoid redundancy. This thesis investigates to what extent AL can accelerate model training with respect to the number of labels required. In particular, the focus is on pre-trained Swedish language models in the context of Named Entity Recognition. The data annotation process is simulated using existing labeled datasets to evaluate multiple AL strategies. Experiments are evaluated by analyzing the F1 score achieved by models trained on the data selected by each strategy. The results show that AL can significantly accelerate the model training and hence reduce the manual annotation effort. The state-of-the-art strategy for sentence classification, ALPS, shows no sign of accelerating the model training. However, uncertainty-based strategies consistently outperform random selection. Under certain conditions, these strategies can reduce the number of labels required by more than a factor of two. / Framstegen som nyligen har gjorts inom naturlig språkbehandling har möjliggjort många nya applikationer. Det är mestadels till följd av transformer-modellerna och lärandeöverföringsmöjligheterna som kommer med modeller som BERT.
Däremot behövs det fortfarande uppgiftsspecifik annoterad data för att finjustera dessa modeller. För att lindra den dyra processen att annotera data, strävar aktiv inlärning efter att maximera informationen som utvinns i varje annotering. Genom att inkludera modellen i annoteringsprocessen, kan man estimera hur informationsrikt varje träningsexempel är, och på så sätt låta mänskilga annoterare fokusera på viktiga datapunkter. Detta examensarbete utforskar hur väl aktiv inlärning kan accelerera modellträningen med avseende på hur många annoterade träningsexempel som behövs. Fokus ligger på förtränade svenska språkmodeller och uppgiften namnigenkänning. Dataannoteringsprocessen simuleras med färdigannoterade dataset för att evaluera flera olika strategier för aktiv inlärning. Experimenten evalueras genom att analysera den uppnådda F1-poängen av modeller som är tränade på datapunkterna som varje strategi har valt. Resultaten visar att aktiv inlärning har en signifikant förmåga att accelerera modellträningen och reducera de manuella annoteringskostnaderna. Den toppmoderna strategin för meningsklassificering, ALPS, visar inget tecken på att kunna accelerera modellträningen. Däremot är osäkerhetsbaserade strategier är konsekvent bättre än att slumpmässigt välja datapunkter. I vissa förhållanden kan dessa strategier reducera antalet annoteringar med mer än en faktor 2. Active learning Named entity recognition Language models Natural language processing Bert Swedish Aktiv inlärning Namnigenkänning Språkmodeller Naturlig språkbehandling Bert Svenska Computer and Information Sciences Data- och informationsvetenskap
74	Text Content Features for Hybrid Recommendations : Pre-trained Language Models for Better Recommendations Lazarova, Mariya January 2021 Nowadays, with the ever growing availability of options in many areas of our lives, it is crucial to have good ways to navigate our choices. This is why the role of recommendation engines is growing more important. Recommenders are often based on user-item interaction. In many areas like news and podcasts, however, by the time there is enough interaction data for an item, the item has already become irrelevant. This is why incorporating content features is desirable, as the content does not depend on the popularity or novelty of an item. Very often, there is text describing an item, so text features are good candidates for features within recommender systems. Within Natural Language Processing (NLP), pre-trained language models based on the Transformer architecture have brought a revolution in recent years, achieving state-of-the-art performance on many language tasks. Because of this, it is natural to explore how such models can play a role within recommendation systems. This work lies at the intersection of NLP and recommendation systems, where we investigate the effects of adding BERT-based encodings of titles and descriptions of movies and books to a recommender system. The results show that even in off-the-shelf BERT models there is a considerable amount of information on movie and book similarity. They also show that BERT-based representations could be used in a recommender system for user recommendation to combine the best of collaborative and content representations. In this thesis, it is shown that adding deep pre-trained language model representations could improve a recommender system’s capability to predict good items for users by up to 0.43 AUC-ROC score for a shallow model, and 0.017 AUC-ROC score for a deeper model.
It is also shown that SBERT can be fine-tuned to encode item similarity with up to 0.03 nDCG and up to 0.05 nDCG@10 score improvement. / Med den ständigt växande tillgängligheten av val i många delar av våra liv har det blivit viktigt att enkelt kunna navigera kring olika alternativ. Det är därför rekommendationssystems har blivit viktigare. Rekommendationssystem baseras ofta på interaktion-historiken mellan användare och artikel. När tillräckligt mycket data inom nyheter och podcast har hunnits samlats in för att utföra en rekommendation så har artikeln hunnit bli irrelevant. Det är därför det är önskvärt att införa innehållsfunktioner till rekommenderaren, då innehållet inte är beroende av popularitet eller nymodigheten av artikeln. Väldigt ofta finns det text som beskriver en artikel vilket har lett till textfunktioner blivit bra kandidater som funktion för rekommendationssystem. Inom Naturlig Språkbehandling (NLP), har förtränande språkmodeller baserad på transformator arkitekturen revolutionerat området de senaste åren. Den nya arkitekturen har uppnått toppmoderna resultat på flertal språkuppgifter. Tack vare detta, har det blivit naturligt att utforska hur sådana modeller kan fungera inom rekommendationssystem. Det här arbetet är mellan två områden, NLP och rekommendationssystem. Arbetet utforskar effekten av att lägga till BERT-baserade kodningar av titel och beskrivning av filmer, samt böcker till ett rekommendationssystem. Resultaten visar att även i förpackade BERT modeller finns det mycket av information om likheter mellan film och böcker. Resultaten visar även att BERT representationer kan användas i rekommendationssystem för användarrekommendationer, i kombination med kollaborativa och artikel baserade representationer. Uppsatsen visar att lägga till förtränade djupspråkmodell representationer kan förbättra rekommendationssystemens förmåga att förutsäga bra artiklar för användare. 
Förbättringarna är upp till 0.43 AUC-ROC poäng för en grundmodell, samt 0.017 AUC-ROC poäng för en djupmodell. Uppsatsen visar även att SBERT kan bli finjusterad för att koda artikel likhet med upp till 0.03 nDCG och upp till 0.05 nDCG@10 poängs förbättring. Recommendation Systems Natural Language Processing Pre-trained language models BERT Two-tower networks Rekommendationssystem Naturlig språkbehandling Förtränande språkmodeller BERT Två-tornnätverk. Other Computer and Information Science Annan data- och informationsvetenskap
75	Bidirectional Encoder Representations from Transformers (BERT) for Question Answering in the Telecom Domain. : Adapting a BERT-like language model to the telecom domain using the ELECTRA pre-training approach / BERT för frågebesvaring inom telekomdomänen : Anpassning till telekomdomänen av en BERT-baserad språkmodell genom ELECTRA-förträningsmetoden Holm, Henrik January 2021 The Natural Language Processing (NLP) research area has seen notable advancements in recent years, one being the ELECTRA model which improves the sample efficiency of BERT pre-training by introducing a discriminative pre-training approach. Most publicly available language models are trained on general-domain datasets. Thus, research is lacking for niche domains with domain-specific vocabulary. In this paper, the process of adapting a BERT-like model to the telecom domain is investigated. For efficiency in training the model, the ELECTRA approach is selected. For measuring target-domain performance, the Question Answering (QA) downstream task within the telecom domain is used. Three domain adaption approaches are considered: (1) continued pre-training on telecom-domain text starting from a general-domain checkpoint, (2) pre-training on telecom-domain text from scratch, and (3) pre-training from scratch on a combination of general-domain and telecom-domain text. Findings indicate that approach 1 is both inexpensive and effective, as target-domain performance increases are seen already after small amounts of training, while generalizability is retained. Approach 2 shows the highest performance on the target-domain QA task by a wide margin, albeit at the expense of generalizability. Approach 3 combines the benefits of the former two by achieving good performance on QA both in the general domain and the telecom domain. At the same time, it allows for a tokenization vocabulary well-suited for both domains.
In conclusion, the suitability of a given domain adaption approach is shown to depend on the available data and computational budget. Results highlight the clear benefits of domain adaption, even when the QA task is learned through behavioral fine-tuning on a general-domain QA dataset due to insufficient amounts of labeled target-domain data being available. / Dubbelriktade språkmodeller som BERT har på senare år nått stora framgångar inom språkteknologiområdet. Flertalet vidareutvecklingar av BERT har tagits fram, bland andra ELECTRA, vars nyskapande diskriminativa träningsprocess förkortar träningstiden. Majoriteten av forskningen inom området utförs på data från den allmänna domänen. Med andra ord finns det utrymme för kunskapsbildning inom domäner med områdesspecifikt språk. I detta arbete utforskas metoder för att anpassa en dubbelriktad språkmodell till telekomdomänen. För att säkerställa hög effektivitet i förträningsstadiet används ELECTRA-modellen. Uppnådd prestanda i måldomänen mäts med hjälp av ett frågebesvaringsdataset för telekom-området. Tre metoder för domänanpassning undersöks: (1) fortsatt förträning på text från telekom-området av en modell förtränad på den allmänna domänen; (2) förträning från grunden på telekom-text; samt (3) förträning från grunden på en kombination av text från telekom-området och den allmänna domänen. Experimenten visar att metod 1 är både kostnadseffektiv och fördelaktig ur ett prestanda-perspektiv. Redan efter kort fortsatt förträning kan tydliga förbättringar inom frågebesvaring inom måldomänen urskiljas, samtidigt som generaliserbarhet kvarhålls. Tillvägagångssätt 2 uppvisar högst prestanda inom måldomänen, om än med markant sämre förmåga att generalisera. Metod 3 kombinerar fördelarna från de tidigare två metoderna genom hög prestanda dels inom måldomänen, dels inom den allmänna domänen. Samtidigt tillåter metoden användandet av ett tokenizer-vokabulär väl anpassat för båda domäner. 
Sammanfattningsvis bestäms en domänanpassningsmetods lämplighet av den respektive situationen och datan som tillhandahålls, samt de tillgängliga beräkningsresurserna. Resultaten påvisar de tydliga vinningar som domänanpassning kan ge upphov till, även då frågebesvaringsuppgiften lärs genom träning på ett dataset hämtat ur den allmänna domänen på grund av otillräckliga mängder frågebesvaringsdata inom måldomänen. Deep Learning Natural Language Understanding Transformers Language Models Representation Learning Domain Adaption Representationsinlärning Djupinlärning Språkteknologi Transformatorer Språkmodeller Domänanpassning Computer and Information Sciences Data- och informationsvetenskap
76	Поддержка принятия решений на разных этапах ЖЦ ИТ-инновации с использованием ChatGPT : магистерская диссертация / Decision-making support at different stages of LC IT-innovation using ChatGPT Ерицян, Г. А., Yeritsyan, G. A. January 2023 Данная работа исследует использование ChatGPT, мощной модели искусственного интеллекта, для поддержки принятия решений на различных этапах жизненного цикла ИТ-инновации. Работа основана на анализе эффективности и применимости ChatGPT в контексте поддержки принятия решений в научно-исследовательских работах и процессах разработки программного обеспечения. Исследование описывает принципы взаимодействия с ChatGPT, его способность анализировать и обрабатывать текстовую информацию, а также его возможности в генерации содержательных и контекстно связанных ответов в контексте решаемых задач в научно-исследовательских работах и тестировании программного обеспечения. В работе рассматривается потенциальное использование ChatGPT для поддержки принятия решений, для определения преимуществ и ограничений. / This paper explores the use of ChatGPT, a powerful artificial intelligence model, to support decision-making at various stages of the IT innovation lifecycle. The work is based on the analysis of the effectiveness and applicability of ChatGPT in the context of decision support in scientific research and software development processes. The study describes the principles of interaction with ChatGPT, its ability to analyze and process textual information, as well as its capabilities in generating meaningful and contextually related answers in the context of tasks being solved in research and software testing. The paper discusses the potential use of ChatGPT to support decision-making, in order to identify its advantages and limitations. ПРИНЯТИЕ РЕШЕНИЙ CHATGPT MASTER'S THESIS DECISION MAKING CHATGPT BIG LANGUAGE MODELS IT INNOVATION LIFECYCLE
77	Active Learning for Extractive Question Answering Marti Roman, Salvador January 2022 Data labelling for question answering (QA) tasks is a costly procedure that requires oracles to read lengthy excerpts of text and reason to extract an answer for a given question from within the text. QA is a task in natural language processing (NLP), where a majority of recent advancements have come from leveraging the vast corpora of unlabelled and unstructured text available online. This work aims to extend this trend in the efficient use of unlabelled text data to the problem of selecting which subset of samples to label in order to maximize performance. This practice of selective labelling is called active learning (AL). Recent developments in AL for NLP have introduced the use of self-supervised learning on large corpora of text in the labelling process of samples for classification problems. This work adapts this research to the task of question answering and performs an initial exploration of expected performance. The methods covered in this work use uncertainty estimates obtained from neural networks to guide an incremental labelling process. These estimates are obtained from transformer-based models, previously trained in a self-supervised manner, by calculating the entropy of the confidence scores or with an approximation of Bayesian uncertainty obtained through Monte Carlo dropout. These methods are evaluated on two different benchmarking QA datasets: SQuAD v1 and TriviaQA. Several factors are observed to influence the behaviour of these uncertainty-based acquisition functions, including the choice of language model used, the presence of unanswered questions and the acquisition size used in the incremental process. The study produces no evidence to support that averaging or selecting maximal uncertainty values between the classification of an answer’s starting and ending positions affects sample acquisition quality.
However, language model choice, the presence of unanswerable questions and acquisition size are all identified as key factors affecting consistency between runs and degree of success. Machine Learning Deep Learning Active Learning Natural Language Processing NLP Question Answering Transformers Uncertainty Language Models Probability Theory and Statistics Sannolikhetsteori och statistik
78	Broad-domain Quantifier Scoping with RoBERTa Rasmussen, Nathan Ellis 10 August 2022 No description available. Linguistics quantifiers quantifier scope disambiguation explanatory text Simple English Wikipedia corpus annotation inter-annotator agreement RoBERTa self-trained language models transfer learning span pair classification
79	Exploring source languages for Faroese in single-source and multi-source transfer learning using language-specific and multilingual language models Fischer, Kristóf January 2024 Cross-lingual transfer learning has been the driving force of low-resource natural language processing in recent years, relying on massively multilingual language models with hopes of solving the data scarcity issue for languages with a limited digital presence. However, this "one-size-fits-all" approach is not equally applicable to all low-resource languages, suggesting limitations of such models in cross-lingual transfer. Besides, known similarities and phylogenetic relationships between source and target languages are often overlooked. In this work, the emphasis is placed on Faroese, a low-resource North Germanic language with several closely related resource-rich sibling languages. The cross-lingual transfer potential from these strong Scandinavian source candidates, as well as from additional genetically related, geographically proximate, and syntactically similar source languages, is studied in single-source and multi-source experiments, in terms of Faroese syntactic parsing and part-of-speech tagging. In addition, the effect of task-specific fine-tuning on monolingual, linguistically informed smaller multilingual, and massively multilingual pre-trained language models is explored. The results suggest Icelandic as a strong source candidate, but only when fine-tuning a monolingual model. With multilingual models, task-specific fine-tuning in Norwegian and Swedish seems even more beneficial. Although they do not surpass fully Scandinavian fine-tuning, models trained on genetically related and syntactically similar languages produce good results.
Additionally, the findings indicate that multilingual models outperform models pre-trained on a single language, and that even better results can be achieved using a smaller, linguistically informed model, compared to a massively multilingual one. low-resource languages natural language processing faroese dependency parsing part-of-speech tagging scandinavian languages multilingual language models
80	Vocal communication in bonobos (Pan paniscus) : studies in the contexts of feeding and sex Clay, Zanna January 2011 Despite having been discovered nearly 80 years ago, bonobos (Pan paniscus) are still one of the least well understood of the great apes, largely remaining in the shadow of their better known cousins, the chimpanzees (Pan troglodytes). This is especially evident in the domain of communication, with bonobo vocal behaviour still a neglected field of study, especially compared to that of chimpanzees. In this thesis, I address this issue by exploring the natural vocal communication of bonobos and its underlying cognition, focusing on the role that vocalisations play during two key contexts, food discovery and sex. In the context of food discovery, I combine observational and experimental techniques to examine whether bonobos produce and understand vocalisations that convey meaningful information about the quality of food encountered by the caller. Results indicate that bonobos produce an array of vocalisations when finding food, and combine different food-associated calls together into sequences in a way that relates to perceived food quality. In a subsequent playback study, it was demonstrated that receivers are able to extract meaning about perceived food quality by attending to these calls and integrating information across call sequences. In the context of sexual interactions, I examine the acoustic structure of female copulation calls, as well as patterns in call usage, to explore how these signals are used by individuals. My results show that females emit copulation calls in similar ways with both male and female partners, suggesting that these signals have become partly divorced from a function in reproduction, to assume a greater social role.
Overall, my results highlight the relevance of studying primate vocalisations to investigate the underlying cognition and suggest that vocalisations are important behavioural tools for bonobos to navigate their social and physical worlds. 591.5
