Global ETD Search

21	Pretraining Deep Learning Models for Natural Language Understanding Shao, Han 18 May 2020 (has links) No description available. Computer Science Machine learning NLP Deep learning
22	Can artificial intelligence replace humans in programming? Ekedahl, Hampus, Helander, Vilma January 2023 (has links) Recent developments in artificial intelligence have brought forth natural language models like ChatGPT, which exhibits abilities in tasks such as language translation, text generation, and interactive conversation. Notably, ChatGPT's ability to generate code has sparked debate about the role of artificial intelligence in software engineering and its potential to replace human programmers. The objective of this thesis is to investigate that potential. We designed and conducted an experiment in which we prompted ChatGPT with a set of 90 common programming problems of diverse types and difficulty levels, focusing specifically on code correctness, run-time performance, and memory usage. Based on the results of our experiment, we observed that ChatGPT is proficient in solving programming problems at lower and medium difficulty levels. However, its ability to produce correct code declines when prompted with harder problems. In terms of run-time and memory usage, ChatGPT demonstrated above-average results for problems at lower and medium difficulty levels, but its performance declined when faced with more challenging tasks. While ChatGPT falls short of fully replacing human programmers, it exhibits potential as a programming assistant. Our study sheds light on the current capabilities of ChatGPT and other chatbots as code-generating tools and can serve as groundwork for future work in the area. AI ChatGPT NLP Computer Sciences Datavetenskap (datalogi)
23	Snort Rule Generation for Malware Detection Using the GPT2 Transformer Laryea, Ebenezer Nii Afotey 04 July 2022 (has links) Natural Language machine learning methods are applied to rules generated to identify malware at the network level. These rules use a computer-based signature specification "language" called Snort. Using Natural Language processing techniques and other machine learning methods, new rules are generated based on a training set of existing Snort rule signatures for a specific type of malware family. The performance is then measured, in terms of the detection of existing types of malware and the number of "false positive" triggering events. GPT-2 Snort malware detection NLP
24	Named Entity Recognition for Detecting Trends in Biomedical Literature Törnkvist, Betty January 2024 (has links) The number of publications in the biomedical field increases exponentially, which makes keeping up with current research more and more difficult. However, rapid advances in the field of Natural Language Processing (NLP) offer possible solutions to this problem. In this thesis we investigate three main questions of importance for using NLP, and more specifically Named Entity Recognition (NER) and Large Language Models (LLMs), to help solve this problem. The questions are: how LLM performance compares with that of NER models on NER tasks, how important normalization is, and how the analysis is affected by the availability of data. For the first question, we find that the two kinds of model offer reasonably comparable performance on the specific task we consider. For the second question, we find that normalization plays a substantial role in improving the results for tasks involving data synthesis and analysis. Lastly, for the third question, we find that access to full papers is important in most cases, since important information can be hidden outside of the abstracts. NLP NER CHO Computer Sciences Datavetenskap (datalogi)
25	[en] TRANSITIONBASED DEPENDENCY PARSING APPLIED ON UNIVERSAL DEPENDENCIES / [pt] ANÁLISE DE DEPENDÊNCIA BASEADA EM TRANSIÇÃO APLICADA A UNIVERSAL DEPENDENCIES CESAR DE SOUZA BOUCAS 11 February 2019 (has links) Dependency parsing is the task of transforming a sentence into a syntactic structure, usually a dependency tree, that represents hierarchical relations between words. These representations are computationally efficient and are useful for the many tasks that arise with the increasing volume of online textual information and the need for technologies that depend on NLP. 
They can be used, for example, to enable computers to infer the meaning of words in many natural languages. This work presents dependency parsing with a focus on one of its most popular formulations in machine learning: the transition-based method. A greedy implementation of this model with a simple neural network classifier is used to run experiments. Universal Dependencies treebanks are used to train and then test the system, using the validation script published for the CoNLL-2017 shared task. The results show empirically that initializing the input layer of the network with word embeddings obtained through pre-training improves performance. The parser reached 84.51 LAS on the Brazilian Portuguese test set and 75.19 LAS on the English test set, nearly 4 points behind the best results for transition-based parsers. MACHINE LEARNING DEPENDENCY PARSING NLP
26	Nuomonių analizės taikymas komentarams lietuvių kalboje / Opinion analysis of comments in Lithuanian Kavaliauskas, Vytautas 15 June 2011 (has links) In the past few years, more and more people have started to express their views, beliefs and experiences on the Internet. This has led to the emergence of a new research field that includes opinion mining and sentiment analysis. Businesses are actively interested in research in this domain and see great potential for practical application of its steadily improving results. This master's thesis reviews the theoretical and practical results of opinion mining and sentiment analysis and implements a prototype system for opinion analysis of short comments written in Lithuanian. The study also describes problems related to applying opinion mining and sentiment analysis systems to the Lithuanian language. Finally, the last part formulates recommended steps for the development and improvement of opinion analysis systems. 
Informatics Opinion mining Sentiment analysis Sentiments Opinions NLP
27	Tool for linguistic quality evaluation of student texts / Verktyg för språklig utvärdering av studenttexter Kärde, Wilhelm January 2015 (has links) Spell checkers are nowadays a common feature of most editors. A student writing an essay in school will often have access to a spell checker. However, the feedback from a spell checker seldom correlates with the feedback from a teacher. One reason is that the teacher evaluates a text on more aspects. Unlike the spell checker, the teacher evaluates a text on aspects such as genre adaptation, structure and word variation. This thesis evaluates how well those aspects translate to NLP (Natural Language Processing) and implements those that translate well in a rule-based solution called Granska. linguistic spell checker NLP natural language processing linguistics grammar checker Computer Sciences Datavetenskap (datalogi)
28	Textová klasifikace s limitovanými trénovacími daty / Text classification with limited training data Laitoch, Petr January 2021 (has links) The aim of this thesis is to minimize the manual work needed to create training data for text classification tasks. Various research areas, including weak supervision, interactive learning and transfer learning, explore how to minimize the effort of creating training data. We combine ideas from the available literature to design a comprehensive text classification framework that employs keyword-based labeling instead of traditional text annotation. Keyword-based labeling labels texts based on the keywords they contain that are highly correlated with individual classification labels. As noted repeatedly in previous work, coming up with many new keywords is challenging for humans. To address this issue, we propose an interactive keyword labeler that uses word similarity to guide the user in keyword labeling. To verify the effectiveness of our novel approach, we implement a minimum viable prototype of the designed framework and use it to perform a user study on a restaurant review multi-label classification problem.
29	Building a Personally Identifiable Information Recognizer in a Privacy Preserved Manner Using Automated Annotation and Federated Learning Hathurusinghe, Rajitha 16 September 2020 (has links) This thesis explores the training of a deep neural network based named entity recognizer in an end-to-end privacy-preserving setting, where dataset creation and model training happen in an environment with minimal manual intervention. As the accuracy of deep learning models on practical tasks improves, a rising concern is satisfying these models' demand for training data amid concerns about data privacy. Several data protection schemes have been proposed in the recent past in response to public concern, along with legal guidelines to enforce them. A promising new development is decentralized model training on isolated datasets, which eliminates the privacy compromises involved in providing data to a centralized entity. However, in this federated setting, curating the data source is still a privacy risk, especially for unstructured data sources such as text. We explore the feasibility of automatic dataset annotation for a Named Entity Recognition (NER) task and of training a deep learning model with it in two federated learning settings. We explore the feasibility of using a dataset created in this manner to fine-tune a state-of-the-art deep learning language model for the downstream task of named entity recognition. We also examine how this novel combination of a deep learning NLP model and federated learning deviates from the classical centralized setting. We created an automatically annotated dataset containing around 80,000 sentences, a manually annotated test set, and tools to extend the dataset with more manual annotations. We observed that the noise from automated annotation can be overcome to some extent by increasing the dataset size. We also contributed state-of-the-art NLP model developments to the federated learning framework. 
Overall, our NER model achieved around 0.80 F1-score for recognition of entities in sentences. Federated Learning Named Entity Recognition BERT Transformer based NLP NLP NER Deep learning Privacy Machine learning
30	Can ChatGPT Generate Code to Support a System Sciences Bachelor’s Thesis? / Kan ChatGPT generera kod för att stödja en kandidatuppsats i systemvetenskap? Amin, Solin, Hellström, Johan January 2023 (has links) Background: ChatGPT is a chatbot released in November 2022. Its use has spread to academia and scientific writing, with varying results. We investigate whether ChatGPT can be used for the technical part of a Bachelor’s thesis in System Sciences. Aim: We evaluate whether it is possible to generate, through a dialogue, code for detecting potential gender bias in previous responses from ChatGPT. Method: We use an exploratory case study in which an iterative dialogue with ChatGPT is used to generate Python code that analyses previous responses made by ChatGPT. The development methods were chosen by the authors from suggestions made by ChatGPT. Results: Two separate dialogues resulted in a program that combined a fine-tuned Natural Language Processing model with sentiment analysis and word frequency analysis. The program successfully identified responses in the dataset as having a female or male gender bias or being gender neutral. Conclusions: ChatGPT serves as a powerful tool for coding, although it currently falls short of being a one-stop solution that can generate code sufficient for more complex tasks with a single prompt. Our experience suggests that ChatGPT accelerates one’s work when the user possesses some programming knowledge. With further development, ChatGPT could transform coding workflows and increase productivity in related fields. Implications: ChatGPT is a very capable tool for supporting students in the technical aspect of a Bachelor’s thesis, and it is not unreasonable to assume that it works in other contexts as well. As such, one can achieve more with the tool than without, and consequently it would be beneficial to integrate ChatGPT into thesis work. 
This underlines the need for better regulations on cheating and plagiarism. 
AI ChatGPT NLP Python Information Systems

Search results