311

Intelligent chatbot assistant: A study of integration with VOIP and Artificial Intelligence

Wärmegård, Erik January 2020 (has links)
Development and research on Artificial Intelligence have increased in recent years, and the field of medicine is no exception as a target for this modern technology. Despite new research and tools in support of medical care, staff are still under heavy workloads. The goal of this thesis is to analyze and propose the possibility of a chatbot that aims to ease the pressure on medical staff while guaranteeing that patients are being monitored. Using Artificial Intelligence, VOIP, Natural Language Processing, and web development, this chatbot can communicate with a patient, acting as an assistant tool that conducts preparatory work for the medical staff. The chatbot system is integrated through a web application where the administrator can initiate calls and store clients in the database. To ascertain that the system operates in real time, several tests have been carried out concerning the latency between subsystems and the quality of service. / In the development of intelligent systems, healthcare has established itself as a major target group. Despite advanced technology, healthcare is still under heavy strain. The goal of this thesis is to investigate the possibility of a chatbot whose purpose is to ease the workload of healthcare staff while offering a guarantee that patients receive the supervision and follow-up they need. With the help of Artificial Intelligence, VOIP, Natural Language Processing, and web development, this chatbot can communicate with the patient. The chatbot acts as an assisting tool that performs preparatory work for decision-making by healthcare staff: a system that not only provides practical benefit but also furthers the progress Artificial Intelligence is making in healthcare. The system is administered through a website that connects the various components, where an administrator can initiate calls and save the clients to be called to the database. To establish that the system operates in real time, several performance tests have been carried out regarding both latency and call quality.
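The real-time claim above rests on measuring round-trip latency between subsystems. A minimal sketch of such a measurement harness, where `call_subsystem` is a hypothetical stand-in for an actual VOIP or NLP round trip:

```python
import time
import statistics

def call_subsystem(payload):
    """Hypothetical stand-in for a round trip to one subsystem
    (e.g. the NLP service or the VOIP gateway)."""
    time.sleep(0.005)  # simulated processing delay
    return payload

def measure_latency(fn, payload, runs=20):
    """Time repeated round trips and summarise in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(samples),
        "p95_ms": sorted(samples)[int(0.95 * len(samples)) - 1],
        "max_ms": max(samples),
    }

stats = measure_latency(call_subsystem, {"text": "Hur mår du idag?"})
print(f"mean={stats['mean_ms']:.1f} ms")
```

In a real deployment the stand-in would be replaced by the actual subsystem call, and the same summary statistics would back the quality-of-service tests the abstract mentions.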
312

Automatic Speech Recognition System for Somali in the interest of reducing Maternal Morbidity and Mortality.

Laryea, Joycelyn, Jayasundara, Nipunika January 2020 (has links)
Developing an Automatic Speech Recognition (ASR) system for the Somali language, though not novel, has not been actively explored; hence there has been no successful model for conversational speech, nor are related works accessible as open source. The unavailability of digital data is what labels Somali a low-resource language and poses the greatest impediment to the development of an ASR system for Somali. The incentive to develop an ASR system for Somali is to contribute to reducing the Maternal Mortality Rate (MMR) in Somalia. Researchers acquire interview audio data regarding maternal health and behaviour in the Somali language; to engage the relevant stakeholders and bring about the needed change, these recordings must be transcribed into text, an important step towards translation into any other language. This work investigates the ASR resources available for Somali and develops a prototype ASR system to convert Somali audio into Somali text. To achieve this target, we first identified the available open-source speech recognition systems and selected the DeepSpeech engine for the implementation of the prototype. With only three hours of audio data, transcription accuracy falls short of what is required, and the system cannot be deployed for use. We attribute this to insufficient training data and estimate that the effort towards an ASR system for Somali would be far more significant with about 1200 hours of audio to train the DeepSpeech engine.
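Transcription accuracy in work like this is conventionally reported as word error rate (WER). A self-contained sketch of the standard word-level edit-distance computation (independent of the DeepSpeech engine itself):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with Levenshtein distance over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# A perfect transcription scores 0.0
print(word_error_rate("waxaan ahay qof", "waxaan ahay qof"))  # → 0.0
```

A lower WER means a better transcription; abstracts like the one above implicitly compare such a score against a deployment threshold.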
313

Multilingual identification of offensive content in social media

Pàmies Massip, Marc January 2020 (has links)
In today’s society there is a large number of social media users who are free to express their opinion on shared platforms. The socio-cultural differences between the people behind those accounts (in terms of ethnicity, gender, sexual orientation, religion, politics, etc.) give rise to a significant share of online discussions that use offensive language, which often harms the psychological well-being of the victims. To address the problem, the endless stream of user-generated content creates a need for an accurate and scalable solution to detect offensive language using automated methods. This thesis explores different approaches to the offensiveness detection task, focusing on five languages: Arabic, Danish, English, Greek and Turkish. The results obtained using Support Vector Machines (SVM), Convolutional Neural Networks (CNN) and Bidirectional Encoder Representations from Transformers (BERT) are compared, with some of the methods tested achieving state-of-the-art results. The effects of the embeddings used, the dataset size, the class-imbalance percentage and the addition of sentiment features are studied and analysed, as well as the cross-lingual capabilities of pre-trained multilingual models.
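The thesis compares SVM, CNN and BERT classifiers; as a much-reduced illustration of the same text-classification setup, here is a bag-of-words perceptron on toy placeholder data (the examples and labels are invented, not drawn from the shared-task corpora):

```python
from collections import Counter

def featurize(text):
    """Lowercased bag-of-words counts; real systems add char n-grams,
    embeddings and sentiment features, as studied in the thesis."""
    return Counter(text.lower().split())

def train_perceptron(examples, epochs=10):
    """examples: list of (text, label), label +1 (offensive) / -1 (not)."""
    w = Counter()
    for _ in range(epochs):
        for text, label in examples:
            feats = featurize(text)
            score = sum(w[f] * v for f, v in feats.items())
            if label * score <= 0:  # misclassified: nudge weights
                for f, v in feats.items():
                    w[f] += label * v
    return w

def predict(w, text):
    return 1 if sum(w[f] * v for f, v in featurize(text).items()) > 0 else -1

# Toy data only -- placeholders, not a real offensive-language corpus.
data = [("you are awful trash", 1), ("have a lovely day", -1),
        ("awful trash talk again", 1), ("lovely weather today", -1)]
w = train_perceptron(data)
print(predict(w, "that was awful trash"))  # → 1
```

The studied effects of dataset size and class imbalance show up directly in such a model: with too few or skewed examples, the learned weights barely separate the classes.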
314

EVIDENCE BASED MEDICAL QUESTION ANSWERING SYSTEM USING KNOWLEDGE GRAPH PARADIGM

Aqeel, Aya 22 June 2022 (has links)
No description available.
315

The past, present or future? : A comparative NLP study of Naive Bayes, LSTM and BERT for classifying Swedish sentences based on their tense

Navér, Norah January 2021 (has links)
Natural language processing is a field in computer science that is becoming increasingly important. One important part of NLP is the ability to sort texts into past, present or future, depending on when an event occurred or will occur. The objective of this thesis was to use text classification to classify Swedish sentences based on their tense: past, present or future. A further objective was to compare how lemmatisation affects the performance of the models. The problem was tackled by implementing three machine learning models, Naive Bayes, LSTM and BERT, on both lemmatised and non-lemmatised data. The results showed that overall performance was negatively affected when the data was lemmatised. The best-performing model was BERT, with an accuracy of 96.3%. The result is useful, as the best-performing model had very high accuracy and performed well on newly constructed sentences. / Language technology is a field within computer science that has become increasingly important. An important part of language technology is the ability to sort texts into the past, the present or the future, depending on when an event occurred or will occur. The aim of this thesis was to use text classification to classify Swedish sentences based on their tense: past, present or future. A further aim was to compare how lemmatisation affects the performance of the models. The problem was addressed by implementing three machine learning models, Naive Bayes, LSTM and BERT, on both lemmatised and non-lemmatised data. The result was that overall performance was negatively affected when the data was lemmatised. The best-performing model was BERT, with an accuracy of 96.3%. The result is useful since the best-performing model had very high accuracy and performed well on newly constructed sentences.
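Of the three models compared, Naive Bayes is simple enough to sketch end to end. A minimal multinomial Naive Bayes tense classifier on invented Swedish toy sentences (not the thesis's data or features):

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """Multinomial Naive Bayes with add-one smoothing.
    examples: list of (tokens, tense) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for tokens, tense in examples:
        class_counts[tense] += 1
        word_counts[tense].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def classify(model, tokens):
    word_counts, class_counts, vocab = model
    best, best_lp = None, -math.inf
    total = sum(class_counts.values())
    for tense in class_counts:
        lp = math.log(class_counts[tense] / total)  # class prior
        denom = sum(word_counts[tense].values()) + len(vocab)
        for t in tokens:
            lp += math.log((word_counts[tense][t] + 1) / denom)
        if lp > best_lp:
            best, best_lp = tense, lp
    return best

# Toy Swedish sentences (auxiliaries like "ska" mark the future).
train = [(["jag", "sprang", "igår"], "past"),
         (["hon", "springer", "nu"], "present"),
         (["vi", "ska", "springa", "imorgon"], "future")]
model = train_nb(train)
print(classify(model, ["de", "ska", "springa"]))  # → future
```

Running the same pipeline on lemmatised tokens would collapse inflections such as "sprang"/"springer" into one lemma, which illustrates why lemmatisation can erase exactly the tense signal this task depends on.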
316

Unsupervised Natural Language Processing for Knowledge Extraction from Domain-specific Textual Resources

Hänig, Christian 17 April 2013 (has links)
This thesis aims to develop a Relation Extraction algorithm to extract knowledge from automotive data. While most approaches to Relation Extraction are evaluated only on newspaper data dealing with general relations from the business world, their applicability to other data sets is not well studied. Part I of this thesis deals with the theoretical foundations of Information Extraction algorithms. Text mining cannot be seen as the simple application of data mining methods to textual data. Instead, sophisticated methods have to be employed to accurately extract knowledge from text, which can then be mined using statistical methods from the field of data mining. Information Extraction itself can be divided into two subtasks: Entity Detection and Relation Extraction. The detection of entities is very domain-dependent due to terminology, abbreviations and general language use within the given domain. Thus, this task has to be solved for each domain employing thesauri or another type of lexicon. Supervised approaches to Named Entity Recognition will not achieve reasonable results unless they have been trained for the given type of data. The task of Relation Extraction can basically be approached by pattern-based and kernel-based algorithms. The latter achieve state-of-the-art results on newspaper data and point out the importance of linguistic features. In order to analyze relations contained in textual data, syntactic features like part-of-speech tags and syntactic parses are essential. Chapter 4 presents the machine learning approaches and linguistic foundations essential for syntactic annotation of textual data and Relation Extraction. Chapter 6 analyzes the performance of state-of-the-art algorithms for POS tagging, syntactic parsing and Relation Extraction on automotive data. The finding is that supervised methods trained on newspaper corpora do not achieve accurate results when applied to automotive data, for various reasons. Besides low-quality text, the nature of automotive relations poses the main challenge. Automotive relation types of interest (e.g. component – symptom) are rather arbitrary compared to well-studied relation types like is-a or is-head-of. In order to achieve acceptable results, algorithms have to be trained directly on this kind of data. As the manual annotation of data for each language and data type is too costly and inflexible, unsupervised methods are the ones to rely on. Part II deals with the development of dedicated algorithms for all three essential tasks. Unsupervised POS tagging (Chapter 7) is a well-studied task for which accurate taggers exist. However, none of them disambiguates high-frequency words; only out-of-lexicon words are disambiguated. Most high-frequency words bear syntactic information, so it is very important to differentiate between their different functions. Domain languages in particular contain ambiguous, high-frequency words bearing semantic information (e.g. pump). To improve POS tagging, an algorithm for disambiguation is developed and used to enhance an existing state-of-the-art tagger. This approach is based on context clustering, which is used to detect a word type's different syntactic functions. Evaluation shows that tagging accuracy is raised significantly. An approach to unsupervised syntactic parsing (Chapter 8) is developed to satisfy the requirements of Relation Extraction. These requirements include high-precision results on nominal and prepositional phrases, as they contain the entities relevant for Relation Extraction. Furthermore, accurate shallow parsing is more desirable than deep binary parsing, as it facilitates Relation Extraction more than deep parsing does. Endocentric and exocentric constructions can be distinguished, which improves proper phrase labeling. unsuParse is based on preferred positions of word types within phrases to detect phrase candidates. Iterating the detection of simple phrases successively induces deeper structures. The proposed algorithm fulfills all demanded criteria and achieves competitive results on standard evaluation setups. Syntactic Relation Extraction (Chapter 9) is an approach exploiting syntactic statistics and text characteristics to extract relations between previously annotated entities. The approach is based on entity distributions given in a corpus and thus provides a possibility to extend text mining processes to new data in an unsupervised manner. Evaluation on two different languages and two different text types from the automotive domain shows that it achieves accurate results on repair order data. Results are less accurate on internet data, but the tasks of sentiment analysis and extraction of the opinion target can be mastered. Thus, the incorporation of internet data is possible and important, as it provides useful insight into the customer's thoughts. To conclude, this thesis presents a complete unsupervised workflow for Relation Extraction, except for the highly domain-dependent Entity Detection task, improving the performance of each of the involved subtasks compared to state-of-the-art approaches. Furthermore, this work applies Natural Language Processing methods and Relation Extraction approaches to real-world data, unveiling challenges that do not occur in high-quality newspaper corpora.
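The context-clustering idea behind the POS-disambiguation step, grouping occurrences of an ambiguous word such as pump by their neighbouring words, can be sketched as follows. The greedy Jaccard clustering and the thresholds here are illustrative assumptions, not the thesis's algorithm:

```python
def context_sets(sentences, target, window=2):
    """One set of neighbouring words per occurrence of `target`."""
    contexts = []
    for sent in sentences:
        toks = sent.split()
        for i, t in enumerate(toks):
            if t == target:
                contexts.append(set(toks[max(0, i - window):i] +
                                    toks[i + 1:i + window + 1]))
    return contexts

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_contexts(contexts, threshold=0.2):
    """Greedy clustering: an occurrence joins the first cluster whose
    merged context it overlaps enough; otherwise it founds a new one.
    Each resulting cluster approximates one function of the word type."""
    clusters = []  # list of (merged_context_set, member_indices)
    for idx, ctx in enumerate(contexts):
        for merged, members in clusters:
            if jaccard(ctx, merged) >= threshold:
                merged |= ctx
                members.append(idx)
                break
        else:
            clusters.append((ctx, [idx]))
    return clusters

sents = ["the fuel pump failed again",
         "replace the fuel pump now",
         "please pump the brake pedal",
         "slowly pump the brake twice"]
clusters = cluster_contexts(context_sets(sents, "pump"))
print(len(clusters))  # → 2 : noun use ("fuel pump") vs verb use ("pump the brake")
```

Separating the clusters lets a tagger assign the noun-like and verb-like occurrences different tags, which is the intuition behind the reported accuracy gain.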
317

An Evaluation of Automatic Test Case Generation strategy from Requirements for Electric/Autonomous Vehicles

Gangadharan, Athul January 2020 (has links)
Software testing is becoming more prominent within the automotive industry as more complex systems and functions are implemented in vehicles. Vehicles of the future will manage different levels of automation, which also means that vehicles driven by humans will have more supportive functionality to increase safety and avoid accidents. These functionalities cause massive growth in the number of test scenarios needed to show that the vehicles are safe, which makes it impossible to continue performing the tests the way it has been done until today. The new conditions require that test scenarios and test cases both be generated and executed automatically. In this thesis, an investigation and evaluation are performed to analyze the Automatic Test Case Generation methods available for inputs from Natural Language Requirements in an automotive industrial context at NEVS AB. This study aims to evaluate the NAT2TEST strategy by replacing the manual method and obtaining a similar or better result. A comparative analysis is performed between the manual and automated approaches for various levels of requirements. The results show that utilizing this strategy in an industrial scenario can improve efficiency when the requirements under test are well-documented, lower-level requirements.
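Approaches in the NAT2TEST family parse requirements written in a controlled natural language. A loose sketch of the idea, using a single invented when/shall pattern rather than the actual controlled-natural-language grammar the strategy defines:

```python
import re

# Simplified "when / shall" requirement pattern -- an invented stand-in
# for the controlled natural language parsed by NAT2TEST-style tools.
REQ_PATTERN = re.compile(
    r"^When (?P<condition>.+?), the (?P<system>.+?) shall (?P<action>.+?)\.?$",
    re.IGNORECASE)

def requirement_to_test_case(req_id, text):
    """Derive a skeletal test case (stimulus / component / expected) from
    one requirement sentence, or None if it does not match the pattern."""
    m = REQ_PATTERN.match(text.strip())
    if not m:
        return None
    return {
        "id": f"TC_{req_id}",
        "stimulus": m.group("condition"),
        "component": m.group("system"),
        "expected": m.group("action"),
    }

tc = requirement_to_test_case(
    "REQ_042",
    "When the vehicle speed exceeds 30 km/h, the door controller shall lock all doors.")
print(tc["expected"])  # → lock all doors
```

Requirements that fall outside the controlled pattern return None, which mirrors why the evaluated strategy works best on well-documented lower-level requirements.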
318

Compression automatique de phrases : une étude vers la génération de résumés / Automatic sentence compression : towards abstract summarization

Molina Villegas, Alejandro 30 September 2013 (has links)
This study presents a new approach to automatic summary generation, one of the main challenges of Natural Language Processing. Although the subject has been researched for half a century, it remains current because no one has yet managed to automatically create summaries comparable in quality to those produced by humans. In this context, research on automatic summarization has split into two broad categories: extractive summarization and abstractive summarization. In the former, sentences are ranked so that the best ones make up the final summary. However, the sentences selected for the summary often carry secondary information, so a finer analysis is necessary. We propose an automatic sentence compression method based on the elimination of fragments within sentences. From an annotated corpus, we created a linear model to predict the deletion of these fragments from simple features. Our method takes three principles into account: the relevance of the content (informativity), the quality of the content (grammaticality), and the length (compression rate). To measure the informativity of fragments, we use a technique inspired by statistical physics: textual energy. For grammaticality, we propose using probabilistic language models. The proposed method is able to generate correct summaries in Spanish. The results of this study raise various interesting aspects of text summarization by sentence compression. We observed that, in general, the task is highly subjective: there is no single optimal compression but several possible correct compressions. We therefore consider that the results of this study open a discussion on the subjectivity of informativity and its influence on automatic summarization. / This dissertation presents a novel approach to automatic text summarization, one of the most challenging tasks in Natural Language Processing (NLP). Until now, no one has created a summarization method capable of producing summaries comparable in quality with those produced by humans. Many state-of-the-art approaches form the summary by selecting a subset of sentences from the original text. Since some of the selected sentences might still contain superfluous information, a finer analysis is needed. We propose an Automatic Sentence Compression method based on the elimination of intra-phrase discourse segments. Using a large manually annotated corpus, we obtained a linear model that predicts the elimination probability of a segment on the basis of three simple criteria: informativity, grammaticality and compression rate. We discuss the difficulties of automatically assessing these criteria in documents and phrases, and we propose a solution based on existing techniques in the NLP literature, applying two different algorithms that produce summaries with compressed sentences. Applying both algorithms to documents in Spanish, our method produces high-quality results. Finally, we evaluate the produced summaries using the Turing test to determine whether human judges can distinguish between human-produced and machine-produced summaries. This dissertation addresses several previously ignored aspects of NLP, namely the subjectivity of informativity, sentence compression in Spanish documents, and the evaluation of NLP using the Turing test.
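A linear model over informativity, grammaticality and compression rate can be sketched as a weighted fragment score; the weights, threshold and per-fragment scores below are invented for illustration, not the fitted coefficients from the dissertation:

```python
def fragment_score(informativity, grammaticality, compression_gain,
                   weights=(0.5, 0.3, 0.2)):
    """Illustrative linear combination of the three criteria in the
    abstract: high informativity and grammaticality argue for keeping
    a fragment, high compression gain argues for deleting it.
    All inputs are assumed normalised to [0, 1]."""
    wi, wg, wc = weights
    return wi * informativity + wg * grammaticality - wc * compression_gain

def compress_sentence(fragments, threshold=0.25):
    """fragments: list of (text, informativity, grammaticality, gain).
    Fragments scoring below the threshold are deleted."""
    kept = [f[0] for f in fragments
            if fragment_score(f[1], f[2], f[3]) >= threshold]
    return " ".join(kept)

# Toy fragment scores -- hand-picked, not model outputs.
sentence = [("The committee", 0.9, 0.9, 0.1),
            ("after lengthy debate", 0.2, 0.4, 0.8),
            ("approved the proposal", 0.95, 0.9, 0.1)]
print(compress_sentence(sentence))  # → The committee approved the proposal
```

The subjectivity the study highlights shows up here too: moving the threshold changes which compressions count as acceptable, and several settings can yield equally correct outputs.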
319

Méthodologies pour la détection de diachronies sémantiques et leurs impacts / Methodologies for the detection of semantic diachronies and their impacts

Kletz, David 08 1900 (has links)
The meaning of a word is subject to variation over time. Many phenomena motivate these changes, such as the appearance of new objects or shifts in habits. Thus, a given word may be assigned a new sense, lose a sense, or remain stable between two dates. Semantic diachrony is the field that studies these changes in meaning. Recent work on semantic diachrony proposes methodologies for detecting diachronies. To do so, they rely on texts drawn from several different time periods, on which language models are trained. Aligning the resulting representations and comparing those of target words allows them to conclude whether the meaning has changed. Nevertheless, the absence of a reference dataset for validating these methods has led to the development of alternative validation methods, notably relying on the sense changes recorded in traditional dictionaries. The work carried out during my master's degree presents a reflection on existing diachrony-detection methods. Drawing on a journalistic corpus covering the whole of the twentieth century, we propose complementary methods with which we demonstrate that the proposed evaluations are subject to ambiguities and therefore do not allow conclusions about the quality of the methods. We then developed a methodology for constructing a validation dataset. This methodology takes advantage of a disambiguation algorithm to associate each recorded sense of a word with a date of appearance over time. We propose a dataset of 151 words for evaluating diachrony detection in French between 1910 and 1990. / The meaning of a word is subject to variations over time. Many phenomena motivate these modifications, such as the appearance of new objects or changes in habits. Thus, the same word can be assigned a new meaning, have a meaning withdrawn, or remain stable between two dates. The study of semantic diachrony is a field that focuses on these changes in meaning. Recent work on semantic diachrony proposes methodologies for the detection of diachronies. In order to do so, they rely on texts from several different temporal periods, on which language models are trained. An alignment of the obtained representations, and a comparison of those of target words, enables one to infer the change of meaning. Nevertheless, the absence of a reference dataset for the validation of these methods leads to the development of alternative validation methods, suggesting in particular relying on the changes of meaning identified in traditional dictionaries. The work carried out during my master's degree presents a reflection on the existing methods of diachrony detection. Based on a corpus of newspapers covering the whole 20th century, we propose complementary methods thanks to which we demonstrate that the proposed evaluations are subject to ambiguities. These ambiguities do not allow us to ensure the quality of the methods. We then developed a methodology for the construction of a validation dataset. This methodology takes advantage of a disambiguation algorithm to associate a date of first appearance with each recorded sense of a word. We propose a dataset composed of 151 words allowing one to evaluate the identification of diachronies in French between 1910 and 1990.
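The detection pipeline described (per-period representations, alignment, comparison of target words) can be caricatured with raw context-count vectors and cosine similarity; the corpora and threshold below are toy assumptions, and real work uses trained, aligned embeddings:

```python
import math
from collections import Counter

def context_vector(corpus, target, window=2):
    """Bag-of-context counts for `target` across a tokenised corpus,
    a crude stand-in for a trained embedding."""
    vec = Counter()
    for sent in corpus:
        toks = sent.split()
        for i, t in enumerate(toks):
            if t == target:
                vec.update(toks[max(0, i - window):i] +
                           toks[i + 1:i + window + 1])
    return vec

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in set(u) | set(v))
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def semantic_change(corpus_old, corpus_new, target, threshold=0.5):
    """Flag `target` as changed when its contexts in the two periods
    are dissimilar enough (1 - cosine similarity > threshold)."""
    sim = cosine(context_vector(corpus_old, target),
                 context_vector(corpus_new, target))
    return (1.0 - sim) > threshold

# Toy two-period corpora illustrating a classic English sense shift.
old = ["the gay crowd was cheerful and gay all evening"]
new = ["the gay rights march drew a large crowd"]
print(semantic_change(old, new, "gay"))  # → True
```

A validation dataset like the one the thesis proposes supplies, for each word, the date a sense appeared, exactly what is needed to check whether such a detector fires in the right period.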
320

Using Natural Language Processing and Machine Learning for Analyzing Clinical Notes in Sickle Cell Disease Patients

Khizra, Shufa January 2018 (has links)
No description available.
