31 |
Comparison of Machine Learning Models Used for Swedish Text Classification in Chat Messaging
Karim, Mezbahul; Amanzadi, Amirtaha, January 2022
The rise of social media and the use of mobile applications has led to increasing concerns regarding the content that is shared through these apps and whether they are being regulated or not. One of the problems that can arise due to a lack of regulation is that chat messages that are inappropriate or of profane nature can be allowed to be shared through these apps. Thus, it is vital to detect whenever these types of chat messages are shared through these mobile applications. In addition to that, there should also be detection of chat messages that can lead to the identity of the users being revealed as that is how the app in this thesis project was intended to be used. One of the most popular approaches to detect chat messages of this nature is to use machine learning techniques that can classify text. We were quick to discover that there were not many machine learning models that were built to classify short text messages in the Swedish language, thus the main problem of our thesis was the lack of evaluation and analysis of machine learning models for text classification in the context of the chat messages in Swedish. Thus, the purpose of our project was mainly to find the best performing models for text classification, implement these models and evaluate them to find the best among the ones we found. After the models were created, a hosting server, as well as an API, was required for the text classifying system to compute and communicate the prediction results to the mobile application in real-time. Therefore, the models were containerized and deployed as a REST API that serves requests upon arrival on a cloud server. The goal of this project was to help future work being done on text classification in the Swedish language by providing the results of this thesis to any parties that are interested in our line of work. 
From our own experience, we realized how challenging it can be to find and choose the best machine learning models when there are no prior data indicating which one performs best. Thus, we believe that the results of this thesis project will greatly aid future projects in this area. The chosen research methodology was qualitative and dealt with quantitative data. The results we received showed that the BERT model was the best choice among the three models that we compared. With minor adjustments, this model should be more than capable of detecting the types of chat messages that the mobile application requires it to detect. / The rise of social media and the use of mobile applications have led to growing concern about the content shared within these apps and whether it is regulated. One problem that arises from a lack of regulation is that chat messages that are inappropriate or profane can be shared through these apps. It is therefore important to detect when these types of chat messages are shared through mobile applications. In addition, there must be a system that detects chat messages that could help reveal users' identities, since that is how the app in this project was intended to be used. One of the most popular ways to detect this type of chat message is to use machine learning techniques that can classify text. We soon found that there were not many machine learning models built to classify texts in Swedish, so the main problem of our thesis was the lack of evaluation and analysis of machine learning models for text classification in the context of the Swedish language. The purpose of our project was thus to find the best-performing models for text classification, implement these models ourselves, and then evaluate them to find the best one. Furthermore, for the text classification system to compute and communicate the prediction results to mobile applications in real time, a hosting server and an API are needed. 
Therefore, the models were containerized and deployed as a REST API that serves requests on arrival on a cloud server. The goal of this project was to help future work on text classification in Swedish by making the results available to parties interested in our line of work. From our own experience, we realized that it was difficult to find and choose the best machine learning models, specifically when there are no data that have previously shown which one performs best. We therefore believe that the results of this thesis will be of great help to future projects in this area. The chosen research methodology was qualitative and dealt with quantitative data. The results showed that the BERT model was the best of the three models we compared. With a few adjustments, the model is more than capable of detecting the type of chat messages required within the mobile application.
|
32 |
Primary stage Lung Cancer Prediction with Natural Language Processing-based Machine Learning / Tidig lungcancerprediktering genom maskininlärning för textbehandling
Sadek, Ahmad, January 2022
Early detection reduces mortality in lung cancer, but it is also considered a challenge for oncologists and healthcare systems. In addition, screening modalities such as CT scans have undesired effects: many suspected patients are wrongly diagnosed with lung cancer. This thesis contributes to solving the challenge of early lung cancer detection by using unique data consisting of self-reported symptoms. The proposed method is a predictive machine learning algorithm based on natural language processing, which treats the data as an unstructured data set. For comparison, a previous study, in which a prediction model based on conventional multivariate machine learning was built from the same data, is replicated and presented. After evaluation, validation, and interpretation, a set of variables was highlighted as early predictors of lung cancer. The proposed approach matched the performance of the conventional approach. This promising result opens the way for further development in which such an approach can be used in clinical decision support systems. Future work could then involve other modalities in a multimodal machine learning approach. / Early lung cancer diagnosis can increase the chances of survival for lung cancer patients, but detecting lung cancer at an early stage is one of the greater challenges for oncologists and for healthcare. Today, patients with risk factors based on smoking and age are examined; these examinations rely on, among other things, medical imaging systems, most often CT images, which leads to incorrect and costly diagnoses. This work proposes a machine learning algorithm based on natural language processing that can predict lung cancer by analyzing and processing unstructured data, namely the patients' own self-reported histories. The work carried out a comparison with a conventional machine learning algorithm, based on a replication of another study in which the same data were treated as structured. 
The proposed method showed similar results and performance, and identified risk factors and symptoms for lung cancer. This work opens the way for development toward clinical use in the form of decision support systems that can also handle electronic health records. Other work can extend the method to handle other kinds of data, such as medical images and biomarkers, and thereby improve performance.
|
33 |
Framing and Voting / The German Immigration Debate and the Effects of News Coverage on Political Preferences
Berk, Nicolai, 03 April 2024
An extensive literature on framing effects suggests that citizens hold only limited political preferences. If public opinion is so open to influence, it is a shaky foundation for the democratic process. This dissertation therefore asks how earlier experimental findings carry over to complex, real-world situations and whether framing can also influence voting intentions. It develops a method for automatically identifying news frames.
The dissertation presents original and secondary data and examines the relationship between news framing, attitudes toward migration, and voting intentions. It gives an overview of how immigration is portrayed in the German news media and shows that neither the attention paid to migration nor its framing can explain the rise of the radical-right AfD. It then exploits a change in the migration coverage of Germany's largest tabloid, Bild, and shows limited effects on the political attitudes and voting intentions of its readers. The final empirical chapter presents experimental data showing that framing affects the voting intentions only of rather uninformed citizens.
The results contribute to a better understanding of framing effects and suggest that citizens' attitudes cannot be manipulated so easily and that the power of the news media is more limited than often assumed. Instead, framing effects occur under very specific conditions that are frequently not met. The emerging picture of public opinion is characterized by crystallized attitudes that respond only to novel events. From this perspective, politics is a pattern of successive critical events, each of which offers a unique opportunity to change the prevailing understanding of an issue. / A large experimental literature on framing effects suggests that citizens form rather limited political preferences, open to severe manipulation. If citizens’ attitudes were always so easily malleable for media outlets and political actors, they would not constitute a very meaningful input for the democratic process. This dissertation asks how these experimental findings translate into complex, real-world news environments and whether news frames structure citizens’ voting intentions. It provides a clear conceptualization of frames, on which it builds a method to identify news frames automatically, and theorises a link between news frames and voting intentions.
The dissertation presents original and secondary data, exploring the relationship of news framing, immigration attitudes, and voting intentions. Providing a broad overview of immigration framing in the German news media, it shows that neither immigration attention nor framing can explain the rise of the radical-right AfD. It then exploits a change in the immigration framing of Germany’s largest tabloid, Bild, showing that this shift had no effects on immigration attitudes or voting intentions among its readers. The final empirical chapter presents experimental evidence revealing that framing only affects voting intentions among rather uninformed citizens.
The findings contribute to the study of framing and public opinion, suggesting that citizens’ attitudes are not as easily manipulated and the power of the news media more limited than often thought. Instead, framing effects take place under highly specific conditions, which are often not fulfilled. The emerging picture of public opinion is one of crystallized and resistant attitudes, which only respond to novel events. In other words: whoever gets to the voter first, wins. Politics, in this view, is a pattern of critical events following upon each other, each presenting a unique opportunity to change the dominant understanding of an issue.
|
34 |
Comparative Analysis of ChatGPT-4 and Gemini Advanced in Erroneous Code Detection and Correction
Sun, Erik Wen Han; Grace, Yasine, January 2024
This thesis investigates the capabilities of two advanced Large Language Models (LLMs), OpenAI’s ChatGPT-4 and Google’s Gemini Advanced, in the domain of software engineering. While LLMs are widely used across various applications, including text summarization and synthesis, their potential for detecting and correcting programming errors has not been thoroughly explored. This study aims to fill this gap by conducting a comprehensive literature search and an experimental comparison of ChatGPT-4 and Gemini Advanced using the QuixBugs and LeetCode benchmark datasets, with a specific focus on the Python and Java programming languages. The research evaluates the models’ abilities to detect and correct bugs using metrics such as accuracy, recall, precision, and F1-score. The experimental results show that ChatGPT-4 consistently outperforms Gemini Advanced in both the detection and correction of bugs. These findings provide valuable insights that could guide further research in the field of LLMs.
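The four metrics named in this abstract can be computed from binary labels as in the minimal sketch below. It assumes a label of 1 means "buggy" and 0 means "correct"; the function name `detection_metrics` and the label encoding are illustrative assumptions, not taken from the thesis.

```python
def detection_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (assumed: 1 = buggy)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bugs correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean code correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean code wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bugs missed

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Precision penalizes false alarms and recall penalizes missed bugs, which is why the two are usually reported together with their harmonic mean, F1.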
|
35 |
[en] A NOVEL SOLUTION TO EMPOWER NATURAL LANGUAGE INTERFACES TO DATABASES (NLIDB) TO HANDLE AGGREGATIONS / [pt] UMA NOVA SOLUÇÃO PARA CAPACITAR INTERFACES DE LINGUAGEM NATURAL PARA BANCOS DE DADOS (NLIDB) PARA LIDAR COM AGREGAÇÕES
ALEXANDRE FERREIRA NOVELLO, 19 July 2021
[pt] Question Answering (QA) is a field of study dedicated to building systems that automatically answer questions asked in natural language. Translating a question asked in natural language into a structured query (SQL or SPARQL) over a database is also known as a Natural Language Interface to Database (NLIDB). NLIDB systems generally do not handle aggregations, which can have the following elements: aggregation functions (such as count, sum, average, minimum, and maximum), a grouping clause (GROUP BY), and a HAVING clause. However, they give good results for normal queries. This dissertation addresses the creation of a generic module, for use in NLIDB systems, that allows such systems to perform queries with aggregations, provided that the query results the NLIDB returns are, or can be transformed into, a result in tabular format. The work covers aggregations with specific features such as ambiguities, timescale differences, aggregations over multiple attributes, the use of superlative adjectives, basic recognition of units of measure, aggregations over attributes with compound names, and subqueries with aggregation functions nested up to two levels deep. / [en] Question Answering (QA) is a field of study dedicated to building systems that automatically answer questions asked in natural language. The translation of a question asked in natural language into a structured query (SQL or SPARQL) in a database is also known as Natural Language Interface to Database (NLIDB). NLIDB systems usually do not deal with aggregations, which can have the following elements: aggregation functions (such as count, sum, average, minimum, and maximum), a grouping clause (GROUP BY), and a having clause (HAVING). However, they deliver good results for normal queries. 
This dissertation addresses the creation of a generic module, to be used in NLIDB systems, that allows such systems to perform queries with aggregations, on the condition that the query results the NLIDB returns are, or can be transformed into, a result set in the form of a table. The work covers aggregations with specificities such as ambiguities, timescale differences, aggregations over multiple attributes, the use of superlative adjectives, basic recognition of units of measure, aggregations over attributes with compound names, and subqueries with aggregation functions nested up to two levels.
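The idea of applying aggregations to a tabular result that an NLIDB has already returned can be illustrated with the short sketch below. It is not the dissertation's module: the function `aggregate`, its parameters, and the row format (a list of dictionaries) are assumptions made for illustration. It covers the three elements named in the abstract: an aggregation function, a grouping clause, and a HAVING-style filter.

```python
from collections import defaultdict

# Aggregation functions named in the abstract, keyed by a hypothetical label.
AGGREGATES = {
    "count": len,
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
}

def aggregate(rows, group_by, column, func, having=None):
    """Group tabular rows, aggregate one column, optionally filter the groups.

    rows     -- list of dicts, i.e. the tabular result returned by the NLIDB
    group_by -- column name playing the role of GROUP BY
    column   -- column whose values are aggregated
    func     -- key into AGGREGATES
    having   -- optional predicate on the aggregated value (HAVING)
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[column])
    result = {key: AGGREGATES[func](values) for key, values in groups.items()}
    if having is not None:
        result = {key: value for key, value in result.items() if having(value)}
    return result
```

For example, a question such as "which departments have an average salary above 10?" would map to `aggregate(rows, "dept", "salary", "avg", having=lambda v: v > 10)` under these assumptions.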
|
36 |
Élaboration d'ontologies médicales pour une approche multi-agents d'aide à la décision clinique / A multi-agent framework for the development of medical ontologies in clinical decision making
Shen, Ying, 20 March 2015
The combination of semantic processing of knowledge and modeling of the steps of reasoning, as used in the clinical domain, offers interesting and indeed necessary possibilities for developing medical ontologies that are useful to the practice of this profession. In this context, querying multiple medical databases such as MEDLINE and PubMed is a valuable but insufficient tool, because it does not yield knowledge that is easily usable in a clinical procedure. Indeed, the abundance of inappropriate citations constitutes noise and requires tedious sorting that is incompatible with the efficient practice of medicine. In an iterative process, the objective is to build, in as automated a way as possible, reusable medical knowledge bases founded on ontologies; in this thesis, we develop a series of knowledge acquisition tools that combine operators for linguistic analysis and for clinical modeling, based on a typology of the knowledge involved and on an implementation of the different modes of reasoning employed. 
Knowledge cannot be reduced to information drawn from databases; it is organized by cognitive reasoning operators that make it operational in the context of interest to the practitioner. A multi-agent clinical decision support system (SMAAD) will enable the cooperation and integration of the different modules involved in building a medical ontology; the data sources are medical databases such as MEDLINE and citations extracted through PubMed, and the concepts and vocabulary come from the Unified Medical Language System (UMLS). As for the scope of the knowledge bases produced, the research covers the whole clinical process: diagnosis, prognosis, treatment, and therapeutic follow-up of various pathologies within a given medical field. Different approaches and works are surveyed in the state of the art, and several paradigms are explored: 1) Evidence-Based Medicine (medicine founded on evidence). An item of evidence can be defined as a sign linked to its mode of implementation; 2) case-based reasoning (RàPC), which relies on analogy with clinical situations already encountered; 3) different semantic approaches that make it possible to implement ontologies. Overall, we worked on the logical aspects of the cognitive reasoning operators used, and we organized the cooperation and integration of the knowledge exploited during the different stages of the clinical process (diagnosis, prognosis, treatment, therapeutic follow-up). This integration relies on a SMAAD: a multi-agent decision support system. / The combination of semantic processing of knowledge and the modelling of reasoning steps employed in the clinical field offers exciting and necessary opportunities to develop ontologies relevant to the practice of medicine. 
In this context, multiple medical databases such as MEDLINE and PubMed are valuable but insufficient tools, because they do not readily yield knowledge that is usable in a clinical approach. Indeed, the abundance of inappropriate citations constitutes noise and requires tedious sorting that is incompatible with the practice of medicine. In an iterative process, the objective is to build, in as automated a way as possible, reusable medical knowledge bases founded on an ontology of the fields concerned. In this thesis, the author develops a series of tools for knowledge acquisition that combine linguistic analysis operators and clinical modelling, based on a typology of the knowledge involved and an implementation of the different forms of reasoning employed. Knowledge is not limited to information from data; it also, and especially, depends on the cognitive operators of reasoning that make it operational in the context relevant to the practitioner. A multi-agent system enables the integration and cooperation of the various modules used in the development of a medical ontology. The data sources are medical databases such as MEDLINE and the citations retrieved by PubMed, and the concepts and vocabulary come from the Unified Medical Language System (UMLS). Regarding the scope of the knowledge bases produced, the research concerns the entire clinical process: diagnosis, prognosis, treatment, and therapeutic monitoring of various diseases in a given medical field. It is essential to identify the different approaches and the work already done. Different paradigms are explored: 1) Evidence-Based Medicine. 
An index can be defined as a sign related to its mode of implementation; 2) Case-based reasoning, which is based on analogy with clinical situations already encountered; 3) The different semantic approaches that are used to implement ontologies. On the whole, we worked on the logical aspects of the cognitive reasoning operators used, and we organized the cooperation and integration of the knowledge exploited during the various stages of the clinical process (diagnosis, prognosis, treatment, therapeutic monitoring). This integration is based on a SMAAD: a multi-agent system for decision support.
|
37 |
Atribuição automática de autoria de obras da literatura brasileira
Nobre Neto, Francisco Dantas, 19 January 2010
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Authorship attribution consists of assigning an unknown document to one of several previously selected classes of authors. Knowledge of a text's authorship can be useful when it is necessary to detect plagiarism in a literary document or to give proper credit to the author of a book. The most intuitive form of human analysis of a text is to select some of its characteristics. The study of selecting attributes in a written document, such as average word length and vocabulary richness, is known as stylometry. For a human analyzing an unknown text, discovering the authorship can take months and becomes a tiring activity. Some computational tools can extract such characteristics from the text, leaving the subjective analysis to the researcher. However, there are computational methods that, in addition to extracting attributes, perform the authorship attribution based on the characteristics gathered from the text. Techniques such as neural networks, decision trees, and classification methods have been applied in this context and have produced results that make them relevant to this question. This work presents a data compression method, Prediction by Partial Matching (PPM), as a solution to the authorship attribution problem for Brazilian literary works. The writers and works that compose the author database were selected mainly for their representativeness in national literature. The availability of the books was also considered. PPM performs the authorship identification without any subjective interference in the text analysis. Unlike other methods, it also makes no use of attributes present in the text. The correct classification rate obtained with PPM in this work was approximately 93%, while related works report correct rates between 72% and 89%. Authorship attribution was also performed in this work with the SVM approach. For that, attributes selected from the text were divided into two groups, one word-based and the other based on function-word frequency, obtaining correct rates of 36.6% and 88.4%,
respectively. / Authorship attribution consists of assigning an unknown document to one of several previously selected classes of authors. Knowing the authorship of a text can be useful when it is necessary to detect plagiarism in a literary work or to give due credit to the author of a book. The most intuitive way for a human to analyze a text is to select some of its characteristics. The study of selecting attributes in a written document, such as average word length and vocabulary richness, is known as stylometry. For human analysis of an unknown text, discovering the authorship can take months and becomes a tiring task. Some computational tools can extract such characteristics from the text, leaving the subjective analysis to the researcher. However, there are computational methods that, besides extracting attributes, attribute authorship based on the characteristics gathered throughout the text. Techniques such as neural networks, decision trees, and classification methods have been applied in this context and have produced results that make them relevant to the question. This work presents a data compression method, Prediction by Partial Matching (PPM), as a solution to the problem of authorship attribution for works of Brazilian literature. The writers and works selected for the author database were chosen mainly for their representativeness in national literature. In addition, the availability of the books in electronic format was considered. PPM identifies authorship without any subjective interference in the analysis of the text. Unlike other methods, it also makes no use of attributes present in the text. The correct classification rate achieved with PPM in this work was approximately 93%, whereas related works report accuracy rates between 72% and 89%. Authorship attribution was also carried out with the SVM approach. For this, attributes were selected from the text and divided into two types, one based on words and the other on function-word counts, yielding accuracy rates of 36.6% and 88.4%, respectively.
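The general idea behind compression-based authorship attribution can be sketched as follows: a text is assigned to the author whose known writing makes it cheapest to encode. The sketch is only an illustration of that idea. Python's standard library has no PPM implementation, so zlib's DEFLATE is swapped in here as a stand-in compressor, and the names `compressed_size` and `attribute_author` are hypothetical.

```python
import zlib

def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))

def attribute_author(unknown: str, corpora: dict) -> str:
    """Return the author whose corpus makes the unknown text cheapest to encode.

    corpora maps an author name to a string of that author's known writing.
    Note: DEFLATE (zlib) is a stand-in here, not the PPM method of the thesis.
    """
    def extra_bytes(author: str) -> int:
        corpus = corpora[author]
        # Cost of the unknown text once the compressor has "seen" the corpus.
        return compressed_size(corpus + unknown) - compressed_size(corpus)

    return min(corpora, key=extra_bytes)
```

No stylometric attributes are extracted at any point, which mirrors the abstract's remark that the compression approach needs no hand-selected features. DEFLATE only looks back over a 32 KB window, so on book-length corpora a context-modeling compressor such as PPM would be expected to behave quite differently from this toy version.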
|
38 |
Can Wizards be Polyglots: Towards a Multilingual Knowledge-grounded Dialogue System
Liu, Evelyn Kai Yan, January 2022
Research on open-domain, knowledge-grounded dialogue systems has been advancing rapidly due to the paradigm shift introduced by large language models (LLMs). While these strides have improved the performance of dialogue systems, their scope is mostly monolingual and English-centric. The lack of multilingual in-task dialogue data further discourages research in this direction. This thesis explores the use of transfer learning techniques to extend English-centric dialogue systems to multiple languages. In particular, this work focuses on five typologically diverse languages, for which well-performing models could generalize to other languages in the same language families as the target languages, hence widening the accessibility of the systems to speakers of various languages. I propose two approaches: Multilingual Retrieval-Augmented Dialogue Model (xRAD) and Multilingual Generative Dialogue Model (xGenD). xRAD is adapted from a pre-trained multilingual question answering (QA) system and comprises a neural retriever and a multilingual generation model. Prior to response generation, the retriever fetches relevant knowledge and passes the retrieved passages to the generator as part of the dialogue context. This approach can incorporate knowledge into conversational agents, thus improving the factual accuracy of a dialogue model. In addition, xRAD has advantages over xGenD because of its modularity, which allows the fusion of QA and dialogue systems so long as appropriate pre-trained models are employed. On the other hand, xGenD takes advantage of an existing English dialogue model and performs a zero-shot cross-lingual transfer by training sequentially on English dialogue and multilingual QA datasets. Both automated and human evaluation were carried out to measure the models' performance against the machine translation baseline. 
The result showed that xRAD outperformed xGenD significantly and surpassed the baseline in most metrics, particularly in terms of relevance and engagingness. Whilst xRAD performance was promising to some extent, a detailed analysis revealed that the generated responses were not actually grounded in the retrieved paragraphs. Suggestions were offered to mitigate the issue, which hopefully could lead to significant progress of multilingual knowledge-grounded dialogue systems in the future.
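The retrieve-then-generate flow described in this abstract (fetch relevant knowledge, then hand it to the generator as part of the dialogue context) can be sketched as below. This is a toy illustration only: the token-overlap scoring stands in for the neural retriever, no generation model is included, and the names `retrieve` and `build_generator_input` as well as the `knowledge:` / `dialogue:` prefixes are assumptions, not part of xRAD.

```python
def retrieve(query, passages, k=1):
    """Rank passages by word overlap with the query (stand-in for a neural retriever)."""
    query_tokens = set(query.lower().split())
    ranked = sorted(
        passages,
        key=lambda passage: len(query_tokens & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_generator_input(dialogue_history, passages, k=1):
    """Prepend the retrieved knowledge to the dialogue context for the generator."""
    knowledge = retrieve(dialogue_history[-1], passages, k)
    lines = ["knowledge: " + passage for passage in knowledge]
    lines += ["dialogue: " + turn for turn in dialogue_history]
    return "\n".join(lines)
```

Because the retriever and the generator only meet at this input string, either one can be replaced independently, which is the modularity the abstract credits to the retrieval-augmented design.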
|
39 |
Improving the Performance of Clinical Prediction Tasks by Using Structured and Unstructured Data Combined with a Patient Network
Nouri Golmaei, Sara, 08 1900
Indiana University-Purdue University Indianapolis (IUPUI) / With the increasing availability of Electronic Health Records (EHRs) and advances in deep learning techniques, developing deep predictive models that use EHR data to solve healthcare problems has gained momentum in recent years. The majority of clinical predictive models benefit from structured data in EHR (e.g., lab measurements and medications). Still, learning clinical outcomes from all possible information sources is one of the main challenges when building predictive models. This work focuses mainly on two sources of information that have been underused by researchers; unstructured data (e.g., clinical notes) and a patient network. We propose a novel hybrid deep learning model, DeepNote-GNN, that integrates clinical notes information and patient network topological structure to improve 30-day hospital readmission prediction. DeepNote-GNN is a robust deep learning framework consisting of two modules: DeepNote and patient network. DeepNote extracts deep representations of clinical notes using a feature aggregation unit on top of a state-of-the-art Natural Language Processing (NLP) technique - BERT. By exploiting these deep representations, a patient network is built, and Graph Neural Network (GNN) is used to train the network for hospital readmission predictions. Performance evaluation on the MIMIC-III dataset demonstrates that DeepNote-GNN achieves superior results compared to the state-of-the-art baselines on the 30-day hospital readmission task. We extensively analyze the DeepNote-GNN model to illustrate the effectiveness and contribution of each component of it. The model analysis shows that patient network has a significant contribution to the overall performance, and DeepNote-GNN is robust and can consistently perform well on the 30-day readmission prediction task. 
To evaluate how well the DeepNote and patient network modules generalize to new prediction tasks, we create a multimodal model and train it on the structured and unstructured data of the MIMIC-III dataset to predict patient mortality and Length of Stay (LOS). Our proposed multimodal model consists of four components: DeepNote, patient network, DeepTemporal, and score aggregation. While DeepNote keeps its functionality and extracts representations of clinical notes, we build a DeepTemporal module using a fully connected layer stacked on top of a one-layer Gated Recurrent Unit (GRU) to extract deep representations of temporal signals. Independently of DeepTemporal, we extract feature vectors of the temporal signals and use them to build a patient network. Finally, the DeepNote, DeepTemporal, and patient network scores are linearly aggregated to fit the multimodal model to the downstream prediction tasks. Our results are very competitive with the baseline model. The multimodal model analysis reveals that unstructured text data contribute more to the predictions than temporal signals. Moreover, there is no limitation in applying a patient network to structured data. Compared with the other modules, the patient network makes a more significant contribution to the prediction tasks. We believe that our efforts in this work have opened up a new study area that can be used to enhance the performance of clinical predictive models.
|
40 |
Performance Benchmarking and Cost Analysis of Machine Learning Techniques : An Investigation into Traditional and State-Of-The-Art Models in Business Operations / Prestandajämförelse och kostnadsanalys av maskininlärningstekniker : en undersökning av traditionella och toppmoderna modeller inom affärsverksamhet
Lundgren, Jacob; Taheri, Sam, January 2023
As society becomes increasingly data-driven, the use of AI and machine learning is revolutionizing the way companies operate and develop. This study explores the use of AI, Big Data, and Natural Language Processing (NLP) to improve business operations and intelligence in companies. The main purpose of this thesis is to examine whether the current classification process at the host organization can be maintained with reduced operating costs, in particular lower cloud GPU costs. This has the potential to improve the classification method, improve the product the company offers its customers through increased classification accuracy, and strengthen its value proposition. Furthermore, three approaches are evaluated against each other, and the implementations show the development within the field. The models compared in this study include traditional machine learning methods such as Support Vector Machine (SVM) and logistic regression, together with state-of-the-art transformer models such as BERT, both pre-trained and fine-tuned. The paper shows that there is a trade-off between performance and cost, which illustrates the problem that many companies, such as Valu8, face when evaluating which approach to implement. This trade-off is then discussed and analyzed in more detail to explore possible compromises from each perspective, in an attempt to find a balanced solution that combines performance efficiency and cost-effectiveness. / As society is becoming more data-driven, Artificial Intelligence (AI) and Machine Learning are revolutionizing how companies operate and evolve. This study explores the use of AI, Big Data, and Natural Language Processing (NLP) in improving business operations and intelligence in enterprises. The primary objective of this thesis is to examine whether the current classification process at the host company can be maintained with reduced operating costs, specifically lower cloud GPU costs. 
This can improve the classification method, enhance the product the company offers its customers through increased classification accuracy, and strengthen its value proposition. Furthermore, three approaches are evaluated against each other, and the implementations showcase the evolution within the field. The models compared in this study include traditional machine learning methods such as Support Vector Machine (SVM) and Logistic Regression, alongside state-of-the-art transformer models like BERT, both Pre-Trained and Fine-Tuned. The paper shows a trade-off between performance and cost, illustrating the problem that many companies like Valu8 face when evaluating which approach to implement. This trade-off is discussed and analyzed in further detail to explore possible compromises from each perspective and to reach a balanced solution that combines performance efficiency and cost-effectiveness.
|