21

Image-classification for Brain Tumor using Pre-trained Convolutional Neural Network / Bildklassificering för hjärntumör med hjälp av förtränat konvolutionellt neuralt nätverk

Alsabbagh, Bushra January 2023
Brain tumor is a disease characterized by uncontrolled growth of abnormal cells in the brain. The brain is responsible for regulating the functions of all other organs; hence, any atypical growth of cells in the brain can have severe implications for its functions. Global mortality from brain cancer in 2020 was estimated at 251,329 deaths. Early detection of brain cancer is therefore critical for prompt treatment and for improving patients' quality of life and survival rates. Manual medical image classification for diagnosing diseases has been shown to be extremely time-consuming and labor-intensive. Convolutional Neural Networks (CNNs) have proven to be a leading algorithm in image classification, outperforming humans. This paper compares five CNN architectures, namely VGG-16, VGG-19, AlexNet, EfficientNetB7, and ResNet-50, in terms of performance and accuracy using transfer learning. In addition, the paper discusses the economic impact of CNNs, as an AI approach, on the healthcare sector. The models' performance is demonstrated using loss and accuracy curves as well as confusion matrices. In the conducted experiment, VGG-19 achieved the best performance with 97% accuracy, while EfficientNetB7 achieved the worst performance with 93% accuracy.
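To make the transfer-learning setup concrete, the sketch below shows how a pre-trained VGG-19 backbone could be reused for tumor classification with PyTorch/torchvision. It is a minimal illustration rather than the thesis code: the number of classes, the frozen backbone, and the single dummy training step are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

# Hypothetical setup: four MRI classes (e.g. glioma, meningioma, pituitary, no tumor).
NUM_CLASSES = 4

# Load VGG-19 with ImageNet weights and freeze the convolutional backbone.
model = models.vgg19(weights="IMAGENET1K_V1")
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final fully connected layer so the head matches our classes.
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)

# Only the classifier parameters are updated during fine-tuning.
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of 224x224 RGB images.
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, NUM_CLASSES, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```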
22

Fine-Tuning Pre-Trained Language Models for CEFR-Level and Keyword Conditioned Text Generation : A comparison between Google’s T5 and OpenAI’s GPT-2 / Finjustering av förtränade språkmodeller för CEFR-nivå och nyckelordsbetingad textgenerering : En jämförelse mellan Googles T5 och OpenAIs GPT-2

Roos, Quintus January 2022
This thesis investigates the possibilities of conditionally generating English sentences based on keywords framing the content and on different vocabulary difficulty levels. It aims to contribute to the field of Conditional Text Generation (CTG), a type of Natural Language Generation (NLG) in which text is created subject to a set of conditions, such as words, topics, content, or perceived sentiments. Specifically, it compares the performance of two well-known model architectures: Sequence-to-Sequence (Seq2Seq) and Autoregressive (AR). These are applied to two tasks, both individually and in combination. The Common European Framework of Reference (CEFR) is used to assess the vocabulary level of the texts. In the absence of openly available CEFR-labelled datasets, the author has developed a new methodology with the host company to generate suitable datasets. The generated texts are evaluated on the accuracy of their vocabulary levels and on readability using readily available formulas. The analysis combines four established readability metrics and assesses classification accuracy. Both models show a high degree of accuracy when classifying texts into different CEFR levels. However, the same models are weaker when generating sentences based on a desired CEFR level. This study contributes empirical evidence suggesting that: (1) Seq2Seq models have a higher accuracy than AR models in generating English sentences based on a desired CEFR level and keywords; (2) combining Multi-Task Learning (MTL) with instruction-tuning is an effective way to fine-tune models on text-classification tasks; and (3) it is difficult to assess the quality of computer-generated language using only readability metrics.
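As an illustration of how keyword and level conditioning could be encoded for a Seq2Seq model, the sketch below builds a control prefix and generates text with Hugging Face's T5. The prompt format and checkpoint name are assumptions; a base t5-small would need fine-tuning on CEFR-labelled data before it actually follows such conditions.

```python
from transformers import T5ForConditionalGeneration, T5TokenizerFast

# A base checkpoint stands in for the fine-tuned model described in the thesis.
tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Hypothetical control prefix: desired CEFR level plus keywords framing the content.
prompt = "generate: level: B1 | keywords: library, borrow, weekend"

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=40, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```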
23

Smart Auto-completion in Live Chat Utilizing the Power of T5 / Smart automatisk komplettering i livechatt som utnyttjar styrkan hos T5

Wang, Zhanpeng January 2021
Auto-completion is a task that requires an algorithm to give suggestions for completing sentences. Specifically, the live-chat history and the words already typed by the agents are provided to the algorithm, which outputs suggestions for finishing the sentences. This study aimed to investigate whether the above task can be handled by fine-tuning a pre-trained T5 model on the target dataset. In this thesis, both an English and a Portuguese dataset were selected. Then, T5 and its multilingual version mT5 were fine-tuned on the target datasets. The models were evaluated with different metrics (log perplexity, token-level accuracy, and multi-word-level accuracy), and the results are compared to those of the baseline methods. The results on these metrics show that a method based on a pre-trained T5 is a promising approach to the target task.
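A minimal sketch of how one fine-tuning example for the completion task could be encoded as a Seq2Seq pair, assuming the Hugging Face transformers library; the separator format and the example texts are invented for illustration.

```python
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Hypothetical training pair: chat history plus the agent's partial sentence
# as the source, the rest of the sentence as the target to be generated.
source = ("complete: customer: My order has not arrived yet. "
          "agent: I am sorry to hear that, let me")
target = "check the status of your order right away."

enc = tokenizer(source, return_tensors="pt", truncation=True)
labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids

# One fine-tuning step: standard seq2seq cross-entropy on the completion.
loss = model(input_ids=enc.input_ids, attention_mask=enc.attention_mask, labels=labels).loss
loss.backward()
print(float(loss))
```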
24

Vitiligo image classification using pre-trained Convolutional Neural Network Architectures, and its economic impact on health care / Vitiligo bildklassificering med hjälp av förtränade konvolutionella neurala nätverksarkitekturer och dess ekonomiska inverkan på sjukvården

Bashar, Nour, Alsaid Suliman, MRami January 2022
Vitiligo is a skin disease in which the pigment cells that produce melanin die or stop functioning, causing white patches to appear on the body. Although vitiligo is not considered a serious disease, it can indicate that something is wrong with a person's immune system. In recent years, the use of medical image processing techniques has grown, and research continues to develop new techniques for analysing and processing medical images. Deep convolutional neural networks have proven their effectiveness in many medical image classification tasks, which suggests that they may also perform well in vitiligo classification. Our study uses four deep convolutional neural networks to classify images of vitiligo and normal skin. The architectures selected are VGG-19, ResNeXt101, InceptionResNetV2, and InceptionV3. ROC and AUC metrics are used to assess each model's performance. In addition, the authors investigate the economic benefits that this technology may provide to the healthcare system and patients. To train and evaluate the CNN models, the authors used a dataset that contains 1341 images in total. Because the dataset is limited, 5-fold cross-validation is also employed to improve the models' predictions. The results demonstrate that InceptionV3 achieves the best performance in the classification of vitiligo, with an AUC value of 0.9111, while InceptionResNetV2 has the lowest AUC value of 0.8560.
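The sketch below illustrates the 5-fold cross-validation and AUC bookkeeping described above, assuming scikit-learn. A logistic-regression classifier over random feature vectors stands in for the CNN architectures and the real 1341-image dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.random((1341, 128))       # placeholder features instead of real images
y = rng.integers(0, 2, 1341)      # 1 = vitiligo, 0 = normal skin (synthetic labels)

aucs = []
for train_idx, test_idx in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    probs = clf.predict_proba(X[test_idx])[:, 1]   # probability of the vitiligo class
    aucs.append(roc_auc_score(y[test_idx], probs))

print(f"AUC per fold: {np.round(aucs, 4)}, mean: {np.mean(aucs):.4f}")
```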
25

INTEGRATION OF UAV AND LLM IN AGRICULTURAL ENVIRONMENT

Sudeep Reddy Angamgari (20431028) 16 December 2024
<p dir="ltr">Unmanned Aerial Vehicles (UAVs) are increasingly applied in agricultural tasks such as crop monitoring, especially with AI-driven enhancements significantly increasing their autonomy and ability to execute complex operations without human interventions. However, existing UAV systems lack efficiency, intuitive user interfaces using natural language processing for command input, and robust security which is essential for real-time operations in dynamic environments. In this paper, we propose a novel solution to create a secure, efficient, and user-friendly interface for UAV control by integrating Large Language Model (LLM) with the case study on agricultural environment. In particular, we designed a four-stage approach that allows only authorized user to issue voice commands to the UAV. The command is issued to the LLM controller processed by LLM using API and generates UAV control code. Additionally, we focus on optimizing UAV battery life and enhancing scene interpretation of the environment. We evaluate our approach using AirSim and an agricultural setting built in Unreal Engine, testing under various conditions, including variable weather and wind factors. Our experimental results confirm our method's effectiveness, demonstrating improved operational efficiency and adaptability in diverse agricultural scenarios.</p>
26

Deriving a Natural Language Processing Inference Cost Model with Greenhouse Gas Accounting : Towards a sustainable usage of Machine Learning / Härledning av en Kostnadsmodell med växthusgasredovisning angående slutledning inom Naturlig Språkbehandling : Mot en hållbar användning av Maskininlärning

Axberg, Tom January 2022
Interest in using State-Of-The-Art (SOTA) Pre-Trained Language Models (PLMs) in product development is growing. The availability of PLMs has changed the way reliable models are built, and they are the go-to method for many companies and organizations. Selecting the Natural Language Processing (NLP) model with the highest accuracy is the usual way of deciding which PLM to use. However, with growing concern about climate change, we need new ways of making decisions that consider the impact on our future needs. The solution with the highest accuracy might not be the best choice when other parameters matter, such as sustainable development. This thesis investigates how to calculate an approximate total cost, considering Operating Expenditure (OPEX) and CO2 emissions, for a deployed NLP solution over a given period, specifically the inference phase. We try to predict the total cost from Floating Point Operations (FLOPs) and test NLP models on a classification task. We further present the tools for making energy measurements and examine FLOPs as a metric for predicting costs. Using a bottom-up approach, we investigate the components that affect the cost and measure the energy consumption of different deployed models. By constructing this cost model and testing it against real-life examples, essential information about a given NLP implementation and the relationship between monetary and environmental costs is derived. The literature studies reveal that deriving such a cost model is a complex undertaking, and the results confirm that approximating energy costs is not a straightforward procedure. Even though a cost model was not feasible to derive with the resources given, this thesis covers the area and shows why it is complex by examining FLOPs.
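To show the bottom-up structure of such a cost model, the sketch below converts per-query FLOPs into energy, money, and CO2. Every constant (hardware efficiency, electricity price, grid carbon intensity, query volume) is an assumed placeholder, and, as the thesis notes, real estimates are considerably harder to pin down.

```python
def inference_cost(flops_per_query: float,
                   queries_per_day: float,
                   days: int,
                   joules_per_flop: float = 1e-11,    # ~100 GFLOPs per joule, assumed
                   usd_per_kwh: float = 0.15,         # assumed electricity price
                   kg_co2_per_kwh: float = 0.30):     # assumed grid carbon intensity
    """Bottom-up estimate: FLOPs -> energy (kWh) -> OPEX (USD) and CO2 (kg)."""
    energy_kwh = flops_per_query * queries_per_day * days * joules_per_flop / 3.6e6
    return {
        "energy_kWh": round(energy_kwh, 1),
        "opex_usd": round(energy_kwh * usd_per_kwh, 1),
        "co2_kg": round(energy_kwh * kg_co2_per_kwh, 1),
    }

# Example: ~2.8e10 FLOPs per classification query (roughly a BERT-base forward pass
# on a 128-token input), one million queries per day, for a year.
print(inference_cost(flops_per_query=2.8e10, queries_per_day=1e6, days=365))
```

Note that the idealized efficiency constant above yields an optimistic lower bound; measured energy per query is typically much higher, which is part of why the thesis finds FLOP-based prediction difficult.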
27

Dynamic Network Modeling from Temporal Motifs and Attributed Node Activity

Giselle Zeno (16675878) 26 July 2023
The most important networks from different domains (computing, organizational, economic, social, academic, and biological) are networks that change over time. For example, in an organization there are email and collaboration networks (e.g., different people or teams working on a document). Apart from their connectivity changing over time, these networks can carry attributes such as the topic of an email or message, the contents of a document, or the interests of a person in an academic citation or a social network. Analyzing these dynamic networks can be critical in decision-making processes. For instance, in an organization, insight into how people from different teams collaborate provides important information that can be used to optimize workflows.

Network generative models provide a way to study and analyze networks. For example, benchmarking model performance and generalization in tasks like node classification can be done by evaluating models on synthetic networks generated with varying structure and attribute correlation. In this work, we begin by presenting our systematic study of the impact that graph structure and attribute auto-correlation have on the task of node classification using collective inference. This is the first time such an extensive study has been done. We take advantage of a recently developed method that samples attributed networks (although static ones) with varying network structure jointly with correlated attributes. We find that the graph connectivity that contributes to the network auto-correlation (i.e., the local relationships of nodes) and the density have the highest impact on the performance of collective inference methods.

Most of the literature to date has focused on static representations of networks, partially due to the difficulty of finding readily available datasets of dynamic networks. Dynamic network generative models can bridge this gap by generating synthetic graphs similar to observed real-world networks. Given that motifs have been established as building blocks for the structure of real-world networks, modeling them can help to generate the observed graph structure and capture correlations in node connections and activity. Therefore, we continue with a study of motif evolution in dynamic temporal graphs. Our key insight is that motifs rarely change configurations in fast-changing dynamic networks (e.g., wedges into triangles, and vice versa), but rather keep reappearing at different times while keeping the same configuration. This finding motivates the generative process of our proposed models, which use temporal motifs as building blocks to generate dynamic graphs with links that appear and disappear over time.

Our first proposed model generates dynamic networks based on motif activity and the roles that nodes play in a motif. For example, a wedge is sampled based on the likelihood of one node having the role of hub with the two other nodes being the spokes. Our model learns all parameters from observed data, with the goal of producing synthetic graphs with similar graph structure and node behavior. We find that using motifs and node roles helps our model generate the more complex structures and the temporal node behavior seen in real-world dynamic networks.

After observing that using motif node roles helps to capture the changing local structure and behavior of nodes, we extend our work to also consider the attributes generated by nodes' activities. We propose a second generative model for attributed dynamic networks that (i) captures network structure dynamics through temporal motifs, and (ii) extends the structural roles of nodes in motifs to roles that generate content embeddings. Our new proposed model is the first to generate synthetic dynamic networks and sample content embeddings based on motif node roles. To the best of our knowledge, it is the only attributed dynamic network model that can generate new content embeddings, not observed in the input graph but still similar to those of the input graph. Our results show that modeling the network attributes with higher-order structures (e.g., motifs) improves the quality of the networks generated.

The proposed generative models address the difficulty of finding readily available datasets of dynamic networks, attributed or not. This work will also allow others to (i) generate networks that they can share without divulging individuals' private data, (ii) benchmark model performance, and (iii) explore model generalization in a broader range of conditions, among other uses. Finally, the proposed evaluation measures will help elucidate models, allowing fellow researchers to push forward in these domains.
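A toy sketch of the "re-appearing motif" idea that motivates these generative models: each wedge keeps its node configuration (one hub, two spokes) and is simply switched on or off at each timestep. The motif list and reappearance probability are invented; the actual models learn motif activity and node roles from observed data.

```python
import random

random.seed(0)

# Illustrative wedges as (hub, spoke, spoke) triples; configurations never change,
# the motifs just re-appear at different times, as observed in real dynamic graphs.
WEDGES = [(0, 1, 2), (2, 3, 4), (1, 4, 5)]
REAPPEAR_PROB = 0.4   # assumed per-timestep activity probability

def generate_dynamic_graph(num_steps: int):
    snapshots = []
    for t in range(num_steps):
        edges = set()
        for hub, s1, s2 in WEDGES:
            if random.random() < REAPPEAR_PROB:   # motif active at this timestep
                edges.update({(hub, s1), (hub, s2)})
        snapshots.append((t, sorted(edges)))
    return snapshots

for t, edges in generate_dynamic_graph(5):
    print(f"t={t}: {edges}")
```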
28

Aligning language models to code : exploring efficient, temporal, and preference alignment for code generation

Weyssow, Martin 09 1900
Pre-trained and large language models (PLMs, LLMs) have had a transformative impact on the artificial intelligence (AI) for software engineering (SE) research field. Through large-scale pre-training on terabytes of natural and programming language data, these models excel in generative coding tasks such as program repair and code generation. Existing approaches to aligning a model's behaviour with specific tasks propose using methods such as prompting or fine-tuning to improve its effectiveness. Nevertheless, it remains unclear how to align code PLMs and LLMs to more complex scenarios that extend beyond task effectiveness. We focus on model alignment in three overlooked scenarios for code generation, each addressing a specific objective: optimizing fine-tuning costs, aligning models with new data while retaining previous knowledge, and aligning with user coding preferences or non-functional requirements. We explore these scenarios in three articles, which constitute the main contributions of this thesis. In the first article, we conduct an empirical study on parameter-efficient fine-tuning techniques (PEFTs) for code LLMs in resource-constrained settings. Our study reveals the superiority of PEFTs over few-shot learning, showing that PEFTs like LoRA and QLoRA allow fine-tuning LLMs with up to 33 billion parameters on a single 24 GB GPU without compromising task effectiveness. In the second article, we examine the behaviour of code PLMs in a continual fine-tuning setting, where the model acquires new knowledge from sequential domain-specific datasets. Each dataset introduces new data about third-party libraries not seen during pre-training or previous fine-tuning. We demonstrate that sequential fine-tuning leads to catastrophic forgetting and implement replay- and regularization-based continual learning approaches, showcasing their superiority in balancing task effectiveness and knowledge retention. In the third article, we introduce CodeUltraFeedback and CODAL-Bench, a novel dataset and benchmark for aligning code LLMs with user coding preferences or non-functional requirements. Our experiments reveal that tuning LLMs with reinforcement learning techniques such as direct preference optimization (DPO) on CodeUltraFeedback yields LLMs that are better aligned with coding preferences and substantially improves the functional correctness of LLM-generated code.
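As a sketch of the parameter-efficient setup studied in the first article, the snippet below attaches LoRA adapters to a 4-bit-quantized code LLM using the transformers, bitsandbytes, and peft libraries. The checkpoint name, target modules, and LoRA hyperparameters are illustrative assumptions, and a CUDA GPU is required for the 4-bit loading.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "codellama/CodeLlama-7b-hf"   # illustrative code LLM checkpoint

# QLoRA-style setup: load the frozen base model in 4-bit precision...
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
base_model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)

# ...and train only small low-rank adapters on the attention projections.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # assumed module names (LLaMA-style blocks)
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()   # typically well under 1% of the base parameters
```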
29

Representation Learning for Biomedical Text Mining

Sänger, Mario 10 January 2025
With the rapid growth of biomedical literature, obtaining comprehensive information about particular biomedical entities and relations by reading alone is becoming increasingly difficult. Text mining approaches seek to facilitate processing these vast amounts of text using machine learning. This makes the effective and efficient encoding of all relevant information about specific entities a central challenge in these approaches. In this thesis, we contribute to this research by developing machine learning methods for learning entity and text representations based on large-scale publication repositories and diverse information from in-domain knowledge bases. First, we propose two novel relation extraction approaches that use representation learning techniques to create comprehensive models of entities or entity pairs. These models learn low-dimensional embeddings by considering all publications from PubMed mentioning a specific entity or pair of entities. We use these embeddings as input for a neural network to classify relations globally, i.e., predictions are based on the entire corpus, not on single sentences or articles as in prior art. In our second contribution, we investigate the impact of multi-modal entity information on biomedical link prediction using knowledge graph embedding methods (KGEMs). Our study enhances existing KGEMs by augmenting biomedical knowledge graphs with multi-modal entity information from in-domain databases. We propose a general framework for integrating this information into the KGEM entity representation learning process. In our third contribution, we augment pre-trained language models (PLMs) with additional context information to identify interactions described in scientific texts. We perform an extensive benchmark that assesses the performance of such models across a wide range of biomedical relation scenarios, providing a comprehensive, but so far missing, evaluation of knowledge-augmented PLM-based extraction models.
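A toy sketch of the second contribution's idea: a translation-based knowledge-graph scorer whose entity representations are fused with a text embedding of the entity (e.g., its database description). The dimensions, the fusion-by-addition scheme, and the PLM embedding size are assumptions; the thesis proposes a general framework rather than this specific recipe.

```python
import torch

torch.manual_seed(0)
NUM_ENTITIES, NUM_RELATIONS, DIM, TEXT_DIM = 1000, 12, 64, 768

entity_emb = torch.nn.Embedding(NUM_ENTITIES, DIM)
relation_emb = torch.nn.Embedding(NUM_RELATIONS, DIM)
text_proj = torch.nn.Linear(TEXT_DIM, DIM)   # maps a PLM sentence embedding into KG space

def fused_entity(idx: torch.Tensor, text_embedding: torch.Tensor) -> torch.Tensor:
    """Structural entity embedding enriched with multi-modal (textual) information."""
    return entity_emb(idx) + text_proj(text_embedding)

def transe_score(head, head_text, rel, tail, tail_text) -> torch.Tensor:
    """TransE-style plausibility: head + relation should lie close to tail."""
    h = fused_entity(head, head_text)
    t = fused_entity(tail, tail_text)
    return -torch.norm(h + relation_emb(rel) - t, p=1, dim=-1)

# Score one hypothetical (drug, treats, disease) triple with random text embeddings.
score = transe_score(torch.tensor([3]), torch.randn(1, TEXT_DIM),
                     torch.tensor([1]),
                     torch.tensor([7]), torch.randn(1, TEXT_DIM))
print(score)
```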
