141

Classifying personal data on contextual information / Klassificering av persondata från kontextuell information

Dath, Carl January 2023
In this thesis, a novel approach to classifying personal data is tested. Previous personal data classification models read the data itself before classifying it. This thesis instead investigates an approach that classifies personal data from contextual information frequently available in data sets, such as the naming and description of the data. The thesis compares the well-researched word embedding methods Word2Vec, Global Vectors for Word Representation (GloVe) and Bidirectional Encoder Representations from Transformers (BERT), used in conjunction with the classification methods Bag-of-Words representation (BOW), Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), on a personal data classification task. The comparisons are made by extrinsically evaluating the embeddings' and models' performance on a sizable collection of well-labeled datasets belonging to Spotify. The results suggest that the embedded representations of the contextual data capture enough information to classify personal data, both when separating personal from non-personal data and when distinguishing different types of personal data from each other.
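To make the contextual approach concrete, here is a minimal sketch: a classifier that labels a field as personal or non-personal from its name and description alone, never reading the underlying values. The field names and labels are hypothetical, and a tf-idf plus logistic regression baseline stands in for the thesis's Word2Vec/GloVe/BERT embeddings.

```python
# Minimal sketch of the contextual approach: classify a dataset field from
# its name and description only. Field names and labels are hypothetical;
# tf-idf substitutes for the embedding methods compared in the thesis.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Contextual information only: "<column name>: <column description>"
contexts = [
    "email_address: contact email of the account holder",
    "track_id: internal identifier of an audio track",
    "birth_date: date of birth of the user",
    "play_count: number of times a track was streamed",
]
labels = ["personal", "non-personal", "personal", "non-personal"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(contexts, labels)
print(clf.predict(["home_city: city where the user lives"]))
```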
142

Parameter Efficiency in Fine-Tuning Pretrained Large Language Models for Downstream Tasks

Dorairaj, Jonathan January 2024
This thesis investigates Parameter-Efficient Fine-Tuning (PEFT) methods, specifically Low-Rank Adaptation (LoRA) (Hu et al. 2021) and Adapters (Houlsby et al. 2019), using the General Language Understanding Evaluation (GLUE) benchmark (Wang et al. 2019). The primary focus is to evaluate the effectiveness and efficiency of these methods in fine-tuning pre-trained language models. Additionally, we introduce a novel application by applying the methodology from Yang et al. 2024 to the adapter module weights. We utilize Laplace approximations over both the LoRA weights (Yang et al. 2024, Daxberger et al. 2022a) and the newly adapted Adapter weights, assessing the Expected Calibration Error (ECE) and Negative Log-Likelihood (NLL). Furthermore, we discuss practical considerations such as the training time, memory usage, and storage requirements of these PEFT techniques. The findings provide valuable insights into the trade-offs and benefits of using LoRA and Adapters for fine-tuning in resource-constrained environments.
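As an illustration of the LoRA setup discussed above, the following sketch uses the Hugging Face peft library; the rank, scaling, and target modules are assumptions for illustration, not values reported in the thesis.

```python
# Sketch of a LoRA configuration like the one evaluated above (a plausible
# setup, not the author's exact code). Only the low-rank A/B update matrices
# are trained; the pretrained base model stays frozen.
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # e.g. a binary GLUE task such as SST-2
)
config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                # rank of the low-rank update matrices (assumed)
    lora_alpha=16,      # scaling factor (assumed)
    lora_dropout=0.1,
    target_modules=["query", "value"],  # attention projections to adapt
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```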
143

Better representation learning for TPMS

Raza, Amir 10 1900
With the increase in popularity of AI and machine learning, participation numbers have exploded at AI/ML conferences. The large number of submitted papers and the evolving nature of topics pose additional challenges for the peer-review systems that are crucial to our scientific communities. Some conferences have moved towards automating the reviewer assignment for submissions, the Toronto Paper Matching System (TPMS) [1] being one such existing system. Currently, TPMS prepares content-based profiles of researchers and submitted papers to model the suitability of reviewer-submission pairs. In this work, we explore different approaches to self-supervised fine-tuning of BERT transformers on conference paper data. We demonstrate some new approaches to augmentation views for self-supervision in natural language processing, which until now has focused more on problems in computer vision. We then use these individual paper representations to build an expertise model that learns to combine the representations of a reviewer's different published works and predict their relevance for reviewing a submitted paper. Finally, we show that better individual paper representations and better expertise modeling lead to better performance on the reviewer-suitability prediction task.
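The expertise model described above can be sketched as follows; this is an illustration under assumed design choices (attention-style pooling over a reviewer's paper embeddings), not the thesis implementation, and the random tensors stand in for real paper representations.

```python
# Sketch of an expertise model: pool the embeddings of a reviewer's
# published papers and score the pooled profile against a submission.
import torch
import torch.nn.functional as F

def reviewer_suitability(paper_embs: torch.Tensor, submission_emb: torch.Tensor) -> torch.Tensor:
    """paper_embs: (n_papers, d) embeddings of the reviewer's publications;
    submission_emb: (d,) embedding of the submitted paper."""
    # Attention-style pooling: papers most similar to the submission
    # contribute more to the reviewer's profile.
    weights = F.softmax(paper_embs @ submission_emb, dim=0)      # (n_papers,)
    profile = (weights.unsqueeze(1) * paper_embs).sum(dim=0)     # (d,)
    return F.cosine_similarity(profile, submission_emb, dim=0)   # scalar score

papers = F.normalize(torch.randn(5, 768), dim=1)   # stand-in BERT embeddings
submission = F.normalize(torch.randn(768), dim=0)
print(reviewer_suitability(papers, submission))
```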
144

Dynamic Network Modeling from Temporal Motifs and Attributed Node Activity

Giselle Zeno (16675878) 26 July 2023
The most important networks from different domains, such as computing, organizational, economic, social, academic, and biological networks, are networks that change over time. For example, an organization has email and collaboration networks (e.g., different people or teams working on a document). Apart from their connectivity changing over time, these networks can contain attributes such as the topic of an email or message, the contents of a document, or the interests of a person in an academic citation or social network. Analyzing these dynamic networks can be critical in decision-making processes. For instance, in an organization, insight into how people from different teams collaborate provides important information that can be used to optimize workflows.

Network generative models provide a way to study and analyze networks. For example, model performance and generalization in tasks like node classification can be benchmarked by evaluating models on synthetic networks generated with varying structure and attribute correlation. In this work, we begin by presenting our systematic study of the impact that graph structure and attribute auto-correlation have on the task of node classification using collective inference. This is the first time such an extensive study has been done. We take advantage of a recently developed method that samples attributed networks (although static ones) with varying network structure jointly with correlated attributes. We find that the graph connectivity that contributes to the network auto-correlation (i.e., the local relationships of nodes) and density have the highest impact on the performance of collective inference methods.

Most of the literature to date has focused on static representations of networks, partially due to the difficulty of finding readily available datasets of dynamic networks. Dynamic network generative models can bridge this gap by generating synthetic graphs similar to observed real-world networks. Given that motifs have been established as building blocks for the structure of real-world networks, modeling them can help to generate the observed graph structure and capture correlations in node connections and activity. We therefore continue with a study of motif evolution in dynamic temporal graphs. Our key insight is that motifs rarely change configuration in fast-changing dynamic networks (e.g., wedges into triangles, and vice versa), but rather keep reappearing at different times while keeping the same configuration. This finding motivates the generative process of our proposed models, which use temporal motifs as building blocks to generate dynamic graphs with links that appear and disappear over time.

Our first proposed model generates dynamic networks based on motif activity and the roles that nodes play in a motif. For example, a wedge is sampled based on the likelihood of one node having the role of hub, with the two other nodes being the spokes. Our model learns all parameters from observed data, with the goal of producing synthetic graphs with similar graph structure and node behavior. We find that using motifs and node roles helps our model generate the more complex structures and the temporal node behavior seen in real-world dynamic networks.

After observing that motif node roles help capture the changing local structure and behavior of nodes, we extend our work to also consider the attributes generated by nodes' activities. We propose a second generative model for attributed dynamic networks that (i) captures network structure dynamics through temporal motifs, and (ii) extends the structural roles of nodes in motifs to roles that generate content embeddings. Our new model is the first to generate synthetic dynamic networks and sample content embeddings based on motif node roles. To the best of our knowledge, it is the only attributed dynamic network model that can generate new content embeddings, not observed in the input graph but still similar to those of the input graph. Our results show that modeling network attributes with higher-order structures (e.g., motifs) improves the quality of the generated networks.

The proposed generative models address the difficulty of finding readily available datasets of dynamic networks, attributed or not. This work will also allow others to (i) generate networks that they can share without divulging individuals' private data, (ii) benchmark model performance, and (iii) explore model generalization under a broader range of conditions, among other uses. Finally, the proposed evaluation measures will elucidate models, allowing fellow researchers to push forward in these domains.
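The motif-based generative process described above might look roughly like the sketch below; the motif rates and hub affinities are hypothetical stand-ins for parameters the proposed models learn from observed data.

```python
# Illustrative sketch (not the thesis implementation): motifs keep their
# configuration and reappear over time, so generation samples
# (motif, role-assigned nodes, timestamp) triples rather than isolated edges.
import random

motif_rates = {"wedge": 0.7, "triangle": 0.3}      # hypothetical, learned from data
hub_affinity = {1: 0.6, 2: 0.2, 3: 0.1, 4: 0.1}    # P(node plays the hub role)

def sample_motif_event(t: float) -> list[tuple[int, int, float]]:
    motif = random.choices(list(motif_rates), weights=motif_rates.values())[0]
    hub = random.choices(list(hub_affinity), weights=hub_affinity.values())[0]
    spokes = random.sample([n for n in hub_affinity if n != hub], 2)
    edges = [(hub, spokes[0], t), (hub, spokes[1], t)]  # wedge: hub to each spoke
    if motif == "triangle":
        edges.append((spokes[0], spokes[1], t))         # close the wedge
    return edges

timeline = [e for t in range(5) for e in sample_motif_event(float(t))]
print(timeline)
```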
145

Exploring DeepSEA CNN and DNABERT for Regulatory Feature Prediction of Non-coding DNA

Stachowicz, Jacob January 2021
Prediction and understanding of the regulatory effects of non-coding DNA is an extensive research area in genomics. Convolutional neural networks (CNNs) have been used successfully in the past to predict regulatory features, making chromatin feature predictions based solely on non-coding DNA sequences. Non-coding DNA shares various similarities with human spoken language, which makes language models such as the Transformer attractive candidates for deciphering the non-coding DNA "language". This thesis investigates how well the Transformer model, usually used for NLP problems, predicts chromatin features from genome sequences compared to convolutional neural networks. More specifically, the CNN DeepSEA, which is used for regulatory feature prediction based on non-coding DNA, is compared with the Transformer DNABERT. Further, this study explores the impact different parameters and training strategies have on performance. Other models (DeeperDeepSEA and DanQ) are also compared on the same tasks to give a broader basis for comparison. Lastly, the same experiments are conducted on modified versions of the dataset where the labels cover different amounts of the DNA sequence. This could prove beneficial to the Transformer model, which can understand and capture long-range dependencies in natural language problems. The replication of DeepSEA was successful and gave results similar to the original model. The experiments used for DeepSEA were also conducted on DNABERT, DeeperDeepSEA, and DanQ. All the models were trained on the different datasets, and their results were compared. Lastly, a prediction voting mechanism combining the outputs of DeepSEA and DNABERT was implemented, which gave better results than the models individually. The results showed that DeepSEA performed slightly better than DNABERT with regard to AUC ROC. The Wilcoxon signed-rank test showed that, even though the two models obtained similar AUC ROC scores, there is a statistically significant difference between the distributions of their predictions. This means that the models look at the dataset differently, which may be why combining their predictions gives good results. Due to the time required to train the computationally heavy DNABERT, the best hyper-parameters and training strategies for the model were not found, only improved upon. The datasets used in this thesis were heavily imbalanced, which needs to be addressed in future projects. This project serves as a good continuation of the paper "Whole-genome deep-learning analysis identifies contribution of non-coding mutations to autism risk", which uses the DeepSEA model to learn more about how specific mutations correlate with Autism Spectrum Disorder.
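A plausible form of the prediction voting mechanism mentioned above is simple probability averaging; the sketch below assumes that form, with random arrays standing in for real model outputs. The 919 labels match DeepSEA's chromatin feature set; chromatin-feature prediction is multi-label, so AUC ROC is averaged over labels.

```python
# Sketch of soft-voting over two models' per-label probabilities (an assumed
# scheme; the abstract does not spell out the exact voting rule).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(1000, 919))   # 919 chromatin features, as in DeepSEA
p_deepsea = rng.random((1000, 919))             # stand-in for DeepSEA outputs
p_dnabert = rng.random((1000, 919))             # stand-in for DNABERT outputs

p_vote = (p_deepsea + p_dnabert) / 2            # average the two predictions
auc = roc_auc_score(y_true, p_vote, average="macro")
print(f"macro AUC ROC of the ensemble: {auc:.3f}")
```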
146

Duplicate Detection and Text Classification on Simplified Technical English / Dublettdetektion och textklassificering på Förenklad Teknisk Engelska

Lund, Max January 2019
This thesis investigates the most effective way of performing classification of text labels and clustering of duplicate texts in technical documentation written in Simplified Technical English. Pre-trained transformer language models (BERT) were tested against traditional methods such as tf-idf with cosine similarity (kNN) and SVMs on the classification task. For detecting duplicate texts, vector representations from pre-trained transformer and LSTM models were tested against tf-idf using the density-based clustering algorithms DBSCAN and HDBSCAN. The results show that traditional methods are comparable to pre-trained models for classification, and that using tf-idf vectors with a low distance threshold in DBSCAN is preferable for duplicate detection.
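The preferred duplicate-detection setup reported above can be sketched as follows; the eps threshold and example sentences are illustrative, not values from the thesis.

```python
# Minimal sketch: tf-idf vectors clustered with DBSCAN under cosine
# distance and a low distance threshold, as the results above recommend.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN

texts = [
    "Remove the four bolts and lift off the cover.",
    "Remove the 4 bolts and lift off the cover.",   # near-duplicate
    "Check the oil level before starting the engine.",
]
X = TfidfVectorizer().fit_transform(texts)
labels = DBSCAN(eps=0.3, min_samples=2, metric="cosine").fit_predict(X)
print(labels)  # duplicates share a cluster id; -1 marks unclustered texts
```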
147

Extractive Multi-document Summarization of News Articles

Grant, Harald January 2019
Publicly available data grows exponentially through web services and technological advancements. Multi-document summarization (MDS) can be used to comprehend such large data streams. This research investigates the area of multi-document summarization. Multiple systems for extractive multi-document summarization are implemented using modern techniques, in the form of the pre-trained BERT language model for word embeddings and sentence classification. This is combined with well-proven techniques: the TextRank ranking algorithm, the Waterfall architecture, and anti-redundancy filtering. The systems are evaluated on the DUC-2002, 2006 and 2007 datasets using the ROUGE metric. The results show that the BM25 sentence representation, implemented in the TextRank model using the Waterfall architecture and an anti-redundancy technique, outperforms the other implementations, providing results competitive with other state-of-the-art systems. A cohesive model is derived from the leading system and tried in a user study using a real-world application: a real-time news detection application with users from the news domain. The study shows a clear preference for cohesive summaries in the case of extractive multi-document summarization, with the cohesive summary preferred in the majority of cases.
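The TextRank step at the core of these systems can be sketched as follows; this illustration uses tf-idf cosine similarity in place of the thesis's BM25 representation and omits the Waterfall architecture and anti-redundancy filtering.

```python
# Sketch of extractive summarization via TextRank: rank sentences by
# PageRank over a similarity graph and extract the top-scoring ones.
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = [
    "The flood forced thousands of residents to evacuate.",
    "Thousands were evacuated as the river burst its banks.",
    "Officials promised financial aid for the affected region.",
    "Aid packages for flood victims were announced by the government.",
]
sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
graph = nx.from_numpy_array(sim)                 # weighted sentence graph
scores = nx.pagerank(graph, weight="weight")     # TextRank = PageRank on it
summary = sorted(range(len(sentences)), key=scores.get, reverse=True)[:2]
print([sentences[i] for i in summary])
```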
148

Sociální prostředí a vazby klienta z pohledu systemických konstelací / Social environment and relationships of a client in terms of systemic constellations

BŘEZINOVÁ, Lenka January 2009
This diploma thesis addresses the relationship between systemic constellations, the social environment, and social work. The objective was to ascertain the reasons for which clients most often use this method of work. The questions we wished to answer included how the respondents came across systemic constellations, how long their decision-making took, and which primary impulse was the most important. It was also interesting to learn what experience they had with this method of work, what effect on their lives they perceived, and what benefit they saw in the method for themselves. The thesis includes a description of the respondents' first experience with a constellation elaborated for them.
149

VGCN-BERT: augmenting BERT with graph embedding for text classification: application to offensive language detection

Lu, Zhibin 05 1900
Hate speech is a serious problem on social media. In this thesis, we investigate the problem of automatic detection of hate speech on social media and cast it as a text classification problem. With the development of deep learning, text classification has made great progress in recent years. In particular, models using attention mechanisms, such as BERT, have shown a great capability for capturing the local contextual information within a sentence or document. Although local connections between words in a sentence can be captured, their ability to capture certain application-dependent global information and long-range semantic dependencies is limited. Recently, a new type of neural network, the Graph Convolutional Network (GCN), has attracted much attention. It provides an effective mechanism to take global information into account via a convolutional operation on a global graph, and it has achieved good results in many tasks, including text classification. In this thesis, we propose a method that combines the advantages of the BERT model, which is excellent at exploiting local information from a text, with those of the GCN model, which provides application-dependent global language information. However, the traditional GCN is a transductive learning model: it performs a convolutional operation on a graph composed of task entities (i.e., a document graph) and cannot be applied directly to a new document. We first propose a novel Vocabulary GCN model (VGCN), which transforms the document-level convolution of the traditional GCN into a word-level convolution using a word graph created from word co-occurrences. In this way, we change the training of the GCN from the transductive to the inductive learning mode, so that it can be applied to new documents. Secondly, we propose an Interactive-VGCN-BERT model that combines our VGCN model with BERT. In this model, local information, including dependencies between words in a sentence, is captured by BERT, while the global information reflecting the relations between words in a language (e.g., related words) is captured by VGCN. In addition, local and global information interact through different layers of BERT, allowing them to influence each other and to build together a final representation for classification. In so doing, the global language information can help distinguish ambiguous words or understand unclear expressions, thereby improving the performance of text classification tasks. To evaluate the effectiveness of our Interactive-VGCN-BERT model, we conduct experiments on several datasets of different types (hate speech detection, as well as movie reviews and grammaticality judgments) and compare it with several state-of-the-art baseline models. Experimental results show that our Interactive-VGCN-BERT outperforms all the other models, such as Vanilla-VGCN-BERT, BERT, Bi-LSTM, MLP, and GCN. In particular, we found that VGCN can indeed help understand an implicitly hateful text when integrated with BERT, confirming our intuition to combine the two mechanisms.
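To make the VGCN construction concrete, here is a sketch of a vocabulary graph built from positive pointwise mutual information (PMI) over word co-occurrences, followed by one word-level graph convolution; the dimensions and data are toy stand-ins, and the interaction with BERT layers is omitted.

```python
# Sketch of the VGCN idea: build a word graph from co-occurrence PMI, then
# apply one graph convolution A_hat @ X @ W so each word embedding absorbs
# global co-occurrence information before being fused with BERT's local view.
import numpy as np

def pmi_adjacency(counts: np.ndarray) -> np.ndarray:
    """counts[i, j] = co-occurrence count of words i and j in sliding windows."""
    p_ij = counts / counts.sum()
    p_i = counts.sum(axis=1) / counts.sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_ij / np.outer(p_i, p_i))
    adj = np.where(pmi > 0, pmi, 0.0)              # keep positive-PMI edges
    np.fill_diagonal(adj, 1.0)                     # add self-loops
    d_inv_sqrt = np.diag(adj.sum(axis=1) ** -0.5)
    return d_inv_sqrt @ adj @ d_inv_sqrt           # symmetric normalization

vocab_size, emb_dim, gcn_dim = 4, 8, 3
counts = np.random.default_rng(0).integers(1, 20, (vocab_size, vocab_size))
counts = (counts + counts.T) // 2                  # symmetric co-occurrence
X = np.random.default_rng(1).standard_normal((vocab_size, emb_dim))
W = np.random.default_rng(2).standard_normal((emb_dim, gcn_dim))
H = pmi_adjacency(counts.astype(float)) @ X @ W    # word-level graph convolution
print(H.shape)  # (vocab_size, gcn_dim), ready to combine with BERT inputs
```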
150

Filtrování spamových zpráv pomocí metod umělé inteligence / Email spam filtering using artificial intelligence

Safonov, Yehor January 2020
In the modern world, email is the most widely used technology for exchanging messages between users. Its popularity and rapid growth rest on three pillars: free availability, efficiency, and intuitive exchange of information. All of these constitute a significant advantage in the provision of communication services. On the other hand, the growing popularity of email technologies poses considerable security risks and transforms email into a universal tool for spreading unsolicited content. Potential attacks may be aimed at specific endpoints or at whole computer infrastructures. Despite achieving high accuracy in spam filtering, traditional techniques often fail to keep up with the rapid growth and evolution of spamming techniques. These approaches suffer from overfitting, convergence to poor local minima, inefficiency in high-dimensional data processing, and long-term maintainability issues. One of the main goals of this master's thesis is to develop and train deep neural networks, using the latest machine learning techniques, to successfully solve the text-based spam classification problem, which belongs to the Natural Language Processing (NLP) domain. From a theoretical point of view, the thesis focuses on email communication with an emphasis on spam filtering. Subsequent parts of the thesis turn to the domain of machine learning and artificial neural networks, discussing their operating principles and basic properties. The theoretical part also covers possible ways of applying the described techniques to text analysis and to solving NLP tasks. One of the key aspects of the study is a detailed comparison of current machine learning methods, their specifics, and their accuracy when applied to spam filtering. The practical part begins with the processing of the email dataset. This phase was divided into five stages, with the motivation of preserving the key features of the raw data and increasing the final quality of the dataset. The resulting dataset was used for training, testing, and validating the chosen deep neural networks. The selected models, ULMFiT, BERT, and XLNet, were successfully implemented. The thesis includes a description of the final data adaptation, the training of the neural networks, and their testing and validation. At the end of the work, the implemented models are compared using a confusion matrix, possible improvements are outlined, and a concise conclusion is drawn.
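As a sketch of the final comparison step, here is how a spam classifier's predictions can be summarized with a confusion matrix; the labels are illustrative toy data, not outputs of the thesis models.

```python
# Sketch of the confusion-matrix comparison described above, applied to
# hypothetical predictions of a spam/ham classifier.
from sklearn.metrics import confusion_matrix, classification_report

y_true = ["spam", "ham", "spam", "ham", "spam", "ham", "spam", "ham"]
y_pred = ["spam", "ham", "spam", "spam", "spam", "ham", "ham", "ham"]

print(confusion_matrix(y_true, y_pred, labels=["spam", "ham"]))
print(classification_report(y_true, y_pred, labels=["spam", "ham"]))
```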
