51 |
Neural Representation Learning for Semi-Supervised Node Classification and Explainability. Hogun Park (9179561), 28 July 2020.
Many real-world domains are relational, consisting of objects (e.g., users and papers) linked to each other in various ways. Because class labels in graphs are often available for only a subset of the nodes, semi-supervised learning for graphs has been studied extensively to predict the unobserved class labels. For example, we can predict political views in a partially labeled social graph and estimate the expected gross incomes of movies in an actor/movie graph with few labels. Recently, advances in representation learning for graph data have made great strides in semi-supervised node classification. However, most methods have focused on learning node representations by considering simple relational properties (e.g., random walks) or aggregating nearby attributes, and it remains challenging to learn complex interaction patterns in partially labeled graphs and to provide explanations of the learned representations.

In this dissertation, multiple methods are proposed to alleviate both challenges for semi-supervised node classification. First, we propose a graph neural network architecture, REGNN, that leverages local inferences for unlabeled nodes. REGNN performs graph convolution to enable label propagation via high-order paths and predicts class labels for unlabeled nodes. In particular, its attention layer measures role equivalence among nodes and effectively reduces the noise generated while aggregating observed labels from neighbors at various distances. Second, we propose a neural network architecture that jointly captures both temporal and static interaction patterns, which we call Temporal-Static-Graph-Net (TSGNet). The architecture learns a latent representation of each node in order to encode complex interaction patterns. Our key insight is that leveraging both a static neighbor encoder, which learns aggregate neighbor patterns, and a graph neural network-based recurrent unit, which captures complex interaction patterns, improves the performance of node classification. Lastly, despite their better performance on node classification tasks, neural network-based representation learning models are still less interpretable than previous relational learning models due to the lack of explanation methods. To address this problem, we show that nodes with high bridgeness scores have larger impacts under perturbation on node embeddings such as DeepWalk, LINE, Struc2Vec, and PTE. However, computing bridgeness scores is computationally heavy, so we propose a novel gradient-based explanation method, GRAPH-wGD, to find nodes with high bridgeness efficiently. In our evaluations, the proposed architectures (REGNN and TSGNet) consistently improve predictive performance for semi-supervised node classification on real-world datasets. GRAPH-wGD also identifies important nodes as global explanations: perturbing the highly ranked nodes and re-learning low-dimensional node representations for DeepWalk and LINE significantly changes both the predicted class probabilities and the k-nearest neighbors in the embedding space.
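To make the label-propagation idea above concrete, here is a minimal sketch, not the REGNN implementation itself: observed one-hot labels are spread along high-order paths of a symmetrically normalized adjacency matrix, with labeled nodes clamped after each hop. The toy graph, the number of hops, and the clamping schedule are illustrative assumptions.

```python
# Minimal sketch (not REGNN): label propagation through high-order paths
# using a symmetric-normalized adjacency, in plain NumPy.
import numpy as np

def normalize_adj(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, as in GCNs."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def propagate_labels(A, Y, labeled_mask, num_hops=3):
    """Spread observed one-hot labels Y along paths of length <= num_hops."""
    S = normalize_adj(A)
    F = np.where(labeled_mask[:, None], Y, 0.0)
    for _ in range(num_hops):
        F = S @ F
        F[labeled_mask] = Y[labeled_mask]   # clamp observed labels each hop
    return F.argmax(axis=1)

# Toy 5-node graph: labels known for nodes 0 and 4 only.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
Y = np.zeros((5, 2)); Y[0, 0] = 1; Y[4, 1] = 1
mask = np.array([True, False, False, False, True])
print(propagate_labels(A, Y, mask))   # predicted class per node
```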
|
52 |
Self-supervised Representation Learning via Image Out-painting for Medical Image Analysis. January 2020.
abstract: In recent years, Convolutional Neural Networks (CNNs) have been widely used not only in the computer vision community but also in the medical imaging community. In particular, the use of CNNs pre-trained on large-scale datasets (e.g., ImageNet) via transfer learning has become the de facto standard for a variety of medical imaging applications in both communities.
However, to fit this paradigm, 3D imaging tasks have to be reformulated and solved in 2D, losing rich 3D contextual information. Moreover, models pre-trained on natural images never see any biomedical images and have no knowledge of the anatomical structures present in medical images. To overcome these limitations, this thesis proposes an image out-painting self-supervised proxy task to develop pre-trained models directly from medical images, without requiring systematic annotation. The idea is to randomly mask an image and train the model to predict the missing region. By predicting missing anatomical structures when seeing only part of the image, the model learns a generic representation that yields better performance on various medical imaging applications via transfer learning.
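As an illustration of the out-painting proxy task described above, the following hedged sketch (not the thesis code) keeps only a random window of each image, zeroes the rest, and trains a small network to reconstruct the full image. The toy architecture, image sizes, and random stand-in data are assumptions.

```python
# Hedged sketch of the out-painting proxy task: crop a random window from each
# image, feed only that window (rest zeroed), and reconstruct the full image.
import torch
import torch.nn as nn

def mask_outside_window(x, win=16):
    """Zero everything outside a random win x win window (the visible region)."""
    b, c, h, w = x.shape
    masked = torch.zeros_like(x)
    for i in range(b):
        top = torch.randint(0, h - win + 1, (1,)).item()
        left = torch.randint(0, w - win + 1, (1,)).item()
        masked[i, :, top:top+win, left:left+win] = x[i, :, top:top+win, left:left+win]
    return masked

model = nn.Sequential(          # toy encoder-decoder; real work uses deeper CNNs
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):                      # toy training loop
    x = torch.rand(8, 1, 32, 32)             # stand-in for unlabeled medical images
    loss = nn.functional.mse_loss(model(mask_outside_window(x)), x)
    opt.zero_grad(); loss.backward(); opt.step()
# After pre-training, the encoder is fine-tuned on the target task.
```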
Extensive experiments demonstrate that the proposed proxy task outperforms training from scratch in six out of seven medical imaging applications covering 2D and 3D classification and segmentation. Moreover, the image out-painting proxy task offers performance competitive with state-of-the-art models pre-trained on ImageNet and with other self-supervised baselines such as in-painting. Owing to its outstanding performance, out-painting is utilized as one of the self-supervised proxy tasks to provide generic 3D pre-trained models for medical image analysis. / Dissertation/Thesis / Master's Thesis, Computer Science, 2020
|
53 |
Deceptive Review Identification via Reviewer Network Representation Learning. Shih-Feng Yang (11502553), 19 December 2021.
With the growth in popularity of e-commerce and mobile apps during the past decade, people rely on online reviews more than ever before for purchasing products, booking hotels, and choosing all kinds of services. Users share their opinions by posting product reviews on merchant sites or online review websites (e.g., Yelp, Amazon, TripAdvisor). Although online reviews are valuable information for people interested in products and services, many reviews are manipulated by spammers to provide untruthful information for business competition. Since deceptive reviews can damage the reputation of brands and mislead customers' buying decisions, the identification of fake reviews has become an important topic for online merchants. Among the computational approaches proposed for fake review identification, network-based fake review analysis jointly considers information from review text, reviewer behaviors, and product information. Researchers have proposed network-based methods (e.g., metapath-based methods) on heterogeneous networks, which have shown promising results.

However, we identify two research gaps in this study: 1) We argue that previous network-based reviewer representations are not sufficient to preserve the relationships of reviewers in networks. Specifically, previous studies considered only first-order proximity, which indicates an observable connection between reviewers, but not second-order proximity, which captures the neighborhood structures where two vertices' neighborhoods overlap. Moreover, although previous network-based fake review studies (e.g., metapath-based ones) connect reviewers through feature nodes across heterogeneous networks, they ignored the multi-view nature of reviewers. A view is derived from a single type of proximity or relationship between the nodes, characterized by a set of edges; in other words, reviewers can form different networks with respect to different relationships. 2) Previous network-based fake review studies did not combine the text embeddings of reviews with reviewer embeddings.

To address the first gap, we generated reviewer embeddings via MVE (Qu et al., 2017), a framework for multi-view network representation learning, and conducted spammer classification experiments to examine the effectiveness of the learned embeddings for distinguishing spammers from non-spammers. In addition, we performed unsupervised hierarchical clustering to observe the clusters of the reviewer embeddings. Our results show that clusters generated from reviewer embeddings capture the difference between spammers and non-spammers better than those generated from reviewers' features.

To fill the second gap, we proposed hybrid embeddings that combine review text embeddings with reviewer embeddings (i.e., vectors that represent a reviewer's characteristics, such as writing or behavioral patterns). We conducted fake review classification experiments to compare the performance of hybrid (text+reviewer) embeddings against text-only embeddings as features. Our results suggest that hybrid embeddings are more effective than text-only embeddings for fake review identification. Moreover, we compared the prediction performance of the hybrid embeddings with baselines and showed that our approach outperformed them on fake review identification experiments.

The contributions of this study are four-fold: 1) We adopted a multi-view representation learning approach for reviewer embedding learning and analyzed the efficacy of the embeddings for spammer classification and fake review classification. 2) We proposed a hybrid embedding that considers the characteristics of both the review text and the reviewer; our results are promising and suggest that hybrid embeddings are very effective for fake review identification. 3) We proposed a heuristic network construction approach that builds a user network based on user features. 4) We evaluated how different spammer thresholds impact the performance of fake review classification. Several studies have used the same datasets as this study, but most followed the spammer definition of Jindal and Liu (2008). We argue that the spammer definition should be configurable for different datasets; our findings show that by carefully choosing the spammer thresholds for the target datasets, hybrid embeddings achieve higher efficacy for fake review classification.
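As a concrete illustration of the hybrid-embedding idea above, the sketch below concatenates a review's text embedding with its author's reviewer embedding and trains a linear classifier. The random stand-in embeddings, dimensions, and classifier choice are assumptions, not the study's actual setup.

```python
# Illustrative sketch of hybrid embeddings for fake review classification:
# concatenate text and reviewer vectors, then fit a simple classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_reviews, d_text, d_reviewer = 1000, 64, 32
text_emb = rng.normal(size=(n_reviews, d_text))          # e.g., review text encoder output
reviewer_emb = rng.normal(size=(n_reviews, d_reviewer))  # e.g., MVE multi-view output
y = rng.integers(0, 2, size=n_reviews)                   # 1 = fake, 0 = genuine (toy labels)

hybrid = np.concatenate([text_emb, reviewer_emb], axis=1)
clf = LogisticRegression(max_iter=1000).fit(hybrid[:800], y[:800])
print("held-out accuracy:", clf.score(hybrid[800:], y[800:]))
```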
|
54 |
Detekce střihů a vyhledávání známých scén ve videu s pomocí metod hlubokého učení / Shot Detection and Known-Item Search in Video Using Deep Learning Methods. Tomáš Souček, January 2020.
Video retrieval represents a challenging problem with many caveats and sub-problems. This thesis focuses on two of these sub-problems, namely shot transition detection and text-based search. For shot detection, many solutions have been proposed over the last decades. Recently, deep learning-based approaches improved the accuracy of shot transition detection using 3D convolutional architectures and artificially created training data, but one hundred percent accuracy is still an unreachable ideal. In this thesis, we present TransNet V2, a deep network for shot transition detection that reaches state-of-the-art performance on respected benchmarks. For text-based search, deep learning models projecting textual queries and video frames into a joint space have proved effective for text-based video retrieval. We investigate these query representation learning models in the setting of known-item search and propose improvements to the text-encoding part of the model.
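The joint-space retrieval described above can be sketched as follows; the linear projections standing in for learned text and frame encoders, and all dimensions, are illustrative assumptions.

```python
# Minimal sketch of text-based video retrieval in a joint space: a text query
# and video frames are projected into the same space and ranked by similarity.
import numpy as np

rng = np.random.default_rng(1)
d_text, d_frame, d_joint = 300, 512, 128
W_text = rng.normal(size=(d_text, d_joint))    # stand-in for a learned text projection
W_frame = rng.normal(size=(d_frame, d_joint))  # stand-in for a learned frame projection

def embed(x, W):
    """Project into the joint space and L2-normalize for cosine similarity."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

query = embed(rng.normal(size=d_text), W_text)               # encoded textual query
frames = embed(rng.normal(size=(10000, d_frame)), W_frame)   # encoded video frames
ranking = np.argsort(-frames @ query)                        # best-matching frames first
print("top-5 frame ids:", ranking[:5])
```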
|
55 |
Apprentissage de représentations pour la prédiction de propagation d'information dans les réseaux sociaux / Representation Learning for Information Diffusion Prediction in Social Networks. Simon Bourigault, 10 November 2016.
In this thesis, we study information diffusion in online social networks. Websites such as Facebook and Twitter have become information media in their own right, on which users exchange large quantities of data. Most existing models of this diffusion phenomenon are generative models, based on strong hypotheses about the structure and temporal dynamics of diffusion. We consider the problem of diffusion prediction in the case where the social graph is unknown and only user actions can be observed. First, we propose a learning method for the independent cascade model that deliberately ignores the temporal dimension of diffusion; experimental results on real-world data show that this approach yields a more effective and more robust model than time-based learning schemes. Second, we propose several diffusion prediction methods based on representation learning techniques, which allow us to define models that are more compact, faster, and more robust to data sparsity. Finally, we apply a similar representation learning approach to the source detection task, which consists of finding the user who started a rumor on a social network; there, our model is much faster and more effective than graph-based state-of-the-art approaches.
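One way to picture the representation-based diffusion prediction above is the following sketch, in which the probability that a cascade reaches a user decreases with distance in a learned latent space. The random embeddings and the Gaussian kernel are assumptions, not the thesis's exact model.

```python
# Hedged sketch of representation-based diffusion prediction: each user gets a
# latent position; reachability falls off with embedding distance.
import numpy as np

rng = np.random.default_rng(2)
n_users, dim = 100, 8
Z = rng.normal(size=(n_users, dim))   # learned user positions (random stand-ins)

def infection_prob(u, v):
    """P(v infected | source u) as a decreasing function of latent distance."""
    return np.exp(-np.linalg.norm(Z[u] - Z[v]) ** 2)

source = 0
probs = np.array([infection_prob(source, v) for v in range(n_users)])
probs[source] = -np.inf               # exclude the source itself from the ranking
predicted = np.argsort(-probs)[:10]   # 10 users most likely to relay the item
print(predicted)
```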
|
56 |
Vers l'universalité des représentations visuelle et multimodales / On the Universality of Visual and Multimodal Representations. Youssef Tamaazousti, 01 June 2018.
Because of its key societal, economic, and cultural stakes, Artificial Intelligence (AI) is a hot topic today. One of its main goals is to develop systems that facilitate the daily life of humans, through applications such as household robots, industrial robots, autonomous vehicles, and much more. The rise of AI owes much to the emergence of tools based on deep neural networks, which make it possible to simultaneously learn the representation of the data (traditionally hand-crafted) and the task to solve (traditionally learned with statistical machine-learning models). This has resulted from the conjunction of theoretical advances, growing computational capacity, and the availability of many annotated datasets. A long-standing goal of AI is to design human-inspired machines capable of perceiving the world and interacting with humans in an evolutionary way, that is, constantly improving their capacity to perceive the world and interact with humans. While AI is a much broader field, this thesis focuses only on learning-based AI (one of the best-performing approaches to date), in which a learned model solves a given task and is generally composed of two sub-modules: one representing the data (the "representation") and one making decisions (the "task solver").

We categorize, in this thesis, work around AI into the two following learning approaches: (i) specialization: learning representations from a few specific tasks with the goal of carrying out very specific tasks (specialized in a certain field) with a very good level of performance; (ii) universality: learning representations from several general tasks with the goal of performing as many tasks as possible in different contexts. While specialization has been extensively explored by the deep-learning community, only a few implicit attempts have been made towards universality. Thus, the goal of this thesis is to explicitly address the problem of improving the universality of representations with deep-learning methods, for image and text data. We address this topic of universality in two forms: through methods that improve universality ("universalizing methods"), and through a protocol to quantify universality. Concerning universalizing methods, we propose three technical contributions: (i) in the context of large semantic representations, a method that reduces redundancy between detectors through adaptive thresholding and the relations between concepts; (ii) in the context of neural-network representations, an approach that increases the number of detectors without increasing the amount of annotated data; (iii) in the context of multimodal representations, a method that preserves the semantics of unimodal representations within multimodal ones. Regarding the quantification of universality, we propose to evaluate universalizing methods in a transfer-learning scheme, which is well suited to assessing the universal ability of representations. This also leads us to propose a new framework as well as new quantitative evaluation criteria for universalizing methods.
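The transfer-learning protocol for quantifying universality can be sketched as follows: freeze the representation, train a cheap linear probe on each target task, and aggregate the scores. The synthetic tasks, probe choice, and mean aggregation are assumptions for illustration, not the thesis's exact criteria.

```python
# Sketch of a transfer-learning protocol for scoring the universality of a
# frozen representation across several target tasks.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def probe_score(features, labels):
    """Accuracy of a linear classifier trained on frozen features."""
    n = len(labels); split = int(0.8 * n)
    clf = LogisticRegression(max_iter=1000).fit(features[:split], labels[:split])
    return clf.score(features[split:], labels[split:])

def universality(encode, tasks):
    """Mean transfer score of one representation over several target tasks."""
    return np.mean([probe_score(encode(X), y) for X, y in tasks])

# Synthetic stand-ins: three tasks with 2, 3, and 5 classes.
tasks = [(rng.normal(size=(500, 32)), rng.integers(0, k, 500)) for k in (2, 3, 5)]
identity = lambda X: X                     # stand-in for a frozen encoder
print("universality score:", universality(identity, tasks))
```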
|
57 |
Fast-NetMF: Graph Embedding Generation on Single GPU and Multi-core CPUs with NetMF. Saravanakumar Shanmugam Sakthivadivel, 24 October 2019.
No description available.
|
58 |
Domain Adaptation Applications to Complex High-dimensional Target Data. Marija Stanojevic (ORCID 0000-0001-8227-6577), January 2023.
In the last decade, machine learning models have grown in size and in the amount of data they use, which has led to improved performance on many tasks. Most notably, there has been significant development in end-to-end deep learning and reinforcement learning models, with new learning algorithms and architectures proposed frequently. Furthermore, while previous methods focused on supervised learning, in the last five years many models have been proposed that learn in semi-supervised or self-supervised ways and are then fine-tuned to a specific task or a different data domain. Adapting machine learning models learned on one type of data to similar but different data is called domain adaptation. This thesis discusses various challenges in the domain adaptation of machine learning models to specific tasks and real-world applications, and proposes solutions for those challenges.
Data in real-world applications have different properties than the clean machine-learning datasets commonly used for the experimental evaluation of proposed models. Learning appropriate representations from high-dimensional, complex data with internal dependencies is arduous due to the curse of dimensionality and spurious correlations. Most real-world data have these properties and, in addition, only a small number of labeled samples, since labeling is expensive and tedious. Accuracy also drops drastically when models are applied to domain-specific datasets and unbalanced problems. Moreover, state-of-the-art models are not able to handle missing data. In this thesis, I strive to create frameworks that can learn good representations of high-dimensional small data with correlations between variables.
The first chapter of this thesis describes the motivation, background, and research objectives, and gives an overview of contributions and publications. The background needed to understand this thesis is provided in the second chapter, and an introduction to domain adaptation is given in chapter three. The fourth chapter discusses domain adaptation with small target data: it describes an algorithm for semi-supervised learning over domain-specific short texts such as reviews or tweets, and the proposed framework achieves up to 12.6% improvement when only 5,000 labeled examples are available. The fifth chapter explores the influence of unanticipated bias in fine-tuning data. It outlines how bias in news data influences the classification performance of domain-specific text, where the domain is U.S. politics, and shows that fine-tuning with domain-specific data is not always beneficial, especially if bias towards one label is present. The sixth chapter examines domain adaptation on datasets with high missing rates. It reviews a system created to learn from high-dimensional small data from psychological studies, which have up to 70% missingness; the proposed framework achieves 9.3% lower imputation error and 33% lower prediction error. The seventh chapter addresses the curse-of-dimensionality problem in domain adaptation. It presents a methodology for discovering research articles containing evolutionary timetrees: a system that can search for, download, and filter research articles in which timetrees are reported, scanning 5 million articles in a few days. The proposed method also decreases the error of finding relevant research papers by 21% compared to the baseline, which cannot handle high-dimensional data properly. The last, eighth chapter summarizes the findings of this thesis and suggests future directions. / Computer and Information Science
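As an illustration of learning from data with high missingness (chapter six), here is a hedged sketch, not the thesis framework: a small denoising autoencoder is trained to reconstruct a table using a loss computed only on observed cells, then used to fill the missing ones. The synthetic data, 70% missing rate, and architecture are assumptions.

```python
# Illustrative sketch: impute high-missingness tabular data with an
# autoencoder trained only on the observed entries.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 20)                      # complete data (for simulation only)
observed = torch.rand_like(X) > 0.7           # keep ~30%: a 70% missing rate
X_in = torch.where(observed, X, torch.zeros_like(X))

ae = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 20))
opt = torch.optim.Adam(ae.parameters(), lr=1e-2)
for _ in range(500):
    recon = ae(X_in)
    loss = ((recon - X)[observed] ** 2).mean()   # loss only on observed cells
    opt.zero_grad(); loss.backward(); opt.step()

X_imputed = torch.where(observed, X, ae(X_in))   # fill the missing cells
print("imputation MSE:", ((X_imputed - X)[~observed] ** 2).mean().item())
```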
|
59 |
Out of Distribution Representation Learning for Network System Forecasting. Jianfei Gao (15208960), 12 April 2023.
Representation learning algorithms, as the cutting edge of modern AI, have shown their ability to automatically solve complex tasks in diverse fields including computer vision, speech recognition, autonomous driving, and biology. Unsurprisingly, representation learning applications in computer networking domains, such as network management, video streaming, and traffic forecasting, have enjoyed increasing interest in recent years. However, the success of representation learning algorithms rests on consistency between the training and test data distributions, which cannot be guaranteed in some scenarios due to resource limitations, privacy, or other infrastructure reasons. Under such distribution shift, representation learning algorithms must apply tuned models to environments whose data distributions differ substantially from those seen during training. This issue is known as Out-Of-Distribution (OOD) generalization and is still an open topic in machine learning. In this dissertation, I present solutions for OOD cases found in cloud services that will help improve user experience. First, I implement Infinity SGD, which can extrapolate from light-load server logs to predict server performance under heavy load. Infinity SGD bridges light-load and heavy-load server status by modeling server status under different loads with a unified Continuous Time Markov Chain (CTMC) with shared parameters. I show that Infinity SGD can perform extrapolations that no prior work can, on both a real-world testbed and synthetic experiments. Next, I propose Veritas, a framework to answer what the user experience would have been if a different ABR (a kind of video streaming data transfer algorithm) had been used with the same server, client, and connection status. Veritas strictly follows a Structural Causal Model (SCM), which guarantees its power to answer what-if counterfactual and interventional questions for video streaming. I show that Veritas can accurately answer what-if questions on real-world emulations where no existing work can. Finally, I propose time-then-graph, a provably more expressive temporal graph neural network (TGNN) framework than prior work. We empirically show that time-then-graph is a more efficient and accurate framework for forecasting traffic on network data, which serves as essential input for Infinity SGD. In parallel with this dissertation, I formalize knowledge graphs (KGs) as doubly exchangeable attributed graphs and propose a doubly exchangeable representation blueprint based on this formalization, enabling complex logical reasoning tasks that no prior work supports. This work may also find potential traffic classification applications in the networking field.
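The time-then-graph idea can be sketched as follows: each node's feature time series is first summarized by an RNN, and a graph layer then aggregates the summaries for forecasting. The toy graph, shapes, and single-layer design are assumptions, not the dissertation's architecture.

```python
# Hedged sketch of time-then-graph: "time" stage (per-node GRU) followed by a
# "graph" stage (mean aggregation of neighbor summaries) for traffic forecasting.
import torch
import torch.nn as nn

n_nodes, T, d_in, d_hid = 6, 12, 4, 16
x = torch.randn(n_nodes, T, d_in)            # per-node traffic time series
A = (torch.rand(n_nodes, n_nodes) > 0.5).float()
A = ((A + A.T + torch.eye(n_nodes)) > 0).float()   # symmetric, with self-loops

rnn = nn.GRU(d_in, d_hid, batch_first=True)  # "time" stage
W = nn.Linear(d_hid, 1)                      # "graph" stage readout

_, h = rnn(x)                                # h: (1, n_nodes, d_hid)
h = h.squeeze(0)
deg_inv = torch.diag(1.0 / A.sum(dim=1))
h_graph = deg_inv @ A @ h                    # mean-aggregate neighbor summaries
forecast = W(h_graph)                        # next-step traffic per node
print(forecast.shape)                        # torch.Size([6, 1])
```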
|
60 |
Network Representation Theory in Materials Science and Global Value Chain Analysis. Mats C. Haneberg, 07 April 2023.
This thesis is divided into two distinct chapters. In the first chapter, we apply network representation learning to the field of materials science to predict properties of aluminum grain boundaries and to locate the most influential atoms and subgraphs within each grain boundary. We create fixed-length representations of the aluminum grain boundaries that successfully capture grain boundary structure and allow us to accurately predict grain boundary energy. We do this through two distinct methods: the first is a graph convolutional neural network, a semi-supervised deep learning algorithm, and the second is graph2vec, an unsupervised representation learning algorithm. The second chapter presents our dynamic global value chain network, the combination of the dynamic global supply chain network and the dynamic global strategic alliance network. Our global value chain network provides a level of scope and accessibility not found in any other global value chain network, commercial or academic. Through network-theoretic analysis, we identify business applications that would increase the robustness and resilience of the global value chain. We accomplish this through an analysis of the static structure, dynamics, and community structure of our global value chain network.
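To make the first chapter's pipeline concrete, the sketch below maps each atomic graph to a fixed-length vector and regresses a graph-level property. The thesis used a graph convolutional network and graph2vec; this sketch substitutes a simple degree-histogram descriptor and fully synthetic graphs and energies, all of which are assumptions for illustration.

```python
# Minimal sketch of the grain-boundary pipeline: fixed-length graph embedding,
# then regression of a graph-level property (e.g., boundary energy).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(4)

def degree_histogram_embedding(A, max_deg=10):
    """Fixed-length graph descriptor from the node-degree distribution."""
    degrees = A.sum(axis=1).astype(int)
    return np.bincount(np.clip(degrees, 0, max_deg), minlength=max_deg + 1)

graphs = []
for _ in range(200):                       # synthetic stand-ins for boundary graphs
    n = rng.integers(20, 40)
    A = (rng.random((n, n)) < 0.2).astype(int)
    A = np.triu(A, 1); A = A + A.T         # undirected, no self-loops
    graphs.append(A)
X = np.array([degree_histogram_embedding(A) for A in graphs])
y = X @ rng.normal(size=X.shape[1]) + rng.normal(scale=0.1, size=len(X))  # toy energies

model = Ridge().fit(X[:160], y[:160])
print("held-out R^2:", model.score(X[160:], y[160:]))
```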
|