Global ETD Search

31	An experimental analysis of Link Prediction methods over Microservices Knowledge Graphs Ruberto, Gianluca January 2023 Graphs are a powerful way to represent data. They can be seen as a collection of objects (nodes) and the relationships between them (edges or links). The value of this structure lies in the relationships between data points, which can provide even more information than the properties of the data themselves. An important type of graph is the knowledge graph, in which each node and edge has an associated type. Graph data is often incomplete, and in that case it is not possible to retrieve useful information. Link prediction, also known as knowledge graph completion, is the task of inferring whether edges or nodes are missing from a graph. Models of different types, including machine-learning-based, rule-based, and neural-network-based models, have been developed to address this problem. The goal of this research is to understand how link prediction methods perform in a real use-case scenario. To that end, multiple models were compared on different accuracy metrics and production requirements using a microservice tracing dataset. The models were trained and tested on two knowledge graphs obtained from the data, one that takes temporal information into account and one that does not. Moreover, the models' predictions were evaluated both in the way usual in the literature and in a way that mimics a real use-case scenario. The comparison showed that overly complex models cannot be used when time, in the training and/or inference phase, is critical. The best model for traditional prediction was RotatE, which usually scored twice as high as the second-best model. In the use-case scenario, RotatE was tied with QuatE, which required much more time for training and prediction. They scored 20% to 40% better than the third-best model, depending on the case. Moreover, most of the models required less than a millisecond to predict a triplet; NodePiece was the fastest, beating ConvE by a 4% margin. In training time, NodePiece beat AnyBURL by 40%. In memory usage, NodePiece was again the best, by at least an order of magnitude compared with most of the other models. RotatE was considered the best model overall because it had the best accuracy and above-average performance on the other requirements. Additionally, a simulation of the integration of RotatE with a dynamic sampling tracing tool was carried out, showing results similar to those previously obtained. Lastly, a thorough analysis of the results and suggestions for future work are presented. Knowledge Graphs Link Prediction Machine Learning Microservice Tracing Engineering and Technology
32	ALGEBRAIC METHODS FOR LINK PREDICTION IN VERY LARGE NETWORKS Coskun, Mustafa, Coskun 06 September 2017 No description available. Computer Science
33	Discovery and Analysis of Patterns in Molecular Networks: Link Prediction, Network Analysis, and Applications to Novel Drug Target Discovery Zhang, Minlu 20 April 2012 No description available. Computer Science network analysis link prediction transcriptional regulation orphan disease rare disease protein-protein interaction
34	Supervised Inference of Gene Regulatory Networks Sen, Malabika Ashit 09 September 2021 A gene regulatory network (GRN) records the interactions among transcription factors and their target genes. GRNs are useful for studying how transcription factors (TFs) control gene expression as cells transition between states during differentiation and development. Scientists usually construct GRNs by careful examination and study of the literature. This process is slow and painstaking and does not scale to large networks. In this thesis, we study the problem of inferring GRNs automatically from gene expression data. Recent data-driven approaches to inferring GRNs increasingly rely on single-cell RNA-sequencing (scRNA-seq) data. Most of these methods rely on unsupervised or association-based strategies, which by design cannot leverage known regulatory interactions. To facilitate supervised learning, we propose a novel graph convolutional neural network (GCN) based autoencoder to infer new regulatory edges from a known GRN and scRNA-seq data. As the name suggests, a GCN-based autoencoder consists of an encoder that learns a low-dimensional embedding of the nodes (genes) in the input graph (the GRN) through a series of graph convolution operations, and a decoder that aims to reconstruct the original graph as accurately as possible. We investigate several GCN-based architectures to determine the ideal encoder-decoder combination for GRN reconstruction. We systematically study the performance of these and other supervised learning methods on different mouse and human scRNA-seq datasets for two types of evaluation. We demonstrate that our GCN-based approach substantially outperforms traditional machine learning approaches. / Master of Science / In multi-cellular living organisms, stem cells differentiate into multiple cell types. Proteins called transcription factors (TFs) control the activity of genes to effect these transitions. 
It is possible to represent these interactions abstractly using a gene regulatory network (GRN). In a GRN, each node is a TF or a gene, and each edge connects a TF to a gene or TF that it controls. New high-throughput technologies that can measure gene expression (activity) in individual cells provide rich data that can be used to construct GRNs. In this thesis, we take advantage of recent advances in the field of machine learning to develop a new computational method for constructing GRNs. The distinguishing property of our technique is that it is supervised, i.e., it uses experimentally known interactions to infer new regulatory connections. We investigate several variations of this approach to reconstruct a GRN as close to the original network as possible. We analyze and provide a rationale for the decisions made in designing, evaluating, and choosing the characteristics of our predictor. We show that our predictor has a reconstruction accuracy that is superior to other supervised-learning approaches. Gene Regulatory Networks Network Inference Link Prediction Graph Convolutional Networks Graph Machine Learning
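The encoder-decoder idea described in the abstract above can be sketched in a few lines. This is only an illustrative forward pass of a generic graph autoencoder, not the thesis's architecture: the two-layer encoder, the inner-product decoder, the toy undirected graph, the random weights, and every function name are assumptions made for the example. A real GRN is directed (TF to target), so an actual model would likely need an asymmetric decoder and trained weights.

```python
import numpy as np

def normalize_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_encode(adj, features, w1, w2):
    """Two graph-convolution layers mapping node features to embeddings."""
    a_norm = normalize_adjacency(adj)
    hidden = np.maximum(a_norm @ features @ w1, 0.0)  # ReLU activation
    return a_norm @ hidden @ w2

def inner_product_decode(z):
    """Score every node pair with a sigmoid over embedding dot products."""
    return 1.0 / (1.0 + np.exp(-(z @ z.T)))

# Hypothetical toy input: 4 nodes in a path graph, 5 features per node.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = rng.normal(size=(4, 5))
w1 = rng.normal(size=(5, 8)) * 0.1  # untrained weights, for shape only
w2 = rng.normal(size=(8, 2)) * 0.1

z = gcn_encode(adj, features, w1, w2)   # low-dimensional node embeddings
scores = inner_product_decode(z)        # reconstructed edge probabilities
```

In training, the weights would be fitted so that `scores` is high for known edges and low for sampled non-edges; high-scoring pairs outside the known network are then the candidate new links.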
35	A Multimodal Graph Convolutional Approach to Predict Genes Associated with Rare Genetic Diseases Sahasrabudhe, Dhruva Shrikrishna 11 September 2020 There exist a large number of rare genetic diseases in humans. Our knowledge of the specific gene variants whose presence in the genome of a person predisposes them towards developing a disease, called gene associations, is incomplete. Computational tools that can predict genes which may be associated with a rare disease have great utility in healthcare. However, a majority of existing prediction algorithms require a set of already known "seed genes" to further discover novel associations for a disease. This drawback becomes more serious for rare genetic diseases, since a large proportion do not have any known gene associations. In this work, we develop an approach for disease-gene association prediction that overcomes the reliance on seed genes. Our approach uses the similarity of the observable biological characteristics of diseases (i.e., phenotypes) along with a global map of direct and indirect human protein interactions to transfer associations from diseases whose gene associations have been discovered to diseases with no known gene associations. We formulate disease-gene association prediction over a multimodal network of diseases and genes, and develop an approach based on graph convolutional networks. We show how our model design considerations impact prediction performance. We demonstrate that our approach outperforms simpler graph machine learning and traditional machine learning approaches, as well as a competitive network-propagation-based approach, for the task of predicting disease-gene associations. / Master of Science / There exist a large number of rare genetic diseases in humans. Our knowledge of the specific gene variants whose presence in the genome of a person predisposes them towards developing a disease, called gene associations, is incomplete. 
Computational tools that can predict genes which may be associated with a rare disease have great utility in healthcare. However, a majority of existing prediction algorithms require a set of already known "seed genes" to further discover novel associations for a disease. This drawback becomes more serious for rare genetic diseases, since a large proportion do not have any known gene associations. In this work, we develop an approach for disease-gene association prediction that overcomes the reliance on seed genes. Our approach uses the similarity of the observable biological characteristics of diseases (i.e., disease phenotypes) along with a global map of direct and indirect human protein interactions to transfer gene associations from diseases whose gene associations have been discovered to diseases with no known associations. We implement an approach based on the field of graph machine learning, namely graph convolutional networks, to predict the genes associated with rare genetic diseases. We show how our predictor performs compared to other approaches, and analyze some of the choices made in the design of the predictor, along with some properties of its outputs. Graph Machine Learning Disease Gene Prediction Graph Convolutional Networks Link Prediction Multimodal Networks
36	Analýza grafových dat pomocí metod hlubokého učení / Graph data analysis using deep learning methods Vancák, Vladislav January 2019 The goal of this thesis is to investigate the existing graph embedding methods. We aim to represent the nodes of undirected weighted graphs as low-dimensional vectors, also called embeddings, in order to create a representation suitable for various analytical tasks such as link prediction and clustering. We first introduce several contemporary approaches for creating such network embeddings. We then propose a set of modifications and improvements and assess the performance of the enhanced models. Finally, we present a set of evaluation metrics and use them to experimentally evaluate and compare the presented techniques on a series of tasks such as graph visualisation and graph reconstruction.
37	Learning representations in multi-relational graphs : algorithms and applications / Apprentissage de représentations en données multi-relationnelles : algorithmes et applications García Durán, Alberto 06 April 2016 The Internet puts a huge amount of information at hand, on such a variety of topics that everyone can now access almost any kind of knowledge. Such a large quantity of information could bring a leap forward in many areas (search engines, question answering, related NLP tasks) if used properly. A crucial challenge for the Artificial Intelligence community has therefore been to gather, organize, and make intelligent use of this growing amount of available knowledge. Fortunately, important efforts have been made for some time now in gathering and organizing knowledge, and a lot of structured information can be found in repositories called Knowledge Bases (KBs). Freebase, Facebook's Entity Graph, and Google's Knowledge Graph are good examples of KBs. A main issue with KBs is that they are far from complete. For example, in Freebase only about 30% of people have information about their nationality. This thesis proposes several methods to add new links between the existing entities of a KB, based on learning representations that optimize a defined energy function. These models can also be used to assign probabilities to triples extracted from the Web. We also propose a novel application that uses this structured information to generate unstructured information, specifically questions in natural language. We treat this problem as a machine translation task in which the input is not a proper language but a structured one, and we adapt the RNN encoder-decoder to this setting to make the translation possible. Relational learning Tensor factorization Embedding models Energy functions Link prediction Question generation Knowledge bases Deep learning
38	Prévision de liens dans des grands graphes de terrain (application aux réseaux bibliographiques) / Link Prediction in Large-scale Complex Networks (Application to bibliographical Networks) Pujari, Manisha 04 March 2015 In this work, we tackle the problem of link prediction in large complex networks. In particular, we explore topological dyadic approaches to link prediction. Different topological proximity measures have been studied in the scientific literature for estimating the probability that new links appear in a complex network. Supervised learning methods have also been used to combine these measures into predictive models. Supervised learning for link prediction is a difficult problem, notably because of heavy class imbalance. In this thesis, we explore alternative approaches to improve the performance of dyadic approaches for link prediction. We first propose a new approach to link prediction based on supervised rank aggregation, which uses concepts from computational social choice theory and is founded on supervised techniques for aggregating sorted lists (or preference aggregation). We also explore different ways of improving supervised link prediction approaches. One approach is to extend the set of attributes describing an example (a pair of nodes) with attributes calculated in a multiplex network that includes the target network. Multiplex networks have a layered structure, with each layer containing a different kind of link between the same set of nodes. The second is to use community information when sampling examples, to deal with the problem of class imbalance. Experiments conducted on real networks extracted from the well-known DBLP bibliographic database show the value of the proposed approaches. Complex networks Link prediction Supervised rank aggregation Multiplex network analysis
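As a rough illustration of the topological proximity measures the abstract above refers to, the sketch below computes three widely used dyadic scores (common neighbors, Jaccard coefficient, Adamic-Adar) for unlinked node pairs. The abstract does not say which measures the thesis uses, so the choice of measures, the toy graph, and all function names are assumptions for the example.

```python
import math
from itertools import combinations

def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    """Shared neighbors weighted by the inverse log of their degree."""
    return sum(1.0 / math.log(len(adj[w]))
               for w in adj[u] & adj[v] if len(adj[w]) > 1)

# Hypothetical toy network (e.g. co-authorship between five authors).
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)

# Candidate links are the node pairs that are not yet connected.
candidates = [(u, v) for u, v in combinations(sorted(adj), 2)
              if v not in adj[u]]

# Rank candidates by one measure; each measure yields its own ranking.
ranked = sorted(candidates, key=lambda p: adamic_adar(adj, *p), reverse=True)
```

Each measure gives one attribute per node pair and one ranked list of candidates. These are the kinds of inputs that a supervised classifier or a rank aggregation step, as described in the abstract, would then combine.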
39	Dynamické sociální sítě a jejich analýza / Dynamic Social Networks and their Analysis Hudeček, Ján January 2021 For a long time, there was little research on dynamic social networks. In recent years, however, the field has received much more attention, and many techniques for analyzing temporal aspects of social networks have been proposed. In this work, we studied a dynamic social network based on data retrieved from the Commercial Register. This registry contains information about all economic entities that operate in the Czech Republic, including the people who hold functions in those entities and their residential addresses. We applied several data analysis techniques, including community tracing, clustering, and methods for identifying key actors, to find important entities and individuals in the social network and to inspect how they change over time.
40	Predikce spojení v odvozených sociálních sítích / Link Prediction in Inferred Social Networks Měkota, Ondřej January 2021 Social networks can be helpful for analysing the behaviour of people. An existing social network is rarely available, and its nodes and edges have to be inferred from data that is not necessarily graph-structured. Link prediction can be used either to correct inaccuracies or to forecast links about to appear in the future. In this work, we study the prediction of missing links in a social network inferred from real-world bank data. We review and compare both verified and modern approaches to link prediction. Following the advancements of deep learning in recent years, we primarily focus on graph neural networks and their ability to scale to large networks. We propose an adjustment to an existing graph neural network method and show that its performance is comparable with or better than that of the original method. The comparison is performed on two social networks inferred from the same data. We show that it is relatively hard to outperform the verified link prediction methods with graph neural networks.

Search results