11

Interpreting embedding models of knowledge bases. / Interpretando modelos de embedding de bases de conhecimento.

Gusmão, Arthur Colombini 26 November 2018
Knowledge bases are employed in a variety of applications, from natural language processing to semantic web search; in practice, however, their usefulness is hurt by their incompleteness. To address this issue, several techniques aim at performing knowledge base completion; among these, embedding models are efficient, attain state-of-the-art accuracy, and eliminate the need for feature engineering. However, the predictions of embedding models are notoriously hard to interpret. In this work, we propose model-agnostic methods that allow one to interpret embedding models by extracting weighted Horn rules from them. More specifically, we show how the so-called "pedagogical techniques" from the literature on neural networks can be adapted to take into account the large-scale relational aspects of knowledge bases, and we show experimentally their strengths and weaknesses.
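As a rough illustration of what a weighted Horn rule over a knowledge base looks like, the sketch below scores a rule of the form `body(x, y) => head(x, y)` by its confidence over a set of triples. The triples, the relation names, and the `rule_confidence` helper are all hypothetical. In the pedagogical setting the abstract describes, the triples would be predictions of an embedding model treated as a black box; this is a simplified stand-in, not the method of the thesis.

```python
from collections import defaultdict

def rule_confidence(triples, body, head):
    """Confidence of the Horn rule body(x, y) => head(x, y).

    Confidence = (# entity pairs satisfying body and head) / (# pairs satisfying body).
    """
    pairs = defaultdict(set)
    for subject, relation, obj in triples:
        pairs[relation].add((subject, obj))
    body_pairs = pairs[body]
    if not body_pairs:
        return 0.0  # the rule never fires, so it gets no weight
    return len(body_pairs & pairs[head]) / len(body_pairs)

# Hypothetical triples, standing in for an embedding model's predictions.
predicted = [
    ("ana", "born_in", "lisbon"),
    ("ana", "lives_in", "lisbon"),
    ("bo", "born_in", "oslo"),
    ("bo", "lives_in", "paris"),
]

# Weight of the rule born_in(x, y) => lives_in(x, y).
weight = rule_confidence(predicted, body="born_in", head="lives_in")
```

The weight here is simply the fraction of body matches for which the head also holds; longer rule bodies (paths of several relations) would need a join over intermediate entities, which this sketch leaves out.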
12

Search Engine Optimization and the connection with Knowledge Graphs

Marshall, Oliver January 2021
Aim: The aim of this study is to analyze the usage of Search Engine Optimization (SEO) and Knowledge Graphs, and the connection between them, in achieving profitable business visibility and reach. Methods: Following a qualitative method with an inductive approach, ten marketing professionals were interviewed via an online questionnaire. Both primary and secondary data were used to conduct this study. Scientific theory and empirical findings were linked and discussed in the analysis chapter. Findings: This study establishes how businesses currently use Search Engine Optimization, in terms of common techniques and methods. We demonstrate their effectiveness on the Google Knowledge Graph and Google My Business, and the resulting positive business impact in the form of increased visibility and reach. Difficulties remain in establishing accurate tracking procedures to analyze quantifiable results. Contribution of the thesis: This study contributes to the literature on both Search Engine Optimization and Knowledge Graphs by providing a new perspective on how these subjects have been utilized in modern marketing. In addition, this study provides an understanding of the benefits of SEO utilization on Knowledge Graphs. Suggestions for further research: We suggest more extensive investigation of the elements and utilization of Knowledge Graphs: how the structure can be affected, which techniques are most effective on a bigger scale, and how effectively the benefits can be measured. Key Words: Search Engine, Search Engine Optimization, SEO, Knowledge Graphs, Google My Business, Google Search Engine, Online Marketing.
13

Using Semantic Data for Penetration Testing : A Study on Utilizing Knowledge Graphs for Offensive Cybersecurity / Användning av Semantisk Teknologi för Sårbarhetstestning : En Studie för att Applicera Kunskapsgrafer för Offensiv Cybersäkerhet

Wei, Björn January 2022
Cybersecurity is an expanding and prominent field in the IT industry. As the number of vulnerabilities and breaches continues to increase, there is a need to properly test systems for internal weaknesses in order to keep intruders out proactively. Penetration testing is the act of emulating an adversary in order to test a system's behaviour. However, due to the number of possible vulnerabilities and attack methods that exist, efficiently choosing a viable weakness to test, or selecting an adequate attack method, becomes a cumbersome task for the penetration tester. The main objective of this thesis is to explore and show how the semantic data concept of Knowledge Graphs can assist a penetration tester during decision-making and vulnerability analysis, for example by providing insight into the attacks a system could experience based on a set of discovered vulnerabilities, and by emulating these attacks in order to test the system. Additionally, design aspects for developing a Knowledge Graph based penetration testing system are presented, and challenges and complications for the combined fields are discussed. In this work, three design proposals are made based on inspiration from Knowledge Graph standards and related work. A prototype is also created, based on OWASP ZAP, a penetration testing tool for web applications, which is then connected to a vulnerability database in order to gain access to various cybersecurity-related data, such as attack descriptions for specific types of vulnerabilities. The analysis of the implemented prototype illustrates that Knowledge Graphs display potential for improving data extracted from a vulnerability scan. By connecting a Knowledge Graph to a vulnerability database, penetration testers can extract information and receive suggestions of attacks, reducing their cognitive burden. The drawbacks of this work's prototype indicate that in order for a Knowledge Graph penetration testing system to work, the method of extracting information needs to be interfaced in a more user-friendly manner. Additionally, the reliance on specific standardizations creates the need to develop several integration modules.
14

Identifying and Minimizing Underspecification in Breast Cancer Subtyping

Tang, Jonathan Cheuk-Kiu 01 December 2022
In the realm of biomedical technology, both accuracy and consistency are crucial to the development and deployment of its tools. While accuracy is easy to measure, consistency is not, especially in biomedicine, where prediction consistency can be difficult to achieve. Typically, biomedical datasets contain far more features than samples, which goes against ordinary data mining practice. As a result, predictive models may fail to find valid pathways for prediction during training on such datasets. This phenomenon is known as underspecification. Underspecification has become more widely accepted as a concept in recent years, with a handful of recent works exploring it in different applications and a handful of earlier works encountering it before it was named. However, underspecification is still under-addressed, to the point where some academics might even claim that it is not a significant problem. With this in mind, this thesis aims to identify and minimize underspecification in deep learning cancer subtype predictors. To address these goals, this work details the development of the Predicting Underspecification Monitoring Pipeline (PUMP), a software tool that provides a methodology for data analysis, stress testing, and model evaluation. The hope is that PUMP can be applied to deep learning training so that users can ensure that their models generalize to new data as well as possible.
15

Getting Graphical with Knowledge Graphs : A proof-of-concept for extending and modifying knowledge graphs

Granberg, Roberth; Hellman, Anton January 2022
Knowledge Graphs (KGs) are an emerging topic of research. The promise of KGs is to turn data into knowledge by supplying the data with context at the source. This could in turn allow machines to make sense of data by inference: looking at the context of the data and deriving knowledge from its context and relations, thus allowing for new ways of finding value in the sea of data that the world produces today. Working with KGs today involves many steps that are open to simplification and improvement, especially with regard to usability. In this thesis, we aimed to design and produce an application that can be used to modify, extend and build KGs. The work uses the front-end library VueJS, the Scalable Vector Graphics (SVG) library D3 and the graph database Stardog. The project used Scrum methodology to distribute and plan the work, which took place over a span of six months with two developers working half-time (20 hours/week). The result of the project is a working application that can be used by developers within the KG domain who want to test and modify their graphs visually.
16

Multimodal Representation Learning for Textual Reasoning over Knowledge Graphs

Choudhary, Nurendra 18 May 2023
Knowledge graphs (KGs) store relational information in a flexible triplet schema and have become ubiquitous for information storage in domains such as web search, e-commerce, social networks, and biology. Retrieval of information from KGs is generally achieved through logical reasoning, but this process can be computationally expensive and has limited performance due to the large size and complexity of relationships within the KGs. Furthermore, to extend the usage of KGs to non-expert users, retrieval over them cannot rely solely on logical reasoning but also needs to consider text-based search. This creates a need for multi-modal representations that capture both the semantic and structural features of the KGs. The primary objective of the proposed work is to extend the accessibility of KGs to non-expert users and institutions by enabling them to use non-technical textual queries to search over the vast amount of information stored in KGs. To achieve this objective, the research aims to address four limitations. It seeks to (i) develop a framework for logical reasoning over KGs that can learn representations to capture hierarchical dependencies between entities, (ii) design an architecture that can effectively learn the logic flow of queries from natural language text, (iii) create a multi-modal architecture that can capture inherent semantic and structural features from the entities and KGs, respectively, and (iv) introduce a novel hyperbolic learning framework to enable the scalability of hyperbolic neural networks over large graphs using meta-learning. The proposed work is distinct from current research because it models the logical flow of textual queries in hyperbolic space and uses it to perform complex reasoning over large KGs. The models developed in this work are evaluated on both the standard research setting of logical reasoning and real-world scenarios of query matching and search, specifically in the e-commerce domain. 
In summary, the proposed work aims to extend the accessibility of KGs to non-expert users by enabling them to use non-technical textual queries to search vast amounts of information stored in KGs. To achieve this objective, the work proposes the use of multi-modal representations that capture both semantic and structural features from the KGs, and a novel hyperbolic learning framework to enable scalability of hyperbolic neural networks over large graphs. The work also models the logical flow of textual queries in hyperbolic space to perform complex reasoning over large KGs. The models developed in this work are evaluated on both the standard research setting of logical reasoning and real-world scenarios in the e-commerce domain. / Doctor of Philosophy / Knowledge graphs (KGs) are databases that store information in a way that allows computers to easily identify relationships between different pieces of data. They are widely used in domains such as web search, e-commerce, social networks, and biology. However, retrieving information from KGs can be computationally expensive, and relying solely on logical reasoning can limit their accessibility to non-expert users. This is where the proposed work comes in. The primary objective is to make KGs more accessible to non-experts by enabling them to use natural language queries to search the vast amounts of information stored in KGs. To achieve this objective, the research aims to address four limitations. Firstly, a framework for logical reasoning over KGs that can learn representations to capture hierarchical dependencies between entities is developed. Secondly, an architecture is designed that can effectively learn the logic flow of queries from natural language text. Thirdly, a multi-modal architecture is created that can capture inherent semantic and structural features from the entities and KGs, respectively. 
Finally, a novel hyperbolic learning framework is introduced to enable the scalability of hyperbolic neural networks over large graphs using meta-learning. The proposed work is unique because it models the logical flow of textual queries in hyperbolic space and uses it to perform complex reasoning over large KGs. The models developed in this work are evaluated on both the standard research setting of logical reasoning and real-world scenarios of query matching and search, specifically in the e-commerce domain. In summary, the proposed work aims to make KGs more accessible to non-experts by enabling them to use natural language queries to search vast amounts of information stored in KGs. To achieve this objective, the work proposes the use of multi-modal representations that capture both semantic and structural features from the KGs, and a novel hyperbolic learning framework to enable scalability of hyperbolic neural networks over large graphs. The work also models the logical flow of textual queries in hyperbolic space to perform complex reasoning over large KGs. The results of this work have significant implications for the field of information retrieval, as they provide a more efficient and accessible way to retrieve information from KGs. Additionally, the multi-modal approach taken in this work has potential applications in other areas of machine learning, such as image recognition and natural language processing. The work also contributes to the development of hyperbolic geometry as a tool for modeling complex networks, which has implications for fields such as network science and social network analysis. Overall, this work represents an important step towards making the vast amounts of information stored in KGs more accessible and useful to a wider audience.
17

CONNECTING THE DOTS : Exploring gene contexts through knowledge-graph representations of gene-information derived from scientific literature

Hellberg, Henrietta January 2023
Analyzing the data produced by next-generation sequencing technologies relies on access to information synthesized from previous research findings. The volume of data available in the literature is growing rapidly, and it is becoming increasingly necessary for researchers to use AI or other statistics-based approaches in the analysis of their datasets. In this project, knowledge graphs are explored as a tool for providing access to contextual gene information available in scientific literature. The explorative method described in this thesis is based on the implementation and comparison of two approaches for knowledge graph construction, a rule-based statistical approach and a neural-network and co-occurrence based approach, both based on specific literature contexts. The results are presented both as a quantitative comparison between the approaches and as a qualitative expert evaluation of the quantitative result. The quantitative comparison suggested that contrasting knowledge graphs constructed with different approaches can provide valuable information for the interpretation and contextualization of key genes. It also demonstrated the limitations of some approaches, for example in terms of scalability and of the volume and type of information that can be extracted. The result further suggested that metrics based on the overlap of nodes and edges, as well as metrics that leverage the global topology of graphs, are valuable for representing and comparing contextual information between knowledge graphs. The qualitative expert evaluation demonstrated that literature-derived knowledge graphs of gene information can be valuable tools for identifying research biases related to genes, and it also shed light on the challenges related to biological entity normalization in the context of knowledge graph development. 
In light of these findings, automatic knowledge-graph construction appears to be a promising approach for improving access to contextual information about genes in scientific literature.
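The node- and edge-overlap metrics mentioned in this abstract can be sketched with a Jaccard index over node sets and edge sets. This is a minimal sketch under stated assumptions: the abstract does not say which overlap measure was used, so Jaccard is a swapped-in choice, edges are treated as undirected and unlabeled, and the gene identifiers `G1` to `G4` and the helper names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def graph_overlap(edges_a, edges_b):
    """Node and edge overlap between two graphs given as lists of node pairs."""
    # Undirected edges: (u, v) and (v, u) are the same edge.
    undirected_a = {frozenset(edge) for edge in edges_a}
    undirected_b = {frozenset(edge) for edge in edges_b}
    nodes_a = {node for edge in edges_a for node in edge}
    nodes_b = {node for edge in edges_b for node in edge}
    return {
        "nodes": jaccard(nodes_a, nodes_b),
        "edges": jaccard(undirected_a, undirected_b),
    }

# Hypothetical graphs, e.g. one per construction approach.
rule_based = [("G1", "G2"), ("G1", "G3")]
neural = [("G2", "G1"), ("G3", "G4")]

overlap = graph_overlap(rule_based, neural)
```

In this toy case the two graphs share three of four nodes but only one of three distinct edges, which shows how node and edge overlap can diverge. Metrics based on global topology, which the abstract also mentions, are not covered here.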
18

A Cross-domain and Cross-language Knowledge-based Representation of Text and its Meaning

Franco Salvador, Marc 03 July 2017
Natural Language Processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human languages. One of its most challenging aspects involves enabling computers to derive meaning from human natural language. To do so, several meaning or context representations have been proposed with competitive performance. However, these representations still have room for improvement when working in a cross-domain or cross-language scenario. In this thesis we study the use of knowledge graphs as a cross-domain and cross-language representation of text and its meaning. A knowledge graph is a graph that expands and relates the original concepts belonging to a set of words. We obtain its characteristics using a wide-coverage multilingual semantic network as the knowledge base. This provides coverage of hundreds of languages and millions of general and domain-specific human concepts. As the starting point of our research we employ knowledge graph-based features, along with other traditional ones and meta-learning, for the NLP task of single- and cross-domain polarity classification. The analysis and conclusions of that work provide evidence that knowledge graphs capture meaning in a domain-independent way. The next part of our research takes advantage of the multilingual semantic network and focuses on cross-language Information Retrieval (IR) tasks. First, we propose a fully knowledge graph-based model of similarity analysis for cross-language plagiarism detection. Next, we improve that model to cover out-of-vocabulary words and verbal tenses and apply it to cross-language document retrieval, categorisation, and plagiarism detection. Finally, we study the use of knowledge graphs for the NLP tasks of community question answering, native language identification, and language variety identification. 
The contributions of this thesis demonstrate the potential of knowledge graphs as a cross-domain and cross-language representation of text and its meaning for NLP and IR tasks. These contributions have been published in several international conferences and journals. / Franco Salvador, M. (2017). A Cross-domain and Cross-language Knowledge-based Representation of Text and its Meaning [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/84285
19

On the use of knowledge graph embeddings for business expansion / Om användandet av kunskapsgrafinbäddningar för företagsexpansion

Rydberg, Niklas January 2022
The area of Knowledge Graphs has grown significantly in recent years and has found many different applications in both industrial and academic settings. Despite this, many large Knowledge Graphs are in fact incomplete, which leads to the problem of finding the missing facts in the graphs using Link Prediction. There are several ways of performing Link Prediction; the most common one to have emerged recently uses Machine Learning techniques to learn low-dimensional representations of the Knowledge Graph, called Knowledge Graph embeddings. This project explores whether this is a viable method for giving suggestions to companies that want to expand their businesses. In order to test this hypothesis, a Knowledge Graph was built using real company data from open sources. Different Knowledge Graph embedding models were then trained on the data in order to predict missing elements in the Knowledge Graph. The models were then compared to see which one is most suitable for this task and data set. The geometry-based models, a category that includes TransE, TransR and RotatE, were found to perform best on the specific data set used in this project. The results point to the method being a valid option for giving expansion suggestions to companies using a Knowledge Graph of other companies and their products. However, to be certain of this, further research is needed in which the method is implemented on a larger scale using more diverse data.
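To make the geometry-based idea concrete, the sketch below applies the TransE scoring function, which treats a relation as a translation in embedding space and rates a triple (h, r, t) by the negative distance between h + r and t. The two-dimensional embeddings, the company and product names, and the `rank_tails` helper are hypothetical and hand-picked for illustration; a real model would learn the vectors from the graph, and nothing here reproduces the thesis's data or results.

```python
import math

def transe_score(h, r, t):
    """TransE plausibility: negative L2 distance between h + r and t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Hypothetical, hand-picked embeddings (a trained model would learn these).
entities = {
    "acme": (0.0, 0.0),
    "widgets": (1.0, 0.0),
    "gadgets": (1.0, 0.25),
    "gizmos": (0.0, 1.0),
}
relations = {"sells": (1.0, 0.0)}

def rank_tails(head, relation, known_tails=()):
    """Rank candidate tail entities for (head, relation, ?), best first."""
    h, r = entities[head], relations[relation]
    candidates = [e for e in entities if e != head and e not in known_tails]
    return sorted(
        candidates,
        key=lambda e: transe_score(h, r, entities[e]),
        reverse=True,
    )

# Link prediction as suggestion: what else might "acme" plausibly sell?
suggestions = rank_tails("acme", "sells", known_tails={"widgets"})
```

Filtering out already-known tails before ranking mirrors the link-prediction framing of the abstract, where only missing facts are of interest. TransR and RotatE use different geometric operations (a relation-specific projection and a rotation, respectively) and are not shown.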
