
Knowledge Graph Creation and Software Testing

Kyasa, Aishwarya January 2023 (has links)
Background: With the burgeoning volumes of data, efficient data transformation techniques are crucial. The RDF Mapping Language (RML) has been recognized as a conventional method, while the IKEA Knowledge Graph approach brings a new perspective with tailored functions and schema definitions. Objectives: This study aims to compare the efficiency and effectiveness of the RDF Mapping Language (RML) and IKEA Knowledge Graph (IKG) approaches in transforming JSON data into RDF format. It explores their performance across different complexity levels to provide insights into their strengths and limitations. Methods: We began our research by studying, through a literature review, how professionals in the industry currently transform JSON data into Resource Description Framework (RDF) formats. After gaining this understanding, we conducted practical experiments to compare the RML and IKG approaches at various complexity levels, assessing user-friendliness, adaptability, execution time, and overall performance. This combined approach aimed to connect theoretical knowledge with experimental data transformation practices. Results: The results demonstrate the superiority of the IKG approach, particularly in intricate scenarios involving conditional mapping and external graph data lookup, showcasing its versatility and efficiency in managing diverse data transformation tasks. Conclusions: Through practical experimentation and thorough analysis, this study concludes that the IKG approach demonstrates superior performance in handling complex data transformations compared to the RML approach. This research provides valuable insights for choosing an optimal data transformation approach based on the specific task complexities and requirements.
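The JSON-to-RDF transformation being compared can be illustrated with a minimal hand-rolled sketch. This is neither RML nor the IKG tooling, only the shape of the task; the record, namespace, and field names are invented:

```python
import json

BASE = "http://example.org/"  # assumed namespace, not from the thesis

# Toy JSON record to transform (hypothetical example).
record = json.loads('{"id": "p1", "name": "Desk", "category": "furniture"}')

def json_to_triples(obj):
    """Map a flat JSON object to (subject, predicate, object) triples.

    A real mapping (RML or IKG) also handles nesting, datatypes, and
    conditional rules; this only shows the core idea.
    """
    subject = f"<{BASE}{obj['id']}>"
    triples = []
    for key, value in obj.items():
        if key == "id":
            continue  # the id names the subject rather than a property
        triples.append((subject, f"<{BASE}{key}>", f'"{value}"'))
    return triples

triples = json_to_triples(record)
for s, p, o in triples:
    print(f"{s} {p} {o} .")  # N-Triples-style output
```

A conditional mapping, the scenario where the IKG approach reportedly excelled, would add per-field rules inside the loop instead of mapping every key uniformly.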

ON CONVOLUTIONAL NEURAL NETWORKS FOR KNOWLEDGE GRAPH EMBEDDING AND COMPLETION

Shen, Chen, 0000-0002-8465-6204 January 2020 (has links)
Data plays a key role in almost every field of computer science, including the knowledge graph field. The type of data varies across fields: in the knowledge graph field the data consists of knowledge triples, while computer vision works with visual data such as images and videos, and natural language processing with textual data such as articles and news. Data cannot be utilized directly by machine learning models, so data representation learning and feature design for various types of data are two critical tasks in many fields of computer science. Researchers develop various models and frameworks to learn and extract features, aiming to represent information in defined embedding spaces. Classic models usually embed the data in a low-dimensional space, while in recent years neural network models have been able to generate more meaningful and complex high-dimensional deep features. In the knowledge graph field, almost every approach represents entities and relations in a low-dimensional space, because real-world knowledge graphs contain a very large number of triples. Recently, a few approaches have applied neural networks to knowledge graph learning; however, these models are only able to capture local and shallow features. We observe the following three important issues in the development of feature learning with neural networks. On one hand, neural networks are not black boxes that work well in every case without specific design; there is still much work to do on how to design more powerful and robust neural networks for different types of data. On the other hand, more studies on utilizing these representations and features in applications are necessary. Moreover, traditional representations and features work better in some domains, while deep representations and features perform better in others. Transfer learning is introduced to bridge the gap between domains and adapt various types of features for many tasks.
In this dissertation, we aim to address the above issues. For the knowledge graph learning task, we present a number of important observations, both theoretical and practical, on current knowledge graph learning approaches, especially those based on Convolutional Neural Networks. Beyond the knowledge graph work, we not only develop feature and representation learning frameworks for various data types, but also develop an effective transfer learning algorithm to utilize the resulting features and representations, which are applied successfully in multiple fields. Firstly, we analyze current issues in knowledge graph learning models and present eight observations on existing knowledge graph embedding approaches, especially those based on Convolutional Neural Networks. Secondly, we propose a novel unsupervised heterogeneous domain adaptation framework that can deal with features of various types; multimedia features can be adapted, and the proposed algorithm bridges the representation gap between the source and target domains. Thirdly, we propose a novel framework to learn and embed user comments and online news data at the session level, predicting the article of interest for users with deep neural networks and attention models. Lastly, we design and analyze a large number of features to represent the dynamics of user comments and news articles. The features span a broad spectrum of facets including news article and comment contents, temporal dynamics, sentiment/linguistic features, and user behaviors. Our main insight is that the early dynamics of user comments contribute the most to an accurate prediction, while news-article-specific factors have surprisingly little influence. / Computer and Information Science
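To make the CNN-based knowledge graph embedding setting concrete, here is a toy pure-Python sketch of a ConvE-style scoring step. The dimensions, kernel values, and the cyclic "projection" are invented stand-ins for learned parameters, not the dissertation's models:

```python
import random

random.seed(0)
DIM = 8  # embedding dimension (assumed small for illustration)

def conv2d(grid, kernel):
    """Valid 2-D convolution of a grid with a kernel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(grid) - kh + 1):
        row = [sum(grid[i + di][j + dj] * kernel[di][dj]
                   for di in range(kh) for dj in range(kw))
               for j in range(len(grid[0]) - kw + 1)]
        out.append(row)
    return out

def conve_score(head, rel, tail, kernel):
    """ConvE-style score: stack head and relation embeddings into a 2-D
    'image', convolve, flatten, project, then take the inner product
    with the tail embedding."""
    grid = [head, rel]                      # 2 x DIM input "image"
    feat = conv2d(grid, kernel)             # 1 x (DIM - 1) feature map
    flat = [v for row in feat for v in row]
    # Cyclic repetition stands in for the learned linear projection layer.
    proj = [flat[i % len(flat)] for i in range(DIM)]
    return sum(p * t for p, t in zip(proj, tail))

head = [random.random() for _ in range(DIM)]
rel = [random.random() for _ in range(DIM)]
tail = [random.random() for _ in range(DIM)]
kernel = [[0.5, -0.5], [0.25, 0.25]]  # a single 2x2 filter (invented)
score = conve_score(head, rel, tail, kernel)
```

Because the filter slides only over the stacked head/relation grid, the features it extracts are local, which is one way to see the dissertation's point that such models capture local and shallow interactions.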

Automatic Question Answering and Knowledge Discovery from Electronic Health Records

Wang, Ping 25 August 2021 (has links)
Electronic Health Records (EHR) contain comprehensive longitudinal patient information, which is usually stored in databases either as multi-relational structured tables or as unstructured texts, e.g., clinical notes. EHR data provide a useful resource for assisting doctors' decision making; however, they also present many unique challenges that limit efficient use of the valuable information, such as large data volume, heterogeneous and dynamic information, medical term abbreviations, and noise caused by misspelled words. This dissertation focuses on the development and evaluation of advanced machine learning algorithms to address the following research questions: (1) how to seek answers from EHR for clinical-activity-related questions posed in human language without the assistance of database and natural language processing (NLP) domain experts; (2) how to discover underlying relationships among different events and entities in structured tabular EHR; and (3) how to predict when a medical event will occur and estimate its probability based on a patient's previous medical information. First, to automatically retrieve answers for natural language questions from the structured tables in EHR, we study the question-to-SQL generation task by generating the SQL query corresponding to the input question. We propose a translation-edit model driven by a language generation module and an editing module for the SQL query generation task. This model automatically translates clinical-activity-related questions into SQL queries, so that doctors only need to provide their questions in natural language to get the answers they need. We also create a large-scale dataset for question answering on tabular EHR to simulate a more realistic setting. Our performance evaluation shows that the proposed model is effective in handling the unique challenges posed by clinical terminology, such as abbreviations and misspelled words.
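The question-to-SQL task can be illustrated with a tiny rule-based stand-in. The dissertation's model is a neural translation-edit model, so this only shows the input/output contract; the table, column names, and abbreviation list are all invented:

```python
import re

# Toy schema and a rule-based stand-in for the learned translation-edit
# model; real questions and schemas are far more varied.
TEMPLATE = "SELECT {col} FROM patients WHERE name = '{name}'"

COLUMN_SYNONYMS = {
    "date of birth": "dob",
    "dob": "dob",              # clinical abbreviation handled explicitly
    "diagnosis": "diagnosis",
}

def question_to_sql(question):
    """Map 'What is the <attribute> of <patient>?' to a SQL query."""
    m = re.match(r"What is the (.+) of (.+)\?", question)
    if not m:
        raise ValueError("unsupported question form")
    attr, name = m.group(1).lower(), m.group(2)
    return TEMPLATE.format(col=COLUMN_SYNONYMS[attr], name=name)

sql = question_to_sql("What is the DOB of Alice Smith?")
```

The abbreviation lookup is the hand-coded analogue of what the neural model must learn: mapping clinical shorthand like "DOB" onto the right schema column.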
Second, to automatically identify answers for natural language questions from unstructured clinical notes in EHR, we propose to achieve this goal by querying a knowledge base constructed from fine-grained document-level expert annotations of clinical records for various NLP tasks. We first create a dataset for clinical knowledge base question answering with two parts: a clinical knowledge base and question-answer pairs. An attention-based aspect-level reasoning model is developed and evaluated on the new dataset. Our experimental analysis shows that it is effective in identifying answers and also allows us to analyze the impact of different answer aspects in predicting correct answers. Third, we focus on discovering underlying relationships among different entities (e.g., patient, disease, medication, and treatment) in tabular EHR, which can be formulated as a link prediction problem in the graph domain. We develop a self-supervised learning framework for better representation learning of entities across a large corpus that also considers local contextual information for the downstream link prediction task. We demonstrate the effectiveness, interpretability, and scalability of the proposed model on the healthcare network built from tabular EHR. It is also successfully applied to link prediction problems in a variety of domains, such as e-commerce, social networks, and academic networks. Finally, to dynamically predict the occurrence of multiple correlated medical events, we formulate the problem as a temporal (multiple time-points) multi-task learning problem using a tensor representation. We propose an algorithm to jointly and dynamically predict several survival problems at each time point and optimize it with the Alternating Direction Method of Multipliers (ADMM). The model allows us to consider both the dependencies between different tasks and the correlations of each task at different time points.
We evaluate the proposed model on two real-world applications and demonstrate its effectiveness and interpretability. / Doctor of Philosophy / Healthcare is an important part of our lives. Due to recent advances in data collection and storage techniques, a large amount of medical information is generated and stored in Electronic Health Records (EHR). By comprehensively documenting the longitudinal medical history of a large patient cohort, EHR data form a fundamental resource for assisting doctors' decision making, including optimization of treatments for patients and selection of patients for clinical trials. However, EHR data also present a number of unique challenges, such as (i) large-scale and dynamic data, (ii) heterogeneity of medical information, and (iii) medical term abbreviations. It is difficult for doctors to effectively utilize such complex data collected in typical clinical practice. Therefore, it is imperative to develop advanced methods that support efficient use of EHR and further benefit doctors in their clinical decision making. This dissertation focuses on automatically retrieving useful medical information, analyzing complex relationships among medical entities, and detecting future medical outcomes from EHR data. To retrieve information from EHR efficiently, we develop deep learning based algorithms that can automatically answer various clinical questions on structured and unstructured EHR data. These algorithms help us understand more about the challenges of retrieving information from different data types in EHR. We also build a clinical knowledge graph from EHR to link the distributed medical information, and perform the link prediction task on it, which allows us to analyze the complex underlying relationships among various medical entities.
In addition, we propose a temporal multi-task survival analysis method to dynamically predict multiple medical events at the same time and identify the most important factors leading to future medical events. By handling these unique challenges in EHR and developing suitable approaches, we hope to improve the efficiency of information retrieval and predictive modeling in healthcare.
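The link-prediction formulation over EHR entities can be sketched as scoring candidate entity pairs by embedding similarity. This is a stand-in for the self-supervised representations described above; the entities and their embeddings here are random toys:

```python
import math
import random

random.seed(1)
DIM = 4

# Hypothetical healthcare entities; in the dissertation the embeddings come
# from a self-supervised encoder, here they are random stand-ins.
entities = ["patient_1", "aspirin", "hypertension", "metformin"]
emb = {e: [random.gauss(0, 1) for _ in range(DIM)] for e in entities}

def link_score(a, b):
    """Cosine similarity between entity embeddings as a link score."""
    va, vb = emb[a], emb[b]
    dot = sum(x * y for x, y in zip(va, vb))
    na = math.sqrt(sum(x * x for x in va))
    nb = math.sqrt(sum(x * x for x in vb))
    return dot / (na * nb)

# Rank candidate links for patient_1, highest score first; in the healthcare
# network this would rank, e.g., plausible medications for a patient.
ranked = sorted(entities[1:], key=lambda e: link_score("patient_1", e),
                reverse=True)
```

The same scoring-and-ranking loop carries over unchanged to the e-commerce and social-network graphs mentioned in the abstract; only the entity vocabulary and the learned embeddings differ.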

Continuously Extensible Information Systems: Extending the 5S Framework by Integrating UX and Workflows

Chandrasekar, Prashant 11 June 2021 (has links)
In Virginia Tech's Digital Library Research Laboratory, we support subject-matter experts (SMEs) in their pursuit of research goals. Their goals include everything from data collection to analysis to reporting, and their research commonly involves analysis of an extensive collection of data such as tweets or web pages. Without support -- such as by our lab, developers, or data analysts/scientists -- they would undertake the data analysis themselves, using available analytical tools, frameworks, and languages. To extract and produce the information needed to achieve their goals, the researchers/users would then need to know what sequences of functions or algorithms to run using such tools, after considering all of their extensive functionality. Our research addresses these problems directly by designing a system that lowers these information barriers. Our approach is broken down into three parts. In the first two parts, we introduce a system that supports discovery of both information and supporting services. In the first part, we describe a methodology that incorporates User eXperience (UX) research into the process of workflow design. Through the methodology, we capture (a) the different user roles and goals, (b) how user goals break down into tasks and sub-tasks, and (c) the functions and services required to solve each (sub-)task. In the second part, we identify and describe key components of the infrastructure implementation. This implementation captures the various goal/task/service associations in a manner that supports information inquiry of two types: (1) given an information goal as a query, what is the workflow to derive this information? and (2) given a data resource, what information can we derive using this data resource as input? We demonstrate both parts of the approach, describing how we teach and apply the methodology, with three case studies.
In the third part of this research, we rely on formalisms used in describing digital libraries to explain the components that make up the information system. The formal description serves as a guide to support the development of information systems that generate workflows to support SME information needs. We also describe a specific information system meant to support information goals that relate to Twitter data. / Doctor of Philosophy / In Virginia Tech's Digital Library Research Laboratory, we support subject-matter experts (SMEs) in their pursuit of research goals. This includes everything from data collection to analysis to reporting. Their research commonly involves analysis of an extensive collection of data such as tweets or web pages. Without support -- such as by our lab, developers, or data analysts/scientists -- they would undertake the data analysis themselves, using available analytical tools, frameworks, and languages. To extract and produce the information needed to achieve their goals, the researchers/users would then need to know what sequences of functions or algorithms to run using such tools, after considering all of their extensive functionality. Further, as more algorithms are discovered and datasets grow larger, the information processing effort becomes more and more complicated. Our research aims to address these problems directly by lowering these barriers through a methodology that integrates the full life cycle, including the activities carried out by User eXperience (UX), analysis, development, and implementation experts. We devise a three-part approach to this research. The first two parts concern building a system that supports discovery of both information and supporting services. First, we describe a methodology that introduces UX research into the process of workflow design. Second, we identify and describe key components of the infrastructure implementation.
We demonstrate both parts of the approach, describing how we teach and apply the methodology, with three case studies. In the third part of this research, we extend formalisms used in describing digital libraries to encompass the components that make up our new type of extensible information system.
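The two query types supported by the goal/task/service associations can be sketched with a small registry; all goal, resource, and service names here are hypothetical:

```python
# Minimal sketch of goal/task/service associations: each workflow maps an
# information goal to an ordered list of services and declares the data
# resource it consumes. Names are invented for illustration.
WORKFLOWS = {
    "top_hashtags": {
        "input": "tweets",
        "steps": ["clean_text", "extract_hashtags", "count"],
    },
    "domain_frequencies": {
        "input": "web_pages",
        "steps": ["extract_urls", "group_by_domain"],
    },
}

def workflow_for_goal(goal):
    """Query type 1: given an information goal, return its workflow."""
    return WORKFLOWS[goal]["steps"]

def goals_for_resource(resource):
    """Query type 2: given a data resource, list derivable information goals."""
    return [g for g, w in WORKFLOWS.items() if w["input"] == resource]
```

An SME asking "what can I learn from my tweet collection?" corresponds to `goals_for_resource("tweets")`; asking "how do I get the top hashtags?" corresponds to `workflow_for_goal("top_hashtags")`.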

Evolving graphs and similarity-based graphs with applications

Zhang, Weijian January 2018 (has links)
A graph is a mathematical structure for modelling the pairwise relations between objects. This thesis studies two types of graphs, namely similarity-based graphs and evolving graphs. We look at ways to traverse an evolving graph; in particular, we examine the influence of temporal information on node centrality. In the process, we develop EvolvingGraphs.jl, a software package for analyzing time-dependent networks. We also develop Etymo, a search system for discovering interesting research papers. Etymo utilizes both similarity-based graphs and evolving graphs to build a knowledge graph of research articles in order to help users track the development of ideas. We construct content similarity-based graphs using the full text of research papers, and we extract key concepts from research papers and exploit their temporal information to construct an evolving graph of concepts.
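Traversing an evolving graph differs from traversing a static one in that an edge can only be followed at or after the time the walk reaches its source node. A small sketch of such a time-respecting reachability computation (the toy graph is invented, not taken from EvolvingGraphs.jl):

```python
# Time-respecting traversal of an evolving graph: an edge may only be
# followed at or after the time the walk reaches its source node.
edges = [  # (source, target, time)
    ("a", "b", 1),
    ("b", "c", 2),
    ("a", "c", 1),
    ("b", "a", 0),
]

def temporal_reachable(start, start_time):
    """Nodes reachable from `start` via edges with non-decreasing times."""
    reached = {start: start_time}  # node -> earliest arrival time
    changed = True
    while changed:
        changed = False
        for u, v, t in edges:
            if u in reached and reached[u] <= t < reached.get(v, float("inf")):
                reached[v] = t
                changed = True
    return set(reached)

# From "b" at time 1, the edge ("b", "a", 0) is too old to follow, so "a"
# is unreachable even though the static graph connects them.
```

This asymmetry between static and temporal reachability is exactly why temporal information changes node centrality: a node that looks central in the flattened graph may be unreachable along any time-ordered path.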

A Natural Language Interface for Querying Linked Data

Akrin, Christoffer, Tham, Simon January 2020 (has links)
The thesis introduces a proof of concept that could be of interest to many industries: a remote Natural Language Interface (NLI) for querying Knowledge Bases (KBs). The system applies natural language technology tools provided by Stanford CoreNLP and queries KBs using the query language SPARQL. Natural Language Processing (NLP) is used to analyze the semantics of a question written in natural language and to generate relational information about the question. With correctly defined relations, the question can be run against KBs containing relevant Linked Data. The Linked Data follows the Resource Description Framework (RDF) model by expressing relations in the form of semantic triples: subject-predicate-object. With our NLI, any KB can be understood semantically: given correct training data, the system can learn the semantics of the RDF data stored in the KB. This understanding allows relational information to be extracted from questions about the KB, with which the questions can be translated into SPARQL and executed against the KB.
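The end of this pipeline, running the generated query against Linked Data, can be sketched with a toy triple store and a SPARQL-like pattern match. The data and predicate names are invented; a real deployment would emit actual SPARQL against an RDF store:

```python
# A minimal in-memory triple store following the subject-predicate-object
# model, plus a pattern matcher that behaves like a one-pattern SPARQL query.
TRIPLES = [
    ("Stockholm", "capitalOf", "Sweden"),
    ("Oslo", "capitalOf", "Norway"),
    ("Sweden", "partOf", "Scandinavia"),
]

def query(pattern):
    """Match an (s, p, o) pattern; None acts like a SPARQL variable."""
    return [t for t in TRIPLES
            if all(q is None or q == v for q, v in zip(pattern, t))]

# "What is the capital of Sweden?" would be analyzed into the relation
# (?x, capitalOf, Sweden), i.e. SELECT ?x WHERE { ?x :capitalOf :Sweden }
answers = [s for s, _, _ in query((None, "capitalOf", "Sweden"))]
```

The NLP step described in the abstract is what produces the `(None, "capitalOf", "Sweden")` pattern from the raw question; once that relational form exists, the query itself is mechanical.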

Capturing Knowledge of Emerging Entities from the Extended Search Snippets

Ngwobia, Sunday C. January 2019 (has links)
No description available.

Mobility Knowledge Graph and its Application in Public Transport

Zhang, Qi January 2023 (has links)
Efficient public transport planning, operations, and control rely on a deep understanding of human mobility in urban areas. The availability of extensive and diverse mobility data sources, such as smart card data and GPS data, provides opportunities to quantitatively study individual behavior and collective mobility patterns. However, analyzing and organizing these vast amounts of data is a challenging task. The Knowledge Graph (KG) is a graph-based method for knowledge representation and organization that has been successfully applied in various applications, yet applications of KG to urban mobility are still limited. To further utilize the mobility data and explore human mobility patterns, the included papers construct the Mobility Knowledge Graph (MKG), a general learning framework, and demonstrate its potential applications in public transport. Paper I introduces the concept of the MKG and proposes a learning framework to construct the MKG from smart card data in public transport networks. The framework captures the spatiotemporal travel pattern correlations between stations using both rule-based linear decomposition and neural network-based nonlinear decomposition methods. The paper validates the MKG construction framework and explores the value of the MKG in predicting individual trip destinations using only tap-in records. Paper II proposes an application of user-station attention estimation to understand human mobility in urban areas, which facilitates downstream applications such as individual mobility prediction and location recommendation. To estimate the 'real' user-station attention from station visit count data, the paper proposes a matrix decomposition method that captures both user similarity and station-station relations using the MKG. A neural network-based nonlinear decomposition approach is used to extract MKG relations capturing the latent spatiotemporal travel dependencies.
The proposed framework is validated using synthetic and real-world data, demonstrating its significant value in contributing to user-station attention inference.
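The matrix-decomposition idea in Paper II can be sketched with a plain gradient-descent factorization of a user-by-station visit-count matrix. This is a linear stand-in for the rule-based and neural decompositions in the papers, with an invented toy matrix:

```python
import random

random.seed(2)

# Toy user x station visit-count matrix (invented smart-card aggregates).
V = [[5, 0, 1],
     [4, 1, 0],
     [0, 3, 4]]
USERS, STATIONS, RANK = len(V), len(V[0]), 2

# Random init of user factors U and station factors S, so that V ~ U @ S^T.
U = [[random.random() for _ in range(RANK)] for _ in range(USERS)]
S = [[random.random() for _ in range(RANK)] for _ in range(STATIONS)]

def predict(i, j):
    """Predicted attention of user i for station j."""
    return sum(U[i][k] * S[j][k] for k in range(RANK))

LR = 0.01  # learning rate, chosen for the toy sizes above
for _ in range(2000):
    for i in range(USERS):
        for j in range(STATIONS):
            err = V[i][j] - predict(i, j)
            for k in range(RANK):
                u, s = U[i][k], S[j][k]
                U[i][k] += LR * err * s
                S[j][k] += LR * err * u

# Mean squared reconstruction error over the observed counts.
mse = sum((V[i][j] - predict(i, j)) ** 2
          for i in range(USERS) for j in range(STATIONS)) / (USERS * STATIONS)
```

The low-rank factors play the role of latent user and station representations; the papers replace this plain linear decomposition with rule-based and neural variants and regularize it with MKG relations such as user similarity and station-station dependencies.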

Semantic Web Foundations for Representing, Reasoning, and Traversing Contextualized Knowledge Graphs

Nguyen, Vinh Thi Kim January 2017 (has links)
No description available.

From Open Access to Open Knowledge: How We Can Organize the Information Flows of Science in the Digital World

Auer, Sören 14 November 2019 (has links)
Despite improved digital access to scientific publications in recent years, the basic principles of scholarly communication remain unchanged and are still largely document-based. The document-oriented workflows of science have reached the limits of their adequacy, as shown by recent discussions about the unchecked growth of the scientific literature, the shortcomings of peer review, and the reproducibility crisis. Open Access is an important prerequisite for meeting these challenges, but it is only the first step. We must organize scholarly communication in a more knowledge-based way by expressing scientific contributions and related artifacts through semantically rich, interlinked knowledge graphs and connecting them with one another. In this talk, we present first steps in this direction with the Open Research Knowledge Graph initiative.
