Querying and extracting heterogeneous graphs from structured data and unstrutured content

The present work introduces a set of solutions to extract graphs from enterprise data and facilitate the process of information search on these graphs. First of all we have defined a new graph model called the SPIDER-Graph, which models complex objects and permits to define heterogeneous graphs. Furthermore, we have developed a set of algorithms to extract the content of a database from an enterprise and to represent it in this new model. This latter representation allows us to discover relations that exist in the data but are hidden due to their poor compatibility with the classical relational model. Moreover, in order to unify the representation of all the data of the enterprise, we have developed a second approach which extracts from unstructured data an enterprise's ontology containing the most important concepts and relations that can be found in a given enterprise. Having extracted the graphs from the relational databases and documents using the enterprise ontology, we propose an approach which allows the users to extract an interaction graph between a set of chosen enterprise objects. This approach is based on a set of relations patterns extracted from the graph and the enterprise ontology concepts and relations. Finally, information retrieval is facilitated using a new visual graph query language called GraphVQL, which allows users to query graphs by drawing a pattern visually for the query. This language covers different query types from the simple selection and aggregation queries to social network analysis queries.

Identiferoai:union.ndltd.org:CCSD/oai:tel.archives-ouvertes.fr:tel-00740663
Date22 June 2012
CreatorsSoussi, Rania
PublisherEcole Centrale Paris
Source SetsCCSD theses-EN-ligne, France
LanguageEnglish
Detected LanguageEnglish
TypePhD thesis

Page generated in 0.0023 seconds