Global ETD Search

1	Imperfect RDF Databases : From Modelling to Querying / Bases de données RDF imparfaites : de la modélisation à l'interrogation Abidi, Amna 11 June 2019
The ever-increasing interest in RDF data on the Web has led to several important research efforts to enrich the traditional RDF data formalism for exploitation and analysis purposes. This thesis continues those efforts by addressing RDF data management in the presence of imperfection (lack of trust or validity, uncertainty, etc.). The main contributions of the dissertation are as follows. (1) We address the trusted RDF data model and propose to extend skyline queries to RDF data weighted by trust measures (Trust-RDF), which consists in extracting the most interesting trusted resources according to user-defined criteria. (2) We study, via statistical methods, the impact of the trust measure on the Trust-skyline set. (3) We integrate into the structure of RDF data (i.e., the subject-property-object triple) a fourth element expressing a possibility measure that reflects the user's opinion about the truth of a statement. To handle this measure, an appropriate language framework is introduced, namely Pi-SPARQL, which extends SPARQL into a possibility-aware query language able to process possibility distributions. (4) Finally, we study a new variant of the skyline operator that extracts the possibilistic RDF resources that are possibly dominated by no other resource in the sense of Pareto optimality.
Keywords: Trust-degree databases; Preference queries; Trust
2	An Advanced Skyline Approach for Imperfect Data Exploitation and Analysis / Modèle Skyline pour l'analyse et l'exploitation des données incertaines Elmi, Saïda 15 September 2017
The main purpose of this thesis is to study a preference query model, the skyline operator, in the context of imperfect data modeled by evidence theory. Such data can be managed in imperfect databases called evidential databases, and the skyline operator is a powerful tool for extracting the most interesting objects from a database. We first address the fundamental question of how to extend the dominance relationship to evidential data, defining a new skyline semantics suited to such data, and we provide optimization techniques that improve the efficiency of the evidential skyline. We then introduce the notion of marginal points to optimize the distributed computation of the evidential skyline over multiple servers. In addition, we propose efficient methods to maintain the skyline results in the evidential database context when a set of objects is inserted or deleted. The idea is to compute the new skyline incrementally, without rerunning the initial computation from scratch. In a second step, we introduce the top-k skyline query over imperfect data, based on a score function that measures the degree of dominance of each skyline object, and we develop efficient algorithms for its computation. Furthermore, since the evidential skyline is often too large to be analyzed, we define the set SKY² to refine the evidential skyline and retrieve the best evidential skyline objects (the skyline stars). We also develop suitable algorithms based on scalable techniques to compute the evidential SKY² efficiently. Extensive experiments were conducted to show the efficiency and effectiveness of our approaches.
Keywords: Imperfect data; Preference queries; Skyline operator; Evidential databases; Skyline maintenance; Top-k Skyline
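To make the vocabulary of this record concrete, the following is a minimal sketch of the classical skyline under Pareto dominance, with a top-k ranking by dominance count. It does not reproduce the evidential semantics of the thesis: the data are ordinary certain points, smaller values are assumed to be better on every attribute, and the names `dominates`, `skyline`, `dominance_score` and `top_k_skyline` are hypothetical. Counting dominated points is only one plausible reading of a "degree of dominance" score, not necessarily the function defined in the thesis.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance (assumption: smaller is better on every attribute).

    a dominates b if a is no worse everywhere and strictly better somewhere.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def dominance_score(p: Point, points: List[Point]) -> int:
    """Illustrative score: how many points of the dataset p dominates."""
    return sum(dominates(p, q) for q in points)


def top_k_skyline(points: List[Point], k: int) -> List[Point]:
    """Rank skyline points by dominance score and keep the k best."""
    sky = skyline(points)
    ranked = sorted(sky, key=lambda p: dominance_score(p, points), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical hotels described by (price, distance to the beach).
    hotels = [(50.0, 3.0), (80.0, 1.0), (60.0, 2.0), (90.0, 2.5), (70.0, 3.5)]
    print(skyline(hotels))
    print(top_k_skyline(hotels, 1))
```

The naive scan compares every pair of points, which is enough to illustrate the definitions; the incremental maintenance and distributed techniques mentioned in the abstract exist precisely to avoid this full recomputation.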
3	Representative Subsets for Preference Queries Chester, Sean 26 August 2013
We focus on the two overlapping areas of preference queries and dataset summarization. A (linear) preference query specifies the relative importance of the attributes in a dataset and asks for the tuples that best match those preferences. Dataset summarization is the task of representing an entire dataset by a small, representative subset. Within these areas, we focus on three important sub-problems, significantly advancing the state of the art in each. We begin with an investigation into a new formulation of preference queries, identifying a neglected and important subclass that we call threshold projection queries. While the literature typically constrains the attribute preferences (which are real-valued weights) such that their sum is one, we show that this introduces bias when querying by threshold rather than cardinality. Using projection, rather than inner product as in that literature, removes the bias. We then give algorithms for building and querying indices for this class of query, based, in the general case, on geometric duality and halfspace range searching, and, in an important special case, on stereographic projection. In the second part of the dissertation, we investigate the monochromatic reverse top-k (mRTOP) query in two dimensions. An mRTOP query asks for, given a tuple and a dataset, the linear preference queries on the dataset that will include the given tuple. Towards this goal, we consider the novel scenario of building an index to support mRTOP queries, using geometric duality and plane sweep. We show theoretically and empirically that the index is quick to build, small on disk, and very efficient at answering mRTOP queries. As a corollary to these efforts, we define the top-k rank contour, which encodes the k-ranked tuple for every possible linear preference query. This is tremendously useful in answering mRTOP queries, but also, we posit, of significant independent interest for its relation to myriad related linear preference query problems. Intuitively, the top-k rank contour is the minimum possible representation of knowledge needed to identify the k-ranked tuple for any query, without a priori knowledge of that query. We also introduce k-regret minimizing sets, a very succinct approximation of a numeric dataset. The purpose of the approximation is to represent the entire dataset by just a small subset that nonetheless will contain a tuple within or near to the top-k for any linear preference query. We show that the problem of finding k-regret minimizing sets (and, indeed, the problem in the literature that it generalizes) is NP-hard. Still, for the special case of two dimensions, we provide a fast, exact algorithm based on the top-k rank contour. For arbitrary dimension, we introduce a novel greedy algorithm based on linear programming and randomization that performs very well in our empirical investigation.
Graduate / 0984
Keywords: databases; computational geometry; top-k queries; preference queries; k-regret minimizing sets; depth contours; indexing; reverse; data management; stereographic projection; plane sweep; linear programming; computational complexity; algorithms; NP-hardness; randomization; summarization; duality
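As an illustration of the terms used in this record, the sketch below evaluates a linear preference query by inner product and measures how far a candidate subset falls short of the k-th best tuple for one weight vector. It is a sketch under stated assumptions, not the dissertation's algorithms: larger attribute values are assumed to be better, the function names `score`, `top_k` and `k_regret_ratio` are hypothetical, and the ratio used here is one common way to quantify "within or near to the top-k" that the abstract itself does not spell out.

```python
from typing import List, Sequence, Tuple

Tuple_ = Tuple[float, ...]


def score(weights: Sequence[float], tup: Sequence[float]) -> float:
    """Inner-product score of a tuple under a linear preference (weight vector)."""
    return sum(w * x for w, x in zip(weights, tup))


def top_k(data: List[Tuple_], weights: Sequence[float], k: int) -> List[Tuple_]:
    """Answer a linear preference query: the k highest-scoring tuples."""
    return sorted(data, key=lambda t: score(weights, t), reverse=True)[:k]


def k_regret_ratio(subset: List[Tuple_], data: List[Tuple_],
                   weights: Sequence[float], k: int) -> float:
    """Assumed definition: relative gap between the k-th best score in the
    full dataset and the best score available in the subset (0 if no gap)."""
    kth_best = score(weights, top_k(data, weights, k)[-1])
    best_in_subset = max(score(weights, t) for t in subset)
    if kth_best <= 0:
        return 0.0
    return max(0.0, (kth_best - best_in_subset) / kth_best)


if __name__ == "__main__":
    # Hypothetical two-attribute dataset and a two-tuple summary of it.
    data = [(9, 1), (1, 9), (6, 6), (5, 5)]
    subset = [(9, 1), (1, 9)]
    print(top_k(data, (3, 1), 2))
    print(k_regret_ratio(subset, data, (1, 1), 1))
    print(k_regret_ratio(subset, data, (1, 1), 2))
```

A k-regret minimizing set would be a subset of a given size whose worst-case ratio over all weight vectors is as small as possible; this sketch only checks a single weight vector, since the abstract reports that the general problem is NP-hard.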
4	Consulta espacial preferencial por palavra-chave [Spatial Keyword Preference Query] Almeida, João Paulo Dias de 17 December 2015
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
With the popularity of devices able to annotate data with spatial information (latitude and longitude), the processing of spatial queries has recently received a lot of attention from the research community. In this dissertation, we study a new query type, the Top-k Spatial Keyword Preference Query, which selects objects of interest based on the textual relevance of other spatio-textual objects in their spatial neighborhood. This work introduces the new query type, presents three algorithms for processing it efficiently, and evaluates the performance of the proposed algorithms experimentally on real databases.
Keywords: Query processing; Spatial databases; Hybrid indexes; Preference queries; Information systems; Information retrieval
