31 |
Advanced Techniques for Web Service Query Optimization (Techniques avancées pour l'optimisation de requêtes de services Web)
Benouaret, Karim (09 October 2012)
Nowadays, we are witnessing the migration of the Web of data toward a service-oriented Web. Improving the capabilities and functionality of current Web search engines through efficient service search and selection techniques is becoming increasingly important. In this thesis, we first propose a framework for composing Web services that takes user preferences into account. Preferences are represented with a model based on fuzzy set theory. The proposed approach relies on an extended version of the Pareto optimality principle. The notion of top-k compositions is thus introduced to answer complex user queries. To improve the quality of the returned set of compositions, a second filter based on a diversity criterion is applied to that set. Second, we consider the problem of selecting Web services in the presence of preferences coming from several users. A new variant of the traditional service skyline, called the majority service skyline, is defined. It allows users to make a "democratic" decision leading to the most appropriate services. Another type of service skyline is also discussed in this thesis: a graded service skyline based on a fuzzy dominance relation. As a result, Web services offering a better trade-off among QoS parameters are retained, whereas Web services with a poor trade-off among QoS parameters are excluded. Finally, we also address the case where the QoS values describing the Web services are tainted by uncertainty. Possibility theory is used as the model of uncertainty. A possibilistic service skyline is thus proposed to let the user select the desired Web services in the presence of uncertain QoS.
Extensive experiments were conducted to evaluate and validate all the approaches proposed in this thesis.
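The Pareto-dominance idea behind a service skyline can be illustrated with a minimal sketch. It assumes crisp (non-fuzzy) QoS vectors in which lower values are better on every attribute; the service names and QoS numbers are hypothetical, and the fuzzy, majority, and possibilistic variants described in the abstract are not modelled.

```python
from typing import Dict, List, Tuple


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """Return True if a Pareto-dominates b (lower is better on every attribute)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(services: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return the names of services that no other service dominates."""
    return sorted(
        name
        for name, qos in services.items()
        if not any(
            dominates(other_qos, qos)
            for other_name, other_qos in services.items()
            if other_name != name
        )
    )


# Hypothetical QoS vectors: (response time in ms, cost per call).
services = {
    "s1": (120.0, 0.30),
    "s2": (200.0, 0.10),
    "s3": (250.0, 0.35),  # dominated by s1
    "s4": (120.0, 0.40),  # dominated by s1
}
print(skyline(services))  # ['s1', 's2']
```

The quadratic pairwise comparison is the simplest way to state the definition; it is not meant to reflect how the thesis computes its skylines.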
|
32 |
Efficient and Reliable In-Network Query Processing in Wireless Sensor Networks
Malhotra, Baljeet Singh (date unknown)
No description available.
|
33 |
Ranked Retrieval in Uncertain and Probabilistic Databases
Soliman, Mohamed (January 2011)
Ranking queries are widely used in data exploration, data analysis, and decision-making scenarios. While most currently proposed ranking techniques focus on deterministic data, several emerging applications involve data that are imprecise or uncertain. Ranking uncertain data raises new challenges in query semantics and processing, making conventional methods inapplicable. Furthermore, the interplay between ranking and uncertainty models introduces new dimensions for ordering query results that do not exist in traditional settings.
This dissertation introduces new formulations and processing techniques for ranking queries on uncertain data. The formulations are based on a marriage of traditional ranking semantics with possible-worlds semantics under widely adopted uncertainty models. In particular, we focus on studying the impact of tuple-level and attribute-level uncertainty on the semantics and processing techniques of ranking queries.
Under the tuple-level uncertainty model, we introduce a processing framework that leverages the capabilities of relational database systems to recognize and handle data uncertainty in score-based ranking. The framework encapsulates a state space model and efficient search algorithms that compute query answers by lazily materializing the necessary parts of the space. Under the attribute-level uncertainty model, we give a new probabilistic ranking model, based on partial orders, to encapsulate the space of possible rankings originating from uncertainty in attribute values. We present a set of efficient query evaluation algorithms, including sampling-based techniques built on the theory of Markov chains and the Monte Carlo method, to compute query answers.
We build on our techniques for ranking under attribute-level uncertainty to support rank join queries on uncertain data. We show how to extend current rank join methods to handle uncertainty in scoring attributes. We provide a pipelined query operator implementation of an uncertainty-aware rank join algorithm, integrated with sampling techniques, to compute query answers.
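To make the possible-worlds semantics under tuple-level uncertainty concrete, the sketch below enumerates every world and accumulates the probability that each tuple appears in the top-k. It assumes independent tuples with a fixed score and an existence probability; the tuple identifiers and numbers are hypothetical. The enumeration is exponential in the number of tuples, so it serves only to illustrate the semantics and is not the lazy state-space search described in the abstract.

```python
from itertools import product
from typing import Dict, List, Tuple

# Each uncertain tuple: (identifier, score, probability that the tuple exists).
UncertainTuple = Tuple[str, float, float]


def topk_probabilities(tuples: List[UncertainTuple], k: int) -> Dict[str, float]:
    """Probability that each tuple is among the k highest-scored tuples of a world."""
    result = {tid: 0.0 for tid, _, _ in tuples}
    # A possible world fixes, for every tuple, whether it is present or absent.
    for world in product([False, True], repeat=len(tuples)):
        world_prob = 1.0
        for present, (_, _, prob) in zip(world, tuples):
            world_prob *= prob if present else 1.0 - prob
        present_tuples = [t for present, t in zip(world, tuples) if present]
        present_tuples.sort(key=lambda t: t[1], reverse=True)
        for tid, _, _ in present_tuples[:k]:
            result[tid] += world_prob
    return result


tuples = [("t1", 9.0, 0.5), ("t2", 8.0, 1.0), ("t3", 7.0, 0.4)]
probs = topk_probabilities(tuples, k=1)
print({tid: round(p, 3) for tid, p in probs.items()})
# {'t1': 0.5, 't2': 0.5, 't3': 0.0}
```

Here t2 always exists, so t3 can never rank first, and t2 ranks first exactly when t1 is absent.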
|
34 |
Αποδοτική ιεραρχημένη ανάκτηση κοινωνικού περιεχομένου με χρήση ταξονομιών ετικετών (Efficient ranked retrieval of social content using tag taxonomies) / TREATS: optimal ranked retrieval with tag taxonomies in social media environments
Κοντοτάσιου, Ιωάννα (15 May 2012)
A widespread technique for achieving efficient content search is to categorize the content into tag taxonomies, that is, tree-structured "IS-A" hierarchies of keywords supplied by users. Each node of the tree corresponds to one tag of the taxonomy.
This thesis makes use of such tag taxonomies, in which users annotate each object with one or more tags. The environment we define is highly dynamic, in the sense that users continuously add, remove, and modify tags and that objects can be added and removed continuously. In this environment we target efficient ranked retrieval of content.
The primary goal is to create similarity metrics between the queries submitted by users and the stored, categorized content. These metrics will be based on the semantic distance between the taxonomy nodes and the terms of the submitted queries (terms which must themselves be nodes of the taxonomy).
Based on these metrics, algorithms will be designed and implemented for retrieving the k most relevant objects, as extensions of Fagin's basic threshold algorithms (Fagin's Threshold Algorithms, TA). The proposed approach relaxes the requirement that inverted indices exist in advance. Instead, the inverted indices required by Fagin's algorithms are built dynamically while queries are answered. / The spark for this work stems from the recent explosion in social media production, the proven interest of users in tagging this media, and the proven capability of semantically rich taxonomies to appropriately classify content.
The rich annotations/tags provided for social media offer a great basis for taxonomies. Noting that web search increasingly involves taxonomies, and that a rich set of taxonomies already exists for many different fields, which can help classify tags, we target the problems associated with efficient taxonomy-based ranked retrieval in social web environments.
In a social-tag taxonomy environment, each tag (taxonomy node) is associated with all documents tagged with it. Queries are formulated using tags. The environment is highly dynamic, as documents and tag-document associations are constantly being added and deleted. This dynamism can render the traditional approaches to ranked retrieval, which are based on text indices, highly inefficient because of the high costs of creating, maintaining, and using the indices.
We first adapt similarity measures between tag queries and documents, which are based on well-established principles of taxonomy-based search. We then develop algorithms for top-k queries exploiting taxonomic knowledge. We contribute a suite of top-k algorithms, coined TREATS (ThREshold Algorithms on TaxonomieS).
Our first algorithm shows how to build per-tag inverted indices (required by the well-established Threshold Algorithms (TA) for top-k query processing). In this way, we port optimal ranked retrieval algorithms into the taxonomy realm.
Our second algorithm, TREATS-sorted, shares the same principles as TA-sorted but does not need to maintain any inverted text indices. This yields significant savings: first, in the storage required for the indices; second, in the overhead of building and maintaining them; and third, in the overhead incurred during query execution for accessing them.
Our third algorithm, TREATS-Labelled, further exploits the taxonomic structure in order to introduce large additional performance benefits.
We also prove the correctness and (instance-)optimality of TREATS.
Finally, we have implemented all algorithms and evaluated their efficiency against the baseline TA-random and TA-sorted algorithms, using real data sets with different characteristics.
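For readers unfamiliar with the baseline, the following is a minimal sketch of the classic Threshold Algorithm (TA) over pre-built per-tag lists sorted by descending score, with summation as the aggregate function. It is the textbook baseline only: it assumes the inverted lists already exist, which is precisely what TREATS-sorted avoids, and it does not model taxonomy-based similarity. The tag names, document identifiers, and scores are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

ScoredList = List[Tuple[str, float]]  # (document id, score), sorted by score descending


def threshold_algorithm(lists: List[ScoredList], k: int) -> List[Tuple[str, float]]:
    """Return the k documents with the highest summed score across all lists."""
    # Random access: look up a document's score in any list by its id.
    lookup: List[Dict[str, float]] = [dict(lst) for lst in lists]
    seen: Dict[str, float] = {}
    depth = max((len(lst) for lst in lists), default=0)
    for pos in range(depth):
        last_seen: List[float] = []
        # Sorted access: read one entry from each list at the current depth.
        for lst in lists:
            if pos < len(lst):
                doc, score = lst[pos]
                last_seen.append(score)
                if doc not in seen:
                    seen[doc] = sum(index.get(doc, 0.0) for index in lookup)
            else:
                last_seen.append(0.0)
        # No unseen document can score higher than this threshold.
        threshold = sum(last_seen)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])


rock = [("d1", 1.0), ("d2", 0.75), ("d3", 0.25)]
live = [("d2", 0.75), ("d3", 0.5), ("d1", 0.25)]
print(threshold_algorithm([rock, live], k=1))  # [('d2', 1.5)]
```

In this run the algorithm stops after the second round of sorted accesses: the threshold has dropped to 1.25, below the best aggregate score of 1.5, so the last entries of both lists are never read.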
|
35 |
Improving movie recommendations through social media matching
Kuroptev, Roman; Lagerlöf, Anton (January 2019)
Recommender systems are a crucial part of navigating the vast number of products available on the internet. Social media, in the form of Twitter microblogs, has previously been used to produce movie recommendations, yet mainly to handle cold-start, a common problem in collaborative filtering environments. This work instead addresses how top-k recommendations in a collaborative filtering environment are affected when the recommender system is augmented with social media data. To answer this question, a novel prototype is developed following a design science process model. The system re-ranks top-k recommendations based on a social matching process in which users' Tweets are matched with movie keywords through latent semantic indexing (LSI) similarity. The prototype is evaluated through experiments addressing functionality, accuracy, consistency, and performance.
The results show that the NDCG and MAP metrics of the top-k recommendations improve with social matching compared to using collaborative filtering alone.
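As a reminder of how one of the reported metrics is computed, here is a minimal NDCG@k sketch. It assumes the ideal ordering is obtained by sorting the relevance labels of the same recommendation list; the relevance labels are made up for illustration and do not reproduce the thesis's data or results.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain of the first k items (rank 1 has discount log2(2) = 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg(relevances: Sequence[float], k: int) -> float:
    """DCG normalised by the DCG of the same labels sorted in the ideal order."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels of a top-4 list before and after re-ranking.
baseline = [0.0, 1.0, 0.0, 1.0]
reranked = [1.0, 1.0, 0.0, 0.0]
print(ndcg(baseline, 4) < ndcg(reranked, 4))  # True
```

A re-ranking step that moves relevant items toward the top raises NDCG even though the set of recommended items is unchanged, which is the kind of effect the abstract reports.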
|
36 |
Analysis of Agreement Between Two Long Ranked Lists
Sampath, Srinath (January 2013)
No description available.
|
37 |
Efficient and Effective Local Algorithms for Analyzing Massive Graphs
Wu, Yubao (31 May 2016)
No description available.
|
38 |
Semantically-enabled stream processing and complex event processing over RDF graph streams / Traitement de flux sémantiquement activé et traitement d'évènements complexes sur des flux de graphe RDF
Gillani, Syed (04 November 2016)
French abstract not provided by the author. / There is a paradigm shift in the nature of today's data and in the means of processing it: data used to be mostly static, stored in large databases to be queried. Today, with the advent of new applications and means of collecting data, most applications on the Web and in enterprises produce data continuously in the form of streams. Users of these applications therefore expect to process large volumes of data and obtain fresh results with low latency. This has led to the introduction of Data Stream Management Systems (DSMSs) and the Complex Event Processing (CEP) paradigm, each with distinct aims: DSMSs are mostly employed to process traditional query operators (mostly stateless), while CEP systems focus on temporal pattern matching (stateful operators) to detect changes in the data that can be thought of as events. In the past decade or so, a number of scalable and performance-intensive DSMSs and CEP systems have been proposed. Most of them, however, are based on relational data models, which raises the question of support for heterogeneous data sources, i.e., the variety of the data. Work on RDF stream processing (RSP) systems partly addresses the challenge of variety by promoting the RDF data model. Nonetheless, challenges such as volume and velocity are overlooked by existing approaches. These challenges require customised optimisations that treat RDF as a first-class citizen and scale the process of continuous graph pattern matching. To gain insights into these problems, this thesis focuses on developing scalable RDF graph stream processing and semantically-enabled CEP systems (i.e., Semantic Complex Event Processing, SCEP). In addition to our optimised algorithmic and data structure methodologies, we also contribute to the design of a new query language for SCEP. Our contributions in these two fields are as follows:
• RDF Graph Stream Processing. We first propose an RDF graph stream model, where each data item/event within a stream consists of an RDF graph (a set of RDF triples). Second, we implement customised indexing techniques and data structures to continuously process RDF graph streams in an incremental manner.
• Semantic Complex Event Processing. We extend the idea of RDF graph stream processing to enable SCEP over such RDF graph streams, i.e., temporal pattern matching. Our first contribution in this context is a new query language that encompasses the RDF graph stream model and employs a set of expressive temporal operators such as sequencing, Kleene-+, negation, optional, conjunction, and disjunction, together with event selection strategies. Based on this, we implement a scalable system that employs a non-deterministic finite automata model to evaluate these operators in an optimised manner.
We leverage techniques from diverse fields, such as relational query optimisation, incremental query processing, and sensor and social networks, in order to solve real-world problems. We have applied our proposed techniques to a wide range of real-world and synthetic datasets to extract knowledge from RDF-structured data in motion. Our experimental evaluations confirm our theoretical insights and demonstrate the viability of our proposed methods.
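The role of a non-deterministic finite automaton in evaluating sequencing and Kleene-+ operators can be sketched in a few lines. The sketch assumes events are reduced to plain type labels rather than RDF graphs, uses strict contiguity (no events may be skipped inside a match), and omits negation, optional, conjunction, disjunction, and event selection strategies; the function name, pattern encoding, and stream are hypothetical and not the thesis's query language or system.

```python
from typing import List, Set, Tuple

Step = Tuple[str, bool]  # (event type, True if the step is a Kleene-+ step)


def match_ends(pattern: List[Step], stream: List[str]) -> List[int]:
    """Return the stream positions at which a contiguous match of the pattern ends."""
    n = len(pattern)
    active: Set[int] = {0}  # NFA states: state i means the first i steps are matched
    ends: List[int] = []
    for i, event in enumerate(stream):
        nxt: Set[int] = {0}  # a new partial match may start at any event
        for state in active:
            # Advance to the next step if the event has the expected type.
            if state < n and pattern[state][0] == event:
                nxt.add(state + 1)
            # Stay in place if the step just matched is a Kleene-+ of this type.
            if state > 0 and pattern[state - 1][1] and pattern[state - 1][0] == event:
                nxt.add(state)
        if n in nxt:
            ends.append(i)
        active = nxt
    return ends


# SEQ(A, B+, C): an A, followed by one or more Bs, followed by a C.
pattern = [("A", False), ("B", True), ("C", False)]
stream = ["A", "B", "B", "C", "A", "C", "A", "B", "C"]
print(match_ends(pattern, stream))  # [3, 8]
```

Tracking a set of active states is what makes the automaton non-deterministic: several partial matches that started at different events are followed in parallel as the stream advances.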
|
39 |
Considering User Intention in Differential Graph Queries
Vasilyeva, Elena; Thiele, Maik; Bornhövd, Christof; Lehner, Wolfgang (30 November 2020)
Empty answers are a major problem when processing pattern matching queries in graph databases. In particular, there can be multiple reasons why a query fails. To support users in such situations, differential queries can be used, which deliver the missing parts of a graph query. Multiple heuristics have been proposed for differential queries to reduce the search space. Although they succeed in improving performance, they can discard query subgraphs relevant to a user. To address this issue, the authors extend the concept of differential queries and introduce top-k differential queries, which calculate the ranking based on users' preferences and significantly support the users' understanding of query database management systems. A user assigns relevance weights to elements of a graph query; these weights steer the search and are used for the ranking. In this paper the authors propose different strategies for the selection of relevance weights and their propagation. As a result, the search is modelled along the most relevant paths. The authors evaluate their solution and both strategies on the DBpedia data graph.
|
40 |
Top-k Entity Augmentation using Consistent Set Covering
Eberius, Julian; Thiele, Maik; Braunschweig, Katrin; Lehner, Wolfgang (19 September 2022)
Entity augmentation is a query type in which, given a set of entities and a large corpus of possible data sources, the values of a missing attribute are to be retrieved. State-of-the-art methods return a single result that, to cover all queried entities, is fused from a potentially large set of data sources. We argue that queries on large corpora of heterogeneous sources using information retrieval and automatic schema matching methods cannot easily return a single result that the user can trust, especially if the result is composed from a large number of sources that the user has to verify manually. We therefore propose to process these queries in a top-k fashion, in which the system produces multiple minimal consistent solutions from which the user can choose, to resolve the uncertainty of the data sources and methods used. In this paper, we introduce and formalize the problem of consistent, multi-solution set covering, and present algorithms based on a greedy and a genetic optimization approach. We then apply these algorithms to Web table-based entity augmentation. The publication further includes a Web table corpus with 100M tables, and a Web table retrieval and matching system in which these algorithms are implemented. Our experiments show that the consistency and minimality of the augmentation results can be improved using our set covering approach, without loss of precision or coverage and while producing multiple alternative query results.
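The set covering formulation can be illustrated with the classic greedy heuristic, which repeatedly picks the source covering the most still-uncovered entities. This is only the textbook single-solution baseline: it does not model consistency between sources, the production of multiple alternative solutions, or the genetic approach mentioned in the abstract. The entity and source names are hypothetical.

```python
from typing import Dict, List, Set


def greedy_cover(entities: Set[str], sources: Dict[str, Set[str]]) -> List[str]:
    """Pick sources greedily until every coverable entity is covered."""
    uncovered = set(entities)
    chosen: List[str] = []
    while uncovered:
        # Sorting the source names makes tie-breaking deterministic.
        best = max(
            sorted(sources),
            key=lambda name: len(sources[name] & uncovered),
            default=None,
        )
        if best is None or not sources[best] & uncovered:
            break  # the remaining entities appear in no source
        chosen.append(best)
        uncovered -= sources[best]
    return chosen


# Hypothetical Web tables and the queried entities each one has a value for.
entities = {"Berlin", "Paris", "Rome", "Oslo"}
sources = {
    "table1": {"Berlin", "Paris"},
    "table2": {"Paris", "Rome", "Oslo"},
    "table3": {"Berlin"},
    "table4": {"Oslo"},
}
print(greedy_cover(entities, sources))  # ['table2', 'table1']
```

The result fuses two sources into one answer, which is the single-result behaviour the abstract argues is hard for a user to trust when many sources are involved.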
|