Global ETD Search

11	Secure and Reliable Data Outsourcing in Cloud Computing Cao, Ning 31 July 2012
"The many advantages of cloud computing are increasingly attracting individuals and organizations to outsource their data from local to remote cloud servers. In addition to cloud infrastructure and platform providers, such as Amazon, Google, and Microsoft, more and more cloud application providers are emerging that are dedicated to offering more accessible and user-friendly data storage services to cloud customers. It is a clear trend that cloud data outsourcing is becoming a pervasive service. Along with the widespread enthusiasm for cloud computing, however, concerns about the security of cloud data storage are arising in terms of reliability and privacy, and these concerns are the primary obstacles to the adoption of the cloud. To address these challenging issues, this dissertation explores the problem of secure and reliable data outsourcing in cloud computing. We focus on deploying the most fundamental data services, e.g., data management and data utilization, while considering reliability and privacy assurance. The first part of this dissertation discusses secure and reliable cloud data management to guarantee data correctness and availability, given the difficulty that data are no longer locally possessed by data owners. We design a secure cloud storage service that addresses the reliability issue with near-optimal overall performance. By allowing a third party to perform the public integrity verification, data owners are largely relieved of the onerous work of periodically checking data integrity. To completely free the data owner from the burden of being online after data outsourcing, we propose an exact repair solution so that no metadata needs to be generated on the fly for the repaired data. The second part presents our privacy-preserving data utilization solutions supporting two categories of semantics: keyword search and graph query. To protect data privacy, sensitive data has to be encrypted before outsourcing, which makes traditional data utilization based on plaintext keyword search obsolete. We define and solve the challenging problem of privacy-preserving multi-keyword ranked search over encrypted data in cloud computing. We establish a set of strict privacy requirements for such a secure cloud data utilization system to become a reality. We first propose a basic idea for keyword search based on secure inner product computation, and then give two improved schemes to achieve various stringent privacy requirements in two different threat models. We also investigate some further enhancements of our ranked search mechanism, including supporting more search semantics, i.e., TF × IDF, and dynamic data operations. As a general data structure to describe the relation between entities, the graph has been increasingly used to model complicated structures and schemaless data, such as the personal social network, the relational database, XML documents and chemical compounds. When these data contain sensitive information and need to be encrypted before outsourcing to the cloud, it is a very challenging task to effectively utilize such graph-structured data after encryption. We define and solve the problem of privacy-preserving query over encrypted graph-structured data in cloud computing. By utilizing the principle of filtering-and-verification, we pre-build a feature-based index to provide feature-related information about each encrypted data graph, and then choose the efficient inner product as the pruning tool to carry out the filtering procedure."
cloud computing security privacy preserving cloud storage data reliability data availability keyword search ranked search graph query
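The abstract of entry 11 mentions ranked keyword search built on secure inner product computation. The sketch below is not the dissertation's scheme. It only illustrates, under simplifying assumptions, why a server can rank documents without seeing the plaintext vectors: if the index vector p is stored as M^T p and the query q is sent as M^-1 q, their inner product still equals p . q. The matrix M, the keyword dictionary, the weights and all function names are hypothetical, and a small fixed matrix like this one offers no real security.

```python
# Toy illustration only: a tiny fixed matrix gives no real security.
# All names here (M, M_INV, encrypt_index, ...) are hypothetical.

# Invertible 3x3 "secret key" and its inverse (M * M_INV = identity).
M = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
M_INV = [[1, -1, 1],
         [0, 1, -1],
         [0, 0, 1]]


def mat_vec(matrix, vector):
    """Multiply a matrix by a column vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def encrypt_index(doc_vector):
    """Data owner side: store a document's keyword-weight vector p as M^T p."""
    return mat_vec(transpose(M), doc_vector)


def encrypt_query(query_vector):
    """User side: send the query vector q as M^-1 q."""
    return mat_vec(M_INV, query_vector)


def rank(encrypted_indexes, encrypted_query):
    """Server side: (M^T p) . (M^-1 q) = p . q, so sort by that score."""
    scores = {doc: dot(vec, encrypted_query)
              for doc, vec in encrypted_indexes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Assumed keyword dictionary: ["cloud", "privacy", "graph"]; entries are weights.
documents = {"d1": [2, 0, 1], "d2": [0, 3, 0], "d3": [1, 1, 1]}
encrypted = {doc: encrypt_index(vec) for doc, vec in documents.items()}
trapdoor = encrypt_query([1, 0, 1])  # query for "cloud" and "graph"
ranking = rank(encrypted, trapdoor)
```

The weights could be TF × IDF values instead of the small integers used here; the ranking step would not change.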
12	Benchmark para métodos de consultas por palavras-chave a bancos de dados relacionais / Benchmark for query methods by keywords to relational databases Oliveira Filho, Audir da Costa 21 June 2018
Keyword query techniques have proven very effective on the Web because of their user-friendliness. However, much data is stored in relational databases, and accessing it requires knowledge of a structured query language. For this reason, several works over the last decade have proposed ways of performing keyword queries over relational databases. However, the systems that implement this approach have been validated using ad hoc methods, with databases that may not reflect real-world workloads. The present work proposes a benchmark for evaluating keyword query methods over relational databases, defining a standardized procedure with workloads that are consistent with the real world. This proposal assists in assessing the effectiveness of current and future systems. The results obtained by applying the benchmark suggest that there are still many gaps to be addressed by keyword query techniques.
Keyword search Relational database Benchmark
13	Semantisk eller keywords? : En studie av interna sökfunktioner och användarens upplevelse Strand, Charlotte January 2023
The idea for this study is based on a collaboration with Södra Skogsägarna Ekonomisk Förening, one of Sweden's leading forest industry companies, which wanted to investigate the possibilities of a new internal search function on its public website, primarily with the help of Azure Cognitive Search. Before and in connection with the implementation of the new search function, the study set out to answer the following questions:
• RQ1: How does semantic search differ from keyword search? What are the limitations of semantic search today?
• RQ2: In what ways does the user experience of the new search function, with its semantic feature, differ from that of the old keyword-based search function?
To answer these questions, a literature study was conducted, along with case studies consisting of a survey among the website's visitors and two different user surveys. The literature study aimed to answer RQ1 and to form a knowledge base for the design of the new search function by examining the history of search engines, the difference between a keyword-based and a semantic search function, and how today's smart search functions are expected to develop. The survey included questions about visitors' use of the existing search function and their perception of it. User survey 1 was conducted with a select group of participants. It consisted of a number of tasks to be performed using the existing search function, to get a better picture of the user experience and help answer RQ2. When the new search function was ready for testing, User survey 2 was conducted, in which participants compared the old and the new search functions by performing the same tasks with both solutions open in parallel windows. The study showed that the majority of the survey respondents perceived the old search function as effective enough to satisfy them. User survey 1 suggested that relevant results came too far down the results list, or that no relevant results were obtained at all. After Azure Cognitive Search was implemented with a semantic feature enabled, test participants were able to ask questions in the search box and get answers directly at the top of the results list, which made them prefer the new search function over the old one. The literature study showed that keyword-based search relies on keywords and their occurrence in the searchable index, whereas a semantic search function instead tries to interpret the meaning behind the search term.
semantic search azure cognitive search keyword search user experience search engines Software Engineering Computer Sciences
14	[pt] NOVAS MEDIDAS DE IMPORTÂNCIA DE VÉRTICES PARA APERFEIÇOAR A BUSCA POR PALAVRAS-CHAVE EM GRAFOS RDF / [en] NOVEL NODE IMPORTANCE MEASURES TO IMPROVE KEYWORD SEARCH OVER RDF GRAPHS ELISA SOUZA MENENDEZ 15 April 2019
[en] A key contributor to the success of keyword search systems is a ranking mechanism that considers the importance of the retrieved documents. The notion of importance in graphs is typically computed using centrality measures that depend heavily on the degree of the nodes, such as PageRank. However, in RDF graphs, the notion of importance is not necessarily related to the node degree. Therefore, this thesis addresses two problems: (1) how to define importance measures for RDF graphs; (2) how to use these measures to help compile and rank results of keyword queries over RDF graphs. To solve these problems, the thesis proposes a novel family of measures, called InfoRank, and a keyword search system, called QUIRA, for RDF graphs. The thesis concludes with experiments showing that the proposed solution improves the quality of the results in two keyword search benchmarks.
[en] RDF [en] SPARQL [en] RANKING [en] PAGERANK [en] KEYWORD SEARCH
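The abstract of entry 14 contrasts its proposed measures with degree-dependent centrality such as PageRank. As a point of reference, here is a minimal PageRank sketch using power iteration. It says nothing about how InfoRank is defined, which the abstract does not describe. The function name, the damping factor of 0.85, the iteration count and the three-node example graph are assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges.

    Nodes without outgoing edges spread their rank evenly over all nodes.
    """
    nodes = sorted({node for edge in edges for node in edge})
    outgoing = {node: [] for node in nodes}
    for source, target in edges:
        outgoing[source].append(target)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            targets = outgoing[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: "a" and "b" both link to "c", and "c" links back to "a".
ranks = pagerank([("a", "c"), ("b", "c"), ("c", "a")])
best = max(ranks, key=ranks.get)
```

In this example the node with the most incoming links, "c", receives the highest score, which is the degree dependence the abstract refers to.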
15	Exploration et interrogation de données RDF intégrant de la connaissance métier / Integrating domain knowledge for RDF dataset exploration and interrogation Ouksili, Hanane 21 October 2016
An increasing number of datasets are published on the Web, expressed in languages proposed by the W3C to describe Web data, such as RDF, RDF(S) and OWL. The Web has become an unprecedented source of information available to users and applications, but the meaningful usage of this information source is still a challenge. Querying these data sources requires knowledge of a formal query language such as SPARQL, but above all it suffers from a lack of knowledge about the source itself, which is required in order to target the resources and properties relevant to the specific needs of the application. The work described in this thesis addresses the exploration of RDF data sources. This exploration follows two complementary directions: discovering the themes or topics representing the content of the data source, and supporting an alternative way of querying the data sources by using keywords instead of a query formulated in SPARQL. The proposed exploration approach combines two complementary strategies: thematic exploration and keyword search. Theme discovery from an RDF dataset consists in identifying a set of sub-graphs, not necessarily disjoint, each of which represents a set of semantically related resources defining a theme according to the point of view of the user. These themes can be used to enable a thematic exploration of the data source, where users can target the relevant themes and limit their exploration to the resources composing them. Keyword search is a simple and intuitive way of querying data sources. In the case of RDF datasets, this search raises several problems, such as indexing graph elements, identifying the graph fragments relevant to a specific query, aggregating these fragments to build the query results, and ranking these results. In our work, we address these different problems and propose an approach that takes a keyword query as input and provides a list of sub-graphs, each one representing a candidate result for the query. These sub-graphs are ordered according to their relevance to the query. For both keyword search and theme identification in RDF data sources, we have taken external knowledge into account in order to capture the users' needs, or to bridge the gap between the concepts invoked in a query and those of the data source. This external knowledge can be domain knowledge that refines the user's need expressed by a query, or knowledge that refines the definition of themes. In our work, we have proposed a formalization of this external knowledge and introduced the notion of pattern to this end. These patterns represent equivalences between properties and paths in the dataset. They are evaluated and integrated into the exploration process to improve the quality of the results.
Patterns RDF(S)/OWL data Semantic Web Data Exploration Theme discovery Keyword search 005.75
16	[en] A KEYWORD-BASED QUERY PROCESSING METHOD FOR DATASETS WITH SCHEMAS / [pt] MÉTODO PARA O PROCESSAMENTO DE CONSULTAS POR PALAVRAS-CHAVES PARA BASES DE DADOS COM ESQUEMAS GRETTEL MONTEAGUDO GARCÍA 23 June 2020
[en] Users currently expect to query data in a Google-like style, by simply typing some terms, called keywords, and leaving it to the system to retrieve the data that best match the set of keywords. The scenario is quite different in database management systems, where users need to know sophisticated query languages to retrieve data, and in database applications, where the user interfaces are designed as a stack of pages with numerous boxes that the user must fill with search parameters. This thesis describes an algorithm and a framework designed to support keyword-based queries for datasets with schema, specifically RDF datasets and relational databases. The algorithm first translates a keyword-based query into an abstract query, and then compiles the abstract query into a SPARQL or a SQL query such that each result of the SPARQL (resp. SQL) query is an answer for the keyword-based query. It explores the schema to avoid user intervention during the translation process and offers a feedback mechanism to generate new answers. The thesis concludes with experiments over the Mondial, IMDb, and Musicbrainz databases. The proposed translation algorithm achieves satisfactory results and good performance for the benchmarks. The experiments also compare the results and performance obtained with the RDF and the relational alternatives.
[en] RDF [en] SQL [en] KEYWORD SEARCH [en] STEINER TREE [en] SPARQL
17	[pt] BUSCA POR PALAVRAS-CHAVE SOBRE GRAFOS RDF FEDERADOS EXPLORANDO SEUS ESQUEMAS / [en] KEYWORD SEARCH OVER FEDERATED RDF GRAPHS BY EXPLORING THEIR SCHEMAS YENIER TORRES IZQUIERDO 28 July 2017
[en] The Resource Description Framework (RDF) was adopted as a W3C recommendation in 1999 and today is a standard for exchanging data on the Web. Indeed, a large amount of data has been converted to RDF, often as multiple datasets physically distributed over different locations. The SPARQL Protocol and RDF Query Language (SPARQL) was officially introduced in 2008 to retrieve RDF data and provide endpoints to query distributed sources. An alternative way to access RDF datasets is to use keyword-based queries, an area that has been extensively researched, with a recent focus on Web content. This dissertation describes a strategy to compile keyword-based queries into federated SPARQL queries over distributed RDF datasets, under the assumption that each RDF dataset has a schema and that the federation has a mediated schema. The compilation process of the federated SPARQL query is explained in detail, including how to compute a set of external joins between the local subqueries, how to combine, with the help of UNION clauses, the results of local queries that have no external joins between them, and how to construct the TARGET clause, according to the structure of the WHERE clause. Finally, the dissertation covers experiments with real-world data to validate the implementation.
[en] MEDIATED SCHEMA [en] RDF [en] LINKED DATA [en] SPARQL [en] FEDERATED QUERY [en] KEYWORD SEARCH
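The abstract of entry 17 describes combining, with UNION clauses, the results of local subqueries that share no joins. The sketch below shows one way such a combination could be assembled as a string, using the SERVICE keyword from SPARQL 1.1 Federated Query. It is not the dissertation's compiler: the endpoint URLs, the graph patterns and the function names are hypothetical, and the TARGET clause mentioned in the abstract is not modelled (a plain SELECT is used instead).

```python
def service_block(endpoint, pattern):
    """Wrap a local graph pattern so that it is evaluated at one endpoint."""
    return "{ SERVICE <%s> { %s } }" % (endpoint, pattern)


def federated_union(variables, local_subqueries):
    """Combine join-free local subqueries into one SELECT query with UNION.

    local_subqueries is a list of (endpoint_url, graph_pattern) pairs.
    """
    blocks = [service_block(endpoint, pattern)
              for endpoint, pattern in local_subqueries]
    body = "\n  UNION\n  ".join(blocks)
    return "SELECT %s\nWHERE {\n  %s\n}" % (" ".join(variables), body)


# Hypothetical endpoints; both subqueries look up rdfs:label values.
LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"
subqueries = [
    ("http://example.org/films/sparql", "?s %s ?label" % LABEL),
    ("http://example.org/music/sparql", "?s %s ?label" % LABEL),
]
query = federated_union(["?s", "?label"], subqueries)
```

Subqueries that do share variables would instead be placed in the same group pattern so that the endpoint results are joined rather than unioned.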
18	Leyline : a provenance-based desktop search system using graphical sketchpad user interface Ghorashi, Seyed Soroush 07 December 2011
While there are powerful keyword search systems that index all kinds of resources including emails and web pages, people have trouble recalling semantic facts such as the name, location, edit dates and keywords that uniquely identify resources in their personal repositories. Reusing information exacerbates this problem. A rarely used approach is to leverage episodic memory of file provenance. Provenance is traditionally defined as "the history of ownership of a valued object". In terms of documents, we consider not only the ownership, but also the operations performed on the document, especially those that relate it to other people, events, or resources. This thesis investigates the potential advantages of using provenance data in desktop search, and consists of two manuscripts. First, a numerical analysis using field data from a longitudinal study shows that provenance information can effectively be used to identify files and resources in realistic repositories. We introduce the Leyline, the first provenance-based search system that supports dynamic relations between files and resources such as copy/paste, save as, and file rename. The Leyline allows users to search by drawing search queries as graphs in a sketchpad. The Leyline overlays provenance information that may help users identify targets or explore information flow. A limited controlled experiment showed that this approach is feasible in terms of time and effort. Second, we explore the design of the Leyline and compare it to previous provenance-based desktop search systems, including their underlying assumptions and focus, search coverage and flexibility, and features and limitations.
Graduation date: 2012
Provenance Documents User Interface Keyword Search Information Search and Retrieval Query formulation Search process File Organization Desktop Search User interaction styles User-centered design
19	[pt] CONTRIBUIÇÕES AO PROBLEMA DE BUSCA POR PALAVRAS-CHAVE EM CONJUNTOS DE DADOS E TRAJETÓRIAS SEMÂNTICAS BASEADOS NO RESOURCE DESCRIPTION FRAMEWORK / [en] CONTRIBUTIONS TO THE PROBLEM OF KEYWORD SEARCH OVER DATASETS AND SEMANTIC TRAJECTORIES BASED ON THE RESOURCE DESCRIPTION FRAMEWORK YENIER TORRES IZQUIERDO 18 May 2021
[en] Keyword search provides an easy-to-use interface for retrieving information. This thesis contributes to the problems of keyword search over schema-less datasets and semantic trajectories based on RDF. To address the problem of keyword search over schema-less RDF datasets, this thesis introduces an algorithm that automatically translates a user-specified keyword-based query K into a SPARQL query Q such that the answers Q returns are also answers for K. The algorithm does not rely on an RDF schema; instead, it synthesizes SPARQL queries by exploring the similarity between the property domains and ranges and the class instance sets observed in the RDF dataset. It estimates set similarity based on set synopses, which can be efficiently precomputed in a single pass over the RDF dataset. The thesis includes two sets of experiments with an implementation of the algorithm. The first set of experiments shows that the implementation outperforms a baseline RDF keyword search tool that explores the RDF schema, while the second set indicates that the implementation performs better than the state-of-the-art TSA+BM25 and TSA+VDP keyword search systems over RDF datasets, which are based on the virtual documents approach. Finally, the thesis also computes the effectiveness of the proposed algorithm using a metric based on the concept of graph relevance. The second problem addressed in this thesis is keyword search over RDF semantic trajectories. Stop-and-move semantic trajectories are segmented trajectories in which the stops and moves of a moving object are semantically enriched with additional data. A query language for semantic trajectory datasets has to include selectors for stops or moves based on their enrichments, and sequence expressions that define how to match the results of selectors with the sequence the semantic trajectory defines. The thesis first proposes a formal framework to define semantic trajectories and introduces stop-and-move sequence expressions, with well-defined syntax and semantics, which act as an expressive query language for semantic trajectories. Then, it describes a concrete semantic trajectory model in RDF, defines SPARQL stop-and-move sequence expressions, and discusses strategies to compile such expressions into SPARQL queries. Next, the thesis specifies user-friendly keyword search expressions over semantic trajectories, based on the use of keywords to specify stop and move queries and the adoption of terms with predefined semantics to compose sequence expressions. It then shows how to compile such keyword search expressions into SPARQL queries using predefined patterns. Finally, it provides a proof-of-concept experiment over a semantic trajectory dataset constructed with user-generated content from Flickr, combined with Wikipedia data.
[en] SPARQL [en] SEQUENCE EXPRESSIONS [en] KMV-SYNOPSES [en] RDF GRAPH [en] KEYWORD SEARCH [en] SEMANTIC TRAJECTORIES
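The abstract of entry 19 says set similarity is estimated from synopses precomputed in one pass, and its keywords name KMV synopses. The sketch below shows the general k-minimum-values idea: hash every element to a number in [0, 1), keep the k smallest hashes per set, and estimate Jaccard similarity from the k smallest hashes of the union. The exact estimator used in the thesis is not stated in the abstract, so this is an assumption, as are the function names, the SHA-1 based hash and the value k = 64.

```python
import hashlib
import heapq


def unit_hash(item):
    """Map an item to a pseudo-random number in [0, 1) using SHA-1."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2 ** 64


def kmv_synopsis(items, k):
    """Keep the k smallest distinct hash values seen while reading the items."""
    return heapq.nsmallest(k, {unit_hash(item) for item in items})


def estimate_jaccard(synopsis_a, synopsis_b, k):
    """Estimate |A and B| / |A or B| from two KMV synopses.

    Among the k smallest hashes of the union, count those in both synopses.
    """
    union_k = heapq.nsmallest(k, set(synopsis_a) | set(synopsis_b))
    if not union_k:
        return 0.0
    in_both = set(synopsis_a) & set(synopsis_b)
    return sum(1 for value in union_k if value in in_both) / len(union_k)


# Hypothetical instance sets: 10 elements each, 5 shared, 15 distinct overall.
set_a = kmv_synopsis(range(1, 11), 64)
set_b = kmv_synopsis(range(6, 16), 64)
similarity = estimate_jaccard(set_a, set_b, 64)
```

When a set has fewer than k elements, as in this example, the synopsis holds every hash and the estimate is exact; for larger sets it becomes an approximation whose accuracy grows with k.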
