321 |
En tesaurus som ledsagare : En jämförande studie av tre sökstrategiers inverkan på återvinningsresultatet i en bibliografisk databas. / The thesaurus as a companion : A comparative study of three search strategies and their influence on information retrieval results in a bibliographic database. Hagberg, Lena; Müntzing, Johanna. January 2006 (has links)
This Master’s thesis is a comparative study of the information retrieval results of three distinct search strategies under simulated automatic query expansion in a bibliographic database. Our purpose is to investigate which of the search strategies scores the highest precision and to what extent the same relevant documents are retrieved (overlap). A thesaurus attached to the database is used to select appropriate descriptors for the baseline query formulations, which are subsequently expanded with hierarchical relations. The search strategies are s1: a baseline query with two or three descriptors; s2: the baseline descriptors combined with at least one Narrower Term; s3: the baseline descriptors combined with Narrower Terms and at least one Broader Term. A Document Cutoff Value of 15 is used, and only the 15 highest-ranked documents are judged for relevance. The measures used are precision for effectiveness and Jaccard's index for overlap. In terms of precision, the results reveal that s1 scores the highest value (84.8% on average), with s2 and s3 in decreasing order (81.94% and 61.41% on average, respectively). The overlap varies greatly by topic; on average it is 78.81% between s1 and s2, 58.48% between s2 and s3, and 40.41% between s3 and s1. In short, average precision decreases with expansion, and so does average overlap. The use of a thesaurus in the applied strategy of automatic query expansion is not recommended in this specific database if the aim is to increase precision. However, in single searches structured like s1, the thesaurus can assist in the selection of specific search terms. / Thesis level: D
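As a minimal illustration of the two measures used in this study (this sketch is not from the thesis; the document identifiers and result lists are invented), precision at a Document Cutoff Value of k and Jaccard's index over the relevant documents retrieved by two strategies can be computed as:

```python
def precision_at_k(retrieved, relevant, k=15):
    """Fraction of the top-k retrieved documents that are judged relevant."""
    top_k = retrieved[:k]
    return sum(1 for doc in top_k if doc in relevant) / len(top_k)

def jaccard_index(a, b):
    """Jaccard index (overlap) between two sets of relevant retrieved documents."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Invented top-5 result lists for two strategies over one topic
s1 = ["d1", "d2", "d3", "d4", "d5"]
s2 = ["d2", "d3", "d6", "d7", "d1"]
relevant = {"d1", "d2", "d3", "d6"}

print(precision_at_k(s1, relevant, k=5))                      # 0.6
print(jaccard_index(set(s1) & relevant, set(s2) & relevant))  # 0.75
```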
|
322 |
Evaluation des requêtes hybrides basées sur la coordination des services / Evaluation of hybrid queries based on service coordination. Cuevas Vicenttin, Victor. 08 July 2011 (has links)
Recent trends in information technologies have resulted in a massive proliferation of data carried over different kinds of networks, produced in either on-demand or streaming fashion, generated and accessed by a variety of devices, and often involving mobility. This thesis presents an approach for the evaluation of hybrid queries, which integrate the various aspects involved in querying continuous, mobile, and hidden data in dynamic environments. Our approach consists of representing such a hybrid query as a service coordination comprising data and computation services.
A service coordination is specified by a query workflow and additional operator workflows. A query workflow represents an expression built with the operators of our data model. This workflow is constructed from a query specified in our proposed SQL-like query language, HSQL, by a rewriting algorithm based on known results of database theory. Operator workflows compose computation services to enable the evaluation of a particular operator. HYPATIA, a service-based hybrid query processor, implements and validates our approach.
|
323 |
Att sjunga en fråga. En jämförelse av tre Query-by-Humming-system och deras användare. / To sing a question. A comparison of three Query-by-Humming systems and their different users. Eriksson, Madeleine. January 2012 (has links)
The aim of this study was to compare the Query-by-Humming systems Midomi, Musicline and Tunebot with regard to their retrieval effectiveness: whether there were differences between the systems, but also between the user groups of common users, musicians and singers. In a Query-by-Humming system, the user sings a tune that the system then uses to find the right melody. To compare the systems and their users, queries were collected from the different user groups and replayed to the systems. Mean Reciprocal Rank and the Friedman test were used for the comparison. The results showed that the systems did not perform equivalently and that there was no difference between the user groups. The Mean Reciprocal Rank showed that the systems had very different retrieval effectiveness, with Midomi achieving the best result and Musicline the lowest. / Programme: Librarian
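For reference, Mean Reciprocal Rank, the effectiveness measure named above, averages the reciprocal of the rank at which the correct melody is returned for each sung query (a generic sketch, not the study's code; the rank values are invented):

```python
def mean_reciprocal_rank(ranks):
    """MRR over a set of queries: ranks holds the 1-based position of the
    correct melody in each result list, or None if it was not retrieved."""
    reciprocal = [1.0 / r if r is not None else 0.0 for r in ranks]
    return sum(reciprocal) / len(reciprocal)

# Invented ranks of the correct melody for five sung queries
print(mean_reciprocal_rank([1, 2, None, 4, 1]))  # 0.55
```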
|
324 |
"Uma linguagem visual de consulta a banco de dados utilizando o paradigma de fluxo de dados" / A visual database query language using the data flow paradigm. Appel, Ana Paula. 02 April 2003 (has links)
Although much work has been done on query languages for Relational Database Management Systems (RDBMS), there are only two basic paradigms for such languages, represented by the Structured Query Language (SQL) and Query-By-Example (QBE). Although these query languages are computationally complete, they have the disadvantage of not allowing the user any graphical interaction with the information contained in the database. One of the main developments in the database area concerns tools that provide users with a simple understanding of database content and friendly extraction of information.
The language described in this work enables users to create queries graphically by means of data flow diagrams. Besides the graphical query language, this work also presents the supporting tool Data Flow Query Language (DFQL), a query editor/executor built to support the language through a set of graphically represented operators. It executes these diagrams by analyzing the network and generating the corresponding SQL commands to carry out the query. These commands are submitted to the database management system, and the result is displayed or stored according to the query.
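The translation step described here, walking an operator network and emitting nested SQL, can be sketched as follows (a hypothetical simplification; DFQL's actual operator set and SQL generation are more elaborate, and the diagram below is invented):

```python
# Each node of the data flow diagram is a dict: an operator plus its inputs.
def to_sql(node):
    """Recursively translate a data flow diagram into a nested SQL query."""
    op = node["op"]
    if op == "table":
        return node["name"]
    if op == "select":
        return f"SELECT * FROM ({to_sql(node['input'])}) t WHERE {node['pred']}"
    if op == "project":
        cols = ", ".join(node["cols"])
        return f"SELECT {cols} FROM ({to_sql(node['input'])}) t"
    if op == "join":
        return (f"SELECT * FROM ({to_sql(node['left'])}) l "
                f"JOIN ({to_sql(node['right'])}) r ON {node['on']}")
    raise ValueError(f"unknown operator: {op}")

# Invented diagram: project(select(table))
diagram = {"op": "project", "cols": ["name"],
           "input": {"op": "select", "pred": "age > 30",
                     "input": {"op": "table", "name": "employee"}}}
print(to_sql(diagram))
```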
|
325 |
Efficient Query Processing Over Web-Scale RDF Data. Amgad M. Madkour (5930015). 17 January 2019 (has links)
The Semantic Web, or the Web of Data, promotes common data formats for representing structured data and their links over the web. RDF is the de facto standard for semantic data, providing a flexible semi-structured model for describing concepts and relationships. RDF datasets consist of entries (i.e., triples) that range from thousands to billions. The astronomical growth of RDF data calls for scalable RDF management and query processing strategies. This dissertation addresses efficient query processing over web-scale RDF data. The first contribution is WORQ, an online, workload-driven RDF query processing technique. Based on the query workload, reduced sets of intermediate results (or reductions, for short) that are common to specific join patterns are computed in an online fashion. We also introduce an efficient solution for RDF queries with unbound properties. The second contribution is SPARTI, a scalable technique for computing the reductions offline. SPARTI utilizes a partitioning schema, termed SemVP, that enables efficient management of the reductions, and uses a budgeting mechanism with a cost model to determine the worthiness of partitioning. The third contribution is KC, an efficient RDF data management system for the cloud. KC uses generalized filtering, encompassing both exact and approximate set-membership structures, to filter out irrelevant data; it defines a set of common operations and introduces an efficient method for managing and constructing filters. The final contribution is semantic filtering, where data can be reduced based on the spatial, temporal, or ontological aspects of a query. We present a set of encoding techniques and demonstrate how to use semantic filters to reduce irrelevant data in a distributed setting.
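The approximate set-membership structures mentioned for KC can be illustrated with a toy Bloom filter used to prune triples that cannot contribute to a join (a generic sketch over invented data, not KC's implementation):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: approximate membership with no false negatives."""
    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, item):
        # Derive k bit positions from salted SHA-256 digests
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all(self.bits >> pos & 1 for pos in self._positions(item))

# Prune triples whose subject cannot appear in a join (invented data)
join_subjects = BloomFilter()
for s in ["alice", "bob"]:
    join_subjects.add(s)

triples = [("alice", "knows", "bob"), ("carol", "age", "30")]
candidates = [t for t in triples if t[0] in join_subjects]
# "alice" is guaranteed to pass; "carol" is dropped unless a (rare) false positive occurs
```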
|
326 |
Efficient Matrix-aware Relational Query Processing in Big Data Systems. Yongyang Yu (5930462). 03 January 2019 (has links)
In the big data era, the use of large-scale machine learning methods is becoming ubiquitous in data exploration tasks ranging from business intelligence and bioinformatics to self-driving cars. In these domains, a number of queries are composed of various kinds of operators, such as relational operators for preprocessing input data and machine learning models for complex analysis. Usually, these learning methods rely heavily on matrix computations. As a result, it is imperative to develop novel query processing approaches and systems that are aware of big matrix data and the corresponding operators, scale to clusters of hundreds of machines, and leverage distributed memory for high-performance computation. This dissertation introduces and studies several matrix-aware relational query processing strategies, and analyzes and optimizes their performance.
The first contribution of this dissertation is MatFast, a matrix computation system for efficiently processing and optimizing matrix-only queries in a distributed in-memory environment. We introduce a set of heuristic rules to rewrite special features of a matrix query for a smaller memory footprint, and cost models to estimate the sparsity of sparse matrix multiplications and to distribute the matrix data partitions among the compute workers for communication-efficient execution. We implement and test the query processing strategies in an open-source distributed dataflow engine (Apache Spark).
In the second contribution of this dissertation, we extend MatFast to MatRel, where we study how to efficiently process queries that involve both matrix and relational operators. We identify a series of equivalent transformation rules to rewrite a logical plan when both relational and matrix operations are present. We introduce selection, projection, aggregation, and join operators over matrix data, and propose optimizations to reduce computation overhead. We also design a cost model to distribute matrix data among the compute workers for communication-efficient evaluation of relational join operations.
In the third and last contribution of this dissertation, we demonstrate how to leverage MatRel for optimizing complex matrix-aware relational query evaluation pipelines. In particular, we showcase how to efficiently learn model parameters for deep neural networks in various applications with MatRel, e.g., Word2Vec.
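One simple way a cost model can estimate the sparsity of a sparse matrix multiplication is to assume independently placed nonzeros (this estimator is a common textbook choice, not necessarily the dissertation's exact model):

```python
def estimated_density(d_a, d_b, n):
    """Estimated nonzero density of C = A @ B for n x n operands whose
    nonzeros are placed independently with densities d_a and d_b: an
    output cell stays zero only if all n contributing products vanish."""
    return 1.0 - (1.0 - d_a * d_b) ** n

# With 1%-dense operands of size 1000 x 1000, the product is already
# roughly 9.5% dense, which may justify choosing a dense output format.
print(estimated_density(0.01, 0.01, 1000))
```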
|
327 |
Efficient XPath query processing in native XML databases. / CUHK electronic theses & dissertations collection. January 2007 (has links)
As XML (eXtensible Markup Language) becomes a universal medium for data exchange over the Internet, efficient XML query processing is now the focus of considerable research and development activity. This thesis describes work toward efficient XML query evaluation and optimization in native XML databases. / An XML query can be decomposed into a sequence of structural joins (e.g., parent/child and ancestor/descendant) and content joins. Thus, structural join optimization is key to improving join-based evaluation. We optimize structural joins with two orthogonal methods: a partition-based method exploits the spatial properties of XML encodings by projecting them on a plane, and a location-based method improves structural joins by accurately pruning all irrelevant nodes that cannot produce results. / XML indexes are widely studied as a means to evaluate XML queries and, in particular, to accelerate join-based approaches. Index-based approaches outperform join-based approaches (e.g., holistic twig joins) when the queries match the index. Existing XML indexes can only support a small set of XML queries because of the varieties in XML query representations: an XML query may involve only the child axis; both the child axis and branches; or additionally the descendant-or-self axis, but only at the query root. We propose novel indexes to efficiently support a much wider range of XML queries (with /, //, [], *). / A general XML index can itself be sizable, leading to low efficiency. To alleviate this predicament, frequently asked queries can be indexed by a database system; these are referred to as views. Answering queries using materialized views is always cheaper than evaluating over the base data. Traditional techniques solve this problem by considering only a single view. We approach the problem by exploiting the potential relationships of multiple views, which can be used together to answer a given query. Experiments show that significant performance gains can be achieved from multiple views.
Tang, Nan. / "December 2007." / Advisers: Kam-Fai Wong; Jeffrey Xu Yu. / Source: Dissertation Abstracts International, Volume: 69-08, Section: B, page: 4861. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (p. 152-163). / Abstracts in English and Chinese. / School code: 1307.
|
328 |
Modelo de consulta de dados relacionais baseada em contexto para sistemas ubíquos / Model of relational data querying based on context modelling for ubiquitous systems. Maran, Vinícius. January 2016 (has links)
Ubiquitous computing requires that computing be present in environments to assist users in performing their daily tasks efficiently. For this, ubiquitous systems must be context-aware and must adapt their operation to the contexts captured from the environment. Context information can be represented in various ways in computer systems, and recent research shows that ontology-based representation of this information offers important advantages over other solutions, notably a high level of expressiveness and standardized languages for representing ontologies. Domain-specific information, in contrast, is frequently maintained in relational databases. This difference in representation models, with ontologies for context and relational schemas for domain information, leads to a number of problems in the adaptation and distribution of content in ubiquitous architectures, chief among them the difficulty of aligning domain and context information, the difficulty of distributing this information across ubiquitous architectures, and the differences between context modelling and domain modelling (the knowledge about domain objects). This thesis presents a framework for querying across context information and domain information. By applying this framework, contextualized information retrieval becomes possible, using the expressiveness required for context modelling through ontologies while reusing relational schemas previously defined and used by information systems. To evaluate the framework, it was applied in an environment based on the research's motivating scenario, which describes possible situations in which ubiquitous technologies are used.
Through this application it was possible to verify that the proposal accomplished the integration of context and domain and made it possible to extend the filtering of relational queries.
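The kind of contextualized filtering such a framework enables can be pictured as appending context-derived predicates to a base relational query (the rule table and query below are invented for illustration; the actual framework derives such constraints from ontologies):

```python
# Hypothetical mapping from inferred context facts to relational predicates
CONTEXT_RULES = {
    ("location", "meeting_room"): "r.noise_level = 'silent'",
    ("activity", "driving"): "r.modality = 'audio'",
}

def contextualize(base_sql, context):
    """Extend a relational query with predicates derived from context facts."""
    predicates = [CONTEXT_RULES[fact] for fact in context if fact in CONTEXT_RULES]
    if not predicates:
        return base_sql
    return base_sql + " AND " + " AND ".join(predicates)

query = contextualize("SELECT * FROM resources r WHERE r.topic = 'news'",
                      [("location", "meeting_room")])
print(query)
# SELECT * FROM resources r WHERE r.topic = 'news' AND r.noise_level = 'silent'
```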
|
329 |
Estudo sobre o impacto da adição de vocabulários estruturados da área de ciências da saúde no Currículo Lattes / A study on the impact of adding structured vocabularies from the health sciences to the Lattes Curriculum. Araújo, Charles Henrique de. January 2016 (has links)
Searching for information in the databases of institutions that hold large volumes of data increasingly demands more efficient processes. Problems of spelling, language, synonymy, abbreviation of terms, and lack of term standardization, both in the search arguments and in the indexing of documents, directly affect the results. This study therefore aimed to evaluate the impact of adding structured vocabularies from the health sciences to the Lattes Curriculum database on the retrieval of similar profiles of researchers in the Biological Sciences and Health Sciences, using data mining techniques, query expansion, vector query models, and a trigram-matching algorithm. Keywords of published articles registered in the Lattes Curriculum were cross-referenced with the terms of the Medical Subject Headings (MeSH) and the Health Sciences Descriptors (DeCS), and query results using the original keywords were compared with those obtained after adding the terms produced by the query expansion process. The results show that the methodology adopted in this study can qualitatively enlarge the set of retrieved profiles and can thereby contribute to improving the information systems of the National Council for Scientific and Technological Development (CNPq).
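Trigram matching, one of the techniques listed above, scores terms by the overlap of their character trigrams, which tolerates spelling and inflection variants (a generic sketch with invented vocabulary entries, not the study's implementation):

```python
def trigrams(term):
    """Character trigrams of a padded, lowercased term."""
    padded = f"  {term.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def trigram_similarity(a, b):
    """Jaccard similarity over trigram sets."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

# Match a free-text keyword against controlled-vocabulary terms
vocabulary = ["Neoplasms", "Neoplasm Staging", "Nephrology"]
keyword = "neoplasm"
best = max(vocabulary, key=lambda term: trigram_similarity(keyword, term))
print(best)  # Neoplasms
```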
|
330 |
Relevance feedback-based optimization of search queries for patents. Cheng, Sijin. January 2019 (has links)
In this project, we design a search query optimization system based on the user's relevance feedback, generating customized query strings for existing patent alerts. First, the Rocchio algorithm is used to generate a search string by analyzing the characteristics of related and unrelated patents. Then a collaborative filtering recommendation algorithm is used to rank the query results; it considers previous relevance feedback and patent features, instead of only the similarity between the query and the patents as in the traditional method. To further explore the performance of the optimization system, we design and conduct a series of evaluation experiments using TF-IDF as a baseline method. The experiments show that, with the generated search strings, the proportion of unrelated patents in the search results is significantly reduced over time: in four months, the precision of the retrieved results improves from 53.5% to 72%. Moreover, the ranking performance of our method is better than the baseline's. In terms of precision, the top 10 of the recommendation algorithm is about 5 percentage points higher than the baseline method, and the top 20 is about 7.5 percentage points higher. We conclude that the proposed approach can effectively optimize patent search results by learning from relevance feedback.
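The Rocchio step mentioned above moves the query vector toward the centroid of patents marked relevant and away from the centroid of those marked unrelated; a minimal sketch with invented term weights (the project's actual weighting scheme is not shown here):

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio update over term-weight dicts (term -> weight)."""
    terms = set(query)
    for doc in relevant + nonrelevant:
        terms |= set(doc)

    def centroid(docs, term):
        return sum(d.get(term, 0.0) for d in docs) / len(docs) if docs else 0.0

    updated = {}
    for t in terms:
        w = (alpha * query.get(t, 0.0)
             + beta * centroid(relevant, t)
             - gamma * centroid(nonrelevant, t))
        if w > 0:  # negative weights are commonly clamped to zero
            updated[t] = w
    return updated

q = rocchio({"battery": 1.0},
            relevant=[{"battery": 0.8, "anode": 0.6}],
            nonrelevant=[{"battery": 0.2, "toy": 0.9}])
print(sorted(q))  # ['anode', 'battery']
```

Terms that appear only in unrelated patents (here, "toy") end up with negative weight and are dropped, while terms from relevant patents ("anode") enter the expanded query.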
|