591 |
Processamento de consultas baseado em ontologias para sistemas de biodiversidade / Ontology based query processing for biodiversity systems
Vilar, Bruno Siqueira Campos Mendonça, 1982- 15 August 2018 (has links)
Advisor: Claudia Maria Bauzer Medeiros / Master's dissertation - Universidade Estadual de Campinas, Instituto de Computação
Previous issue date: 2009 / Resumo: Biodiversity information systems deal with a heterogeneous set of information provided by different research groups. This diversity may concern the species studied, the structure of the information collected, the place of study, the work methodologies, or the goals of the researchers, among other factors. This heterogeneity of data, users, and procedures hampers information reuse and sharing. This work contributes to reducing this obstacle by improving the process of querying information in biodiversity systems. To this end, it proposes a query expansion mechanism that pre-processes a user (scientist) query, aggregating additional information from ontologies to bring the result closer to the user's intention. This mechanism is based on Web services and was implemented and tested using real data and use cases. / Abstract: Biodiversity information systems need and manage heterogeneous information provided by different research groups. Heterogeneity occurs with respect to the species studied, the structure of the information gathered, the region of study, the work methodologies, or the vocabularies and objectives of the researchers, among other factors. This heterogeneity of data, users and procedures hampers information sharing and reuse. This work contributes to reducing this obstacle by improving the query processing mechanisms in biodiversity systems. Its main contribution is a query expansion mechanism that pre-processes a user (scientist) query, aggregating additional information from ontologies, thereby approximating query results to what is intended by the user. This mechanism is based on Web services and was implemented and tested using real data and use cases. / Master's / Databases / Master in Computer Science
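To make the expansion step concrete, the following Python sketch shows one way a user's query terms could be enriched with subclass labels drawn from an OWL/RDF taxonomy via rdflib. The lookup logic and the use of rdfs:label/rdfs:subClassOf are illustrative assumptions, not the mechanism implemented in the dissertation:

```python
from rdflib import Graph, Literal
from rdflib.namespace import RDFS

def expand_query(terms, ontology_file):
    """Return the original terms plus labels of ontology subclasses,
    so a query for a genus also matches more specific taxa."""
    g = Graph()
    g.parse(ontology_file)  # e.g. an OWL taxonomy of species
    expanded = set(terms)
    for term in terms:
        # classes whose label matches the user's term (ignores language tags)
        for cls in g.subjects(RDFS.label, Literal(term)):
            # labels of direct subclasses become additional query terms
            for sub in g.subjects(RDFS.subClassOf, cls):
                for label in g.objects(sub, RDFS.label):
                    expanded.add(str(label))
    return expanded

# A query for ["Felidae"] might expand to {"Felidae", "Panthera", "Felis"}.
```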
|
592 |
[en] QEEF-G: ADAPTIVE PARALLEL EXECUTION OF ITERATIVE QUERIES / [pt] QEEF-G: EXECUÇÃO PARALELA ADAPTATIVA DE CONSULTAS ITERATIVAS
VINICIUS FONTES VIEIRA DA SILVA 25 April 2007 (has links)
[pt] Traditional parallel query processing uses computing nodes to reduce query processing time. With the emergence of computational grids, thousands of nodes can be used, challenging current query processing techniques to offer massive support for parallelism in an environment whose conditions change at every instant. In addition, the scientific applications executed in this environment present new data-processing characteristics that must be integrated into a system developed for it. In this work we present the parallel query processing system of CoDIMS-G and its new Orbit operator, developed to support the evaluation of iterative queries. In this execution model, tuples are continuously evaluated by a parallel fragment of the execution plan. The work includes the development of the query processing system and a new scheduling algorithm that considers network variations and the throughput of each node, allowing the system to adapt constantly to changes in the environment. / [en] Traditional parallel query processing uses multiple computing nodes to reduce query response time. Within a Grid computing context, the availability of thousands of nodes challenges current parallel query processing techniques to support massive parallelism under constantly varying environment conditions. In addition, scientific applications running on Grids offer new data-processing characteristics that should be integrated into such a framework. In this work we present the CoDIMS-G parallel query processing system with a full-fledged new query execution operator named Orbit. Orbit is designed for evaluating massive iterative data processing: tuples in Orbit iterate over a parallelized fragment of the query execution plan. This work includes the development of the query processing system and a new scheduling algorithm that considers variations in the network and the throughput of each node, permitting the system to adapt constantly to changes in the environment.
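The scheduling idea sketched in this abstract, sending the next batch of tuples to the node expected to finish it first given measured throughput and network delay, can be illustrated with a short greedy dispatcher. This is a minimal sketch under assumed node attributes (tuples_per_s, net_delay_s), not the actual CoDIMS-G algorithm:

```python
import heapq

def schedule(batches, nodes):
    """Dispatch each tuple batch to the node expected to finish it first,
    based on per-node throughput and network delay."""
    rates = {n["name"]: n["tuples_per_s"] for n in nodes}   # assumed metrics
    delays = {n["name"]: n["net_delay_s"] for n in nodes}
    ready = [(0.0, name) for name in rates]                 # (est. finish time, node)
    heapq.heapify(ready)
    plan = []
    for batch in batches:
        finish, name = heapq.heappop(ready)
        cost = delays[name] + len(batch) / rates[name]
        heapq.heappush(ready, (finish + cost, name))        # node is busy until then
        plan.append((name, batch))
    return plan

# An adaptive scheduler would additionally refresh rates/delays from node
# reports at run-time, so slow or distant nodes receive fewer batches.
```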
|
593 |
Efektivní vyhledávání ve videu pomocí komplexních skic a explorace založené na sémantických deskriptorech / Efficient video retrieval using complex sketches and exploration based on semantic descriptors
Blažek, Adam January 2016 (has links)
This thesis focuses on novel video retrieval scenarios. In particular, we aim at the Known-Item Search scenario, wherein users search for a short video segment known either visually or from a textual description. The scenario assumes that no ideal query example is available. Our former known-item search tool relying on color feature signatures is extended with major enhancements. Namely, we introduce a multi-modal sketching tool, exploration of video content with semantic descriptors derived from deep convolutional networks, new browsing/visualization methods, and two orthogonal approaches for textual search. The proposed approaches are embodied in our video retrieval tool, the Enhanced Sketch-based Video Browser (ESBVB). To evaluate ESBVB's performance, we participated in international competitions comparing our tool with state-of-the-art approaches; repeatedly, our tool outperformed the other methods. Furthermore, our user study shows that even novice users can effectively employ ESBVB's capabilities to search and browse known video clips.
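As an illustration of retrieval with semantic descriptors of the kind ESBVB derives from deep convolutional networks, a keyframe ranking step can be as simple as cosine similarity in descriptor space. This sketch assumes descriptors have already been extracted into a NumPy matrix and is not taken from the tool itself:

```python
import numpy as np

def rank_keyframes(query_vec, frame_vecs):
    """Rank video keyframes by cosine similarity between a query descriptor
    and per-frame deep-CNN descriptors (rows of frame_vecs)."""
    q = query_vec / np.linalg.norm(query_vec)
    f = frame_vecs / np.linalg.norm(frame_vecs, axis=1, keepdims=True)
    scores = f @ q
    return np.argsort(-scores)  # indices of best-matching frames first
```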
|
594 |
Using Wikipedia Knowledge and Query Types in a New Indexing Approach for Web Search Engines
Al-Akashi, Falah Hassan Ali January 2014 (has links)
The Web comprises a vast quantity of text. Modern search engines struggle to index it independently of the structure of queries and the type of Web data, and commonly rely on indexing based on the Web's graph structure to identify high-quality relevant pages. However, despite the apparently widespread use of these algorithms, Web indexing based on human feedback and document content remains controversial. Many fundamental questions need to be addressed, including: How many types of domains/websites are there on the Web? What type of data does each type of domain hold? For each type, which segments/HTML fields in the documents are most useful? What are the relationships between the segments? How can web content be indexed efficiently across all forms of document configuration? Our investigation of these questions has led to a novel way of using Wikipedia to find the relationships between query structures and document configurations throughout the document indexing process, and to use them to build an efficient index that allows fast indexing and searching and optimizes the retrieval of highly relevant results. We consider the top page on the ranked list to be highly important in determining the type of a query. Our aim is to design a powerful search engine with a strong focus on making the first page highly relevant to the user, and on retrieving other pages based on that first page. By processing the user query against the Wikipedia index and determining the query's type, our approach can trace the path of a query in our index and retrieve type-specific results.
We use two kinds of data to increase the relevancy and efficiency of the ranked results: offline and real-time. Traditional search engines find it difficult to use these two kinds of data together, because building a real-time index from social data and integrating it with the index for the offline data is difficult in a traditional distributed index.
As a source of offline data, we use data from the Text Retrieval Conference (TREC) evaluation campaign. The Web track at TREC offers researchers the chance to investigate different retrieval approaches to web indexing and searching. The crawled offline dataset makes it possible to design powerful search engines that extend current methods, and to evaluate and compare them.
We propose a new indexing method based on the structures of the queries and the content of documents. Our search engine uses a core index for offline data and a hash index for real-time data, which leads to improved performance. The TREC Web track evaluation of our experiments showed that our approach can be successfully employed for different types of queries. We evaluated our search engine on different sets of queries from the TREC 2010, 2011, and 2012 Web tracks. Our approach achieved very good results on the TREC 2010 training queries. On the TREC 2011 testing queries, our approach was one of the six best compared to all other approaches (including those that used a very large corpus of 500 million documents), and it was second best when compared to approaches that used only part of the corpus (50 million documents), as ours did. On the TREC 2012 testing queries, our approach was second best compared to all approaches, and first compared only to systems that used the subset of 50 million documents.
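A minimal sketch of the core-plus-real-time index combination described above: both indexes are assumed to map terms to scored postings, and the recency boost is an illustrative assumption rather than a detail from the thesis:

```python
def search(query, core_index, realtime_index, k=10):
    """Merge results from a static (offline) core index with a hash-based
    real-time index; both map term -> list of (doc_id, score)."""
    scores = {}
    for term in query.split():
        for doc, s in core_index.get(term, []):
            scores[doc] = scores.get(doc, 0.0) + s
        for doc, s in realtime_index.get(term, []):
            # real-time hits get a small boost (assumed, not from the thesis)
            scores[doc] = scores.get(doc, 0.0) + 1.2 * s
    return sorted(scores, key=scores.get, reverse=True)[:k]
```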
|
595 |
A Common Programming Interface for Managed Heterogeneous Data Analysis
Luong, Johannes 28 July 2021 (has links)
The widespread success of data analysis in a growing number of application domains has led to the development of a variety of purpose-built data processing systems. Today, many organizations operate whole fleets of different data-related systems. Although this differentiation has good reasons, there is also a growing need to create holistic perspectives that cut across the borders of individual systems. Application experts who want to create such perspectives are confronted with a variety of programming interfaces and data formats, and with the task of combining the available systems in an efficient manner. These issues are generally unrelated to the application domain and require a specialized set of skills. As a consequence, development is slowed down and made more expensive, which stifles exploration and innovation. In addition, the direct use of specialized system interfaces can couple application code to specific processing systems.
In this dissertation, we propose the data processing platform DataCalc, which presents users with a unified, application-oriented programming interface and automatically executes this interface in an efficient manner on a variety of processing systems. DataCalc offers a managed environment for data analyses that enables domain experts to concentrate on their application logic and decouples code from specific processing technology. The basis of this managed processing environment is the high-level, domain-oriented program representation DCIL together with a flexible and extensible cost-based optimization component. In addition to traditional up-front optimization, the optimizer also supports dynamic re-optimization of partially executed DCIL programs. This enables the system to benefit from information that only becomes available during query execution. DataCalc assigns workloads to the available processing systems using a fine-grained task scheduling model to exploit the available resources efficiently.
In the second part of the dissertation, we present a prototypical implementation of the DataCalc platform, which includes connectors for the relational DBMS PostgreSQL, the document store MongoDB, the graph database Neo4j, and the custom-built PyProc processing system. For the evaluation of this prototype we implemented an extended application scenario. Our experiments demonstrate that DataCalc is able to find and execute efficient execution strategies that minimize cross-system data movement. The system achieves much better results than a naive implementation, and it comes close to the performance of a hand-optimized solution. Based on these findings, we conclude that the DataCalc platform architecture provides an excellent environment for cross-domain data analysis on a heterogeneous, federated processing architecture.
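The placement decision that minimizes cross-system data movement can be illustrated with a greedy sketch over a linear operator pipeline. The cost model, the operator fields, and the transfer_cost function are assumptions made for illustration; DataCalc's actual optimizer is cost-based and supports re-optimization, which this sketch omits:

```python
def place_operators(ops, systems, transfer_cost):
    """Greedily place a linear operator pipeline onto processing systems,
    trading estimated compute cost against data-movement cost."""
    placement, prev = [], None
    for op in ops:  # each op: {"name": str, "cost": {system: secs}, "input_mb": float}
        best = min(
            systems,
            key=lambda s: op["cost"][s]
            + (0.0 if s == prev else transfer_cost(op["input_mb"])),
        )
        placement.append((op["name"], best))
        prev = best  # staying on the same system avoids moving the data
    return placement

# transfer_cost could be as simple as: lambda mb: 0.05 * mb  (seconds per MB)
```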
|
596 |
Databázové řešení pro ukládání měřených dat / The database solution for storing measurement data
Holeček, Ivan January 2018
This diploma thesis focuses on the design of a database solution for storing measured data. The theoretical part analyzes the database query language and the Microsoft SQL Server 2017 database management system, and then covers the C# .NET programming environment used for application development. The thesis delivers the database solution for storing measured data, a service console application that saves data into the database, and a user application for creating new measurements, presenting the data, and administering users.
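A minimal sketch of the storage side, here using Python and pyodbc against SQL Server rather than the thesis's C# .NET stack; the table layout and connection string are illustrative assumptions, not the schema from the thesis:

```python
import pyodbc

# Connection string is illustrative; adjust driver, server, and credentials.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=Measurements;Trusted_Connection=yes;"
)
cur = conn.cursor()
# Create a simple table for time-stamped sensor samples if it does not exist.
cur.execute("""
    IF OBJECT_ID('dbo.Sample', 'U') IS NULL
    CREATE TABLE dbo.Sample (
        SampleId   BIGINT IDENTITY PRIMARY KEY,
        SensorId   INT       NOT NULL,
        MeasuredAt DATETIME2 NOT NULL,
        Value      FLOAT     NOT NULL
    )
""")
cur.execute(
    "INSERT INTO dbo.Sample (SensorId, MeasuredAt, Value) VALUES (?, ?, ?)",
    (1, "2018-05-01T12:00:00", 23.4),
)
conn.commit()
```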
|
597 |
Jazyk pro dotazování Java AST / Java AST Query Language
Bílek, Jiří January 2015 (has links)
The purpose of this thesis is to design a Java AST query language and to implement a tool that uses it. The work gives an overview of graph databases and their libraries, with a focus on Neo4j and Titan, and of tools for Java bytecode analysis; the Procyon and BCEL libraries are described in detail. The work includes a proposal of the query language and a detailed description of the tool's implementation, together with a detailed account of how Java entities are stored in the graph database. Finally, the work presents experiments and an evaluation of the time complexity of the library.
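To show the flavor of querying a Java AST stored in a graph database, here is a hedged sketch using the Neo4j Python driver; the node labels (Method, MethodCall) and the CONTAINS relationship are assumed for illustration, not the schema proposed in the thesis:

```python
from neo4j import GraphDatabase

# Illustrative connection details; adjust URI and credentials.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

def find_methods_calling(tx, callee):
    # Methods whose body (at any nesting depth) contains a call to `callee`.
    result = tx.run(
        "MATCH (m:Method)-[:CONTAINS*]->(c:MethodCall {name: $callee}) "
        "RETURN DISTINCT m.name AS caller",
        callee=callee,
    )
    return [record["caller"] for record in result]

with driver.session() as session:
    print(session.execute_read(find_methods_calling, "println"))
```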
|
598 |
Výpočet viditelnosti v 3D bludišti / Visibility Determination in 3D Maze
Petruželka, Jiří January 2014
The purpose of this thesis is to present methods for visibility determination and to design and implement an application that demonstrates visibility determination in a 3D maze.
|
599 |
Concept-Oriented Model and Nested Partially Ordered Sets
Savinov, Alexandr 24 April 2014
The concept-oriented model of data (COM) has recently been defined syntactically by means of the concept-oriented query language (COQL). In this paper we propose a formal embodiment of this model, called nested partially ordered sets (nested posets), and demonstrate how it is connected with its syntactic counterpart. A nested poset is a novel formal construct that can be viewed either as a nested set with a partial order relation established on its elements, or as a conventional poset whose elements can themselves be posets. An element of a nested poset is defined as a couple consisting of one identity tuple and one entity tuple. We formally define the main operations on nested posets and demonstrate their usefulness in solving typical data management and analysis tasks such as logical navigation, constraint propagation, inference, and multidimensional analysis.
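A hedged formal reading of the element definition above, in LaTeX; the notation, and in particular the membership-based order, are assumptions made for illustration rather than the paper's exact formalization:

```latex
% An element of a nested poset is a couple of an identity tuple and an entity tuple:
\[
  e = \langle i, v \rangle, \qquad
  i = (i_1, \dots, i_m), \qquad v = (v_1, \dots, v_n),
\]
% where tuple components may themselves be nested posets, and a partial order
% is established on elements, e.g. by transitive membership (assumed here):
\[
  e \le e' \iff e \in^{*} e'.
\]
```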
|
600 |
Heterogeneity-Aware Placement Strategies for Query Optimization
Karnagel, Tomas 23 May 2017
Computing hardware is changing from systems with homogeneous CPUs to systems with heterogeneous computing units like GPUs, Many Integrated Cores, or FPGAs. This trend is caused by the scaling problems of homogeneous systems, where heat dissipation and energy consumption limit further growth in compute performance. Heterogeneous systems provide differently optimized computing hardware, which allows each operation to be computed on the most appropriate computing unit, resulting in faster execution and lower energy consumption.
For database systems, this is a new opportunity to accelerate query processing, allowing faster and more interactive querying of large amounts of data. However, the current hardware trend is also a challenge, as most database systems do not support heterogeneous computing resources and it is not clear how best to support them. In the past, mainly single operators were ported to different computing units, showing great results but lacking a system-wide application. To support heterogeneous systems efficiently, a systems approach to query processing and query optimization is needed.
In this thesis, we tackle the optimization challenge in detail. As a starting point, we evaluate three different approaches on isolated use cases to assess their advantages and limitations. First, we evaluate a fork-join approach to intra-operator parallelism, where the same operator is executed on multiple computing units at the same time, each execution on a different data partition. Second, we evaluate statically using one computing unit to accelerate one operator, which offers high code-optimization potential due to the static, pre-known usage of hardware and software. Third, we evaluate dynamically placing operators onto computing units, depending on the operator, the available computing hardware, and the given data sizes. We argue that the first and second approaches suffer from multiple overheads or high implementation costs. The third approach, dynamic placement, shows good performance while being highly extensible to different computing units and different operator implementations.
To automate this dynamic approach, we first propose general placement optimization for query processing. This general approach includes runtime estimation of operators on different computing units, as well as two approaches for deciding the actual operator placement according to the estimated runtimes. The two placement approaches are local optimization, which decides the placement at run-time, and global optimization, where the placement is decided at compile-time, allowing a global view for enhanced data sharing. The main limitation of the latter is its high dependency on cardinality estimation for intermediate results, as estimation errors for the cardinalities propagate to the operator runtime estimation and the placement optimization. Therefore, we propose adaptive placement optimization, which makes placement optimization fully independent of cardinality estimation, effectively eliminating the main source of inaccuracy for runtime estimation and placement optimization. Finally, we define an adaptive placement sequence incorporating all of our proposed placement optimization techniques. We implement this sequence as a virtualization layer between the database system and the heterogeneous hardware. Our implementation is based on pre-existing interfaces to the database system and the hardware, allowing non-intrusive integration into existing database systems. We evaluate our techniques using two different database systems and two different OLAP benchmarks, accelerating query processing through heterogeneous execution.
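The contrast between estimated and adaptive placement can be sketched in a few lines: placement is decided per operator at run-time, once the true input cardinality is known, so estimation errors no longer propagate. The runtime-model signature and operator fields are assumptions for illustration, not the thesis's implementation:

```python
def pick_unit(op, units, estimate):
    """Local, adaptive placement: choose the computing unit with the lowest
    estimated runtime for the operator's actual (now known) input size.
    `estimate(op, unit, rows)` is an assumed runtime model."""
    return min(units, key=lambda u: estimate(op, u, op["input_rows"]))

def execute_plan(plan, units, estimate, run):
    """Execute a pipeline operator by operator; each operator's real output
    size becomes the next operator's input size, so placement never depends
    on cardinality estimates (cf. adaptive placement above)."""
    rows = plan[0]["input_rows"]
    log = []
    for op in plan:
        op["input_rows"] = rows                # real, observed cardinality
        unit = pick_unit(op, units, estimate)  # decided just before execution
        rows = run(op, unit)                   # returns true output cardinality
        log.append((op["name"], unit))
    return log
```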
|