Global ETD Search

621	Query Containment Using a DLR ABox Horrocks, Ian, Tessaris, Sergio, Sattler, Ulrike, Tobies, Stephan 20 May 2022 Query containment under constraints is the problem of determining whether the result of one query is contained in the result of another query for every database satisfying a given set of constraints. This problem is of particular importance in information integration and warehousing where, in addition to the constraints derived from the source schemas and the global schema, inter-schema constraints can be used to specify relationships between objects in different schemas. A theoretical framework for tackling this problem using the DLR logic has been established, and in this paper we show how the framework can be extended to a practical decision procedure. The proposed technique is to extend DLR with an ABox (a set of assertions about named individuals and tuples), and to transform query subsumption problems into DLR ABox satisfiability problems. We then show how such problems can be decided, via a reification transformation, using a highly optimised reasoner for the SHIQ description logic. info:eu-repo/classification/ddc/004 ddc:004
622	EFFICIENT LSM SECONDARY INDEXING FOR UPDATE-INTENSIVE WORKLOADS Jaewoo Shin (17069089) 29 September 2023 In recent years, massive amounts of data have been generated from various types of devices or services. For these data, update-intensive workloads where the data update their status periodically and continuously are common. The Log-Structured Merge (LSM, for short) is a widely used indexing technique in various systems, where index structures buffer insert operations in the memory layer and flush them to disk when the data size in memory exceeds a threshold. Despite its notable ability to handle write-intensive (i.e., insert-intensive) workloads, LSM suffers from degraded query performance due to inefficient maintenance of secondary-key indexes under update-intensive workloads. This dissertation focuses on the efficient support of update-intensive workloads for LSM-based indexes. First, the focus is on the optimization of LSM secondary-key indexes and their support for update-intensive workloads. A mechanism to enable the LSM R-tree to handle update-intensive workloads efficiently is introduced. The new LSM indexing structure is termed the LSM RUM-tree, an LSM R-tree with Update Memo. The key insights are to reduce the maintenance cost of the LSM R-tree by leveraging an additional in-memory memo structure and to control the size of the memo so that it fits in memory. In the experiments, the LSM RUM-tree achieves up to 9.6x speedup on update operations and up to 2400x speedup on query operations. Second, the focus is to offer several significant advancements in the context of the LSM RUM-tree. We provide an extended examination of LSM-aware Update Memo (UM) cleaning strategies, elucidating how effectively each strategy reduces UM size and contributes to performance enhancements.
Moreover, to support concurrent activity within the LSM RUM-tree, particularly in multi-threaded/multi-core environments, we introduce concurrency control for the update memo. A novel atomic operation, Compare and If Less than Swap (CILS), is introduced to enable concurrent operations on the Update Memo. Experimental results show a 4.5x improvement in the speed of concurrent update operations compared to existing and baseline implementations. Finally, we present a novel technique designed to improve query processing performance and optimize storage management in any secondary LSM tree. Our proposed approach introduces a new framework and mechanisms aimed at addressing the specific challenges of secondary indexing in the LSM tree, especially in the context of the secondary LSM B+-tree (LSM BUM-tree). Experimental results show that the LSM BUM-tree achieves up to 5.1x speedup on update-intensive workloads and 107x speedup on update and query mixed workloads over existing LSM B+-tree implementations. Data models, storage and indexing Database systems LSM-based index Secondary index Query Processing R-trees B-Tree spatial data processing
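The buffer-and-flush behaviour that the abstract above attributes to LSM indexes can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the dissertation's LSM RUM-tree or BUM-tree: the class name `TinyLSM`, the `flush_threshold` parameter, and the use of in-memory lists as a stand-in for on-disk runs are all hypothetical.

```python
import bisect


class TinyLSM:
    """Toy LSM index: an in-memory buffer plus immutable sorted runs."""

    def __init__(self, flush_threshold=4):
        self.flush_threshold = flush_threshold  # hypothetical size limit
        self.memtable = {}  # in-memory layer: key -> latest value
        self.runs = []      # stand-in for on-disk runs, oldest first

    def put(self, key, value):
        # Inserts and updates only touch the in-memory layer.
        self.memtable[key] = value
        # Flush once the buffer exceeds the threshold.
        if len(self.memtable) > self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffer out as one sorted, immutable run.
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check memory first, then runs newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


index = TinyLSM(flush_threshold=2)
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    index.put(k, v)      # the third insert triggers a flush
index.put("a", 10)       # the update lands in memory; the old entry stays in the run
```

In this sketch an update never rewrites a flushed run, so the stale entry for "a" remains until some later cleanup, and every lookup may have to search several runs. That leftover work is one plausible reading of the maintenance cost the abstract refers to.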
623	SAP HANA distributed in-memory database system: Transaction, session, and metadata management Lehner, Wolfgang, Kwon, Yong Sik, Lee, Juchang, Färber, Franz, Muehle, Michael, Lee, Chulwon, Bensberg, Christian, Lee, Joo Yeon, Lee, Arthur H. 12 January 2023 One of the core principles of the SAP HANA database system is comprehensive support for a distributed query facility. Supporting scale-out scenarios was one of the major design principles of the system from the very beginning. In this paper, we first give an overview of the overall functionality with respect to data allocation, metadata caching and query routing. We then go into more detail on specific topics and explain features and methods not common in traditional disk-based database systems. In summary, the paper provides a comprehensive overview of how distributed query processing in the SAP HANA database achieves the scalability needed to handle large databases and heterogeneous types of workloads. info:eu-repo/classification/ddc/004 ddc:004
624	Query processing on low-energy many-core processors Lehner, Wolfgang, Ungethüm, Annett, Habich, Dirk, Karnagel, Tomas, Asmussen, Nils, Völp, Marcus, Nöthen, Benedikt, Fettweis, Gerhard 12 January 2023 Aside from performance, energy efficiency is an increasing challenge in database systems. To tackle both aspects in an integrated fashion, we pursue a hardware/software co-design approach. To fulfill the energy requirement from the hardware perspective, we utilize a low-energy processor design that allows us to place hundreds to millions of chips on a single board without any thermal restrictions. Furthermore, we address the performance requirement by developing several database-specific instruction set extensions to customize each core, although not every core has all extensions. Therefore, our hardware foundation is a low-energy processor consisting of a high number of heterogeneous cores. In this paper, we introduce our hardware setup on a system level and present several challenges for query processing. Based on these challenges, we describe two implementation concepts and a comparison between these concepts. Finally, we conclude the paper with some lessons learned and an outlook on our upcoming research directions. info:eu-repo/classification/ddc/004 ddc:004
625	Hierarchisches gruppenbasiertes Sampling Rainer, Gemulla, Berthold, Henrike, Lehner, Wolfgang 12 January 2023 In times of increasing database sizes, it is essential to evaluate queries approximately in order to obtain answers quickly. This article introduces several methods for achieving this goal and then focuses on sampling, which quickly yields adequate results by using only a sample of the actual data. If database queries contain join or group-by operations, the accuracy of many sampling methods drops sharply; in particular, small groups are often not recognized. This article is concerned with hierarchical group-based sampling, which combines sampling, grouping and joins. info:eu-repo/classification/ddc/004 ddc:004
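The small-group problem mentioned in the abstract above can be made concrete with a short Python sketch. It contrasts uniform row sampling with plain per-group (stratified) sampling. This is not the hierarchical group-based scheme of the article, whose details the abstract does not give; the function names, the `min_per_group` parameter, and the example data are assumptions made for illustration.

```python
import random
from collections import defaultdict


def uniform_sample(rows, rate, rng):
    """Keep each row independently with probability `rate`."""
    return [row for row in rows if rng.random() < rate]


def per_group_sample(rows, key, rate, min_per_group, rng):
    """Sample within each group, keeping at least `min_per_group` rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    sample = []
    for members in groups.values():
        # Proportional share of the group, but never below the floor.
        k = max(min_per_group, round(rate * len(members)))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


rng = random.Random(0)
# Hypothetical data: one large group and one very small group.
rows = [("big", i) for i in range(10_000)] + [("tiny", i) for i in range(3)]

uniform = uniform_sample(rows, 0.01, rng)
grouped = per_group_sample(rows, lambda r: r[0], 0.01, 1, rng)

groups_in_uniform = {g for g, _ in uniform}
groups_in_grouped = {g for g, _ in grouped}
```

With a 1% uniform sample, each of the three "tiny" rows is kept with probability 0.01, so the group is usually absent from `groups_in_uniform`. The per-group variant always retains it, at the cost of having to know the grouping key before sampling.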
626	Approaches to Creating Fuzzy Concept Lattices and an Application to Bioinformatics Annotations Kandasamy, Meenakshi January 2010 No description available. Computer Science FFCA Ontology Fuzzy Ontology Fuzzy Closure Approach Alpha Cut Fuzzy Query Compare Implication Tnorm Factor Analysis
627	Named Entity Recognition for Search Queries in the Music Domain / Identifiering av namngivna enheter för sökfrågor inom musikdomänen Liljeqvist, Sandra January 2016 This thesis addresses the problem of named entity recognition (NER) in music-related search queries. NER is the task of identifying keywords in text and classifying them into predefined categories. Previous work in the field has mainly focused on longer documents of editorial text. In recent years, however, the application of NER to queries has attracted increased attention. This task is acknowledged to be challenging because queries are short, ungrammatical and contain minimal linguistic context. NER for queries is especially useful for implementing natural language queries in domain-specific search applications. These applications are often backed by a database, where the query format is otherwise restricted to keyword search or the use of a formal query language. In this thesis, two techniques for NER for music-related queries are evaluated: a conditional random field based solution and a probabilistic solution based on context words. As a baseline, the most elementary implementation of NER, commonly applied to editorial text, is used. Both of the evaluated approaches outperform the baseline and demonstrate an overall F1 score of 79.2% and 63.4% respectively. The experimental results show a high precision for the probabilistic approach, and the conditional random field based solution demonstrates an F1 score comparable to previous studies from other domains. Natural Language Processing Information Extraction Named Entity Recognition Search Query Semantics Conditional Random Field Computer Sciences Datavetenskap (datalogi)
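For readers unfamiliar with the metric reported in the record above, the F1 score is the harmonic mean of precision and recall. A minimal Python sketch follows. The abstract reports only the F1 values, not the underlying precision and recall, so the function name and the inputs a reader would pass are purely illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system with high precision but low recall can still end up with a modest F1.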
628	To share or not to share vector registers? Pietrzyk, Johannes, Krause, Alexander, Habich, Dirk, Lehner, Wolfgang 04 June 2024 Query execution techniques in database systems constantly adapt to novel hardware features to achieve high query performance, in particular for analytical queries. In recent years, vectorization based on the Single Instruction Multiple Data parallel paradigm has been established as a state-of-the-art approach to increase single-query performance. However, since concurrent analytical queries running in parallel often access the same columns and perform the same set of vectorized operations, data accesses and computations among different queries may be executed redundantly. Various techniques have already been proposed to avoid such redundancy, ranging from concurrent scans via the construction of materialized views to applying multiple query optimization techniques. Continuing this line of research, in this paper we investigate the opportunity of sharing vector registers among concurrently running queries in analytical scenarios. In particular, our novel sharing approach relies on processing data elements of different queries together within a single vector register. As we are going to show, sharing vector registers to optimize the execution of concurrent analytical queries can be very beneficial in single-threaded as well as multi-threaded environments. Therefore, we demonstrate the feasibility and applicability of such a novel work sharing strategy and thus open up a wide spectrum of future research opportunities. info:eu-repo/classification/ddc/004 ddc:004
629	Service-Oriented Sensor-Actuator Networks Rezgui, Abdelmounaam 09 January 2008 In this dissertation, we propose service-oriented sensor-actuator networks (SOSANETs) as a new paradigm for building the next generation of customizable, open, interoperable sensor-actuator networks. In SOSANETs, nodes expose their capabilities to applications in the form of service profiles. A node's service profile consists of a set of services (i.e., sensing and actuation capabilities) that it provides and the quality of service (QoS) parameters associated with those services (delay, accuracy, freshness, etc.). SOSANETs provide the benefits of both application-specific SANETs and generic SANETs. We first define a query model and an architecture for SOSANETs. The proposed query model offers a simple, uniform query interface whereby applications specify sensing and actuation queries independently from any specific deployment of the underlying SOSANET. We then present μRACER (Reliable Adaptive serviCe-driven Efficient Routing), a routing protocol suite for SOSANETs. μRACER consists of three routing protocols, namely, SARP (Service-Aware Routing Protocol), CARP (Context-Aware Routing Protocol), and TARP (Trust-Aware Routing Protocol). SARP uses an efficient service-aware routing approach that aggressively reduces downstream traffic by translating service profiles into efficient paths. CARP supports QoS by dynamically adapting each node's routing behavior and service profile according to the current context of that node, i.e., the number of pending queries and the number and type of messages to be routed. Finally, TARP achieves high end-to-end reliability through a scalable reputation-based approach in which each node is able to locally estimate the next hop of the most reliable path to the sink. We also propose query optimization techniques that contribute to the efficient execution of queries in SOSANETs.
To evaluate the proposed service-oriented architecture, we implemented TinySOA, a prototype SOSANET built on top of TinyOS with μRACER as its routing mechanism. TinySOA is designed as a set of layers with a loose interaction model that enables several cross-layer optimization options. We conducted an evaluation of TinySOA that included a comparison with TinyDB. The obtained empirical results show that TinySOA achieves significant improvements in many aspects, including energy consumption, scalability, reliability and response time. / Ph. D. Context-aware Routing Cross-Layer Optimization Query Processing Trust Reputation Routing Sensor-Actuator Networks uRACER Service-Oriented Architectures
630	Design and Analysis of Adaptive Fault Tolerant QoS Control Algorithms for Query Processing in Wireless Sensor Networks Speer, Ngoc Anh Phan 02 May 2008 Data sensing and retrieval in WSNs have great applicability in military, environmental, medical, home and commercial applications. In query-based WSNs, a user would issue a query with QoS requirements in terms of reliability and timeliness, and expect a correct response to be returned within the deadline. Satisfying these QoS requirements requires that fault tolerance mechanisms through redundancy be used, which may cause the energy of the system to deplete quickly. This dissertation presents the design and validation of adaptive fault tolerant QoS control algorithms with the objective of achieving the desired quality of service (QoS) requirements and maximizing the system lifetime in query-based WSNs. We analyze the effect of redundancy on the mean time to failure (MTTF) of query-based cluster-structured WSNs and show that an optimal redundancy level exists such that the MTTF of the system is maximized. We develop a hop-by-hop data delivery (HHDD) mechanism and an Adaptive Fault Tolerant Quality of Service Control (AFTQC) algorithm in which we utilize "source" and "path" redundancy with the goal of satisfying application QoS requirements while maximizing the lifetime of WSNs. To deal with network dynamics, we investigate proactive and reactive methods to dynamically collect channel and delay conditions to determine the optimal redundancy level at runtime. AFTQC can adapt to network dynamics that cause changes to the node density, residual energy, sensor failure probability, and radio range due to energy consumption, node failures, and change of node connectivity. Further, AFTQC can deal with software faults, concurrent query processing with distinct QoS requirements, and data aggregation.
We compare our design with a baseline design without redundancy based on acknowledgement for data transmission and geographical routing for relaying packets to demonstrate the feasibility. We validate analytical results with extensive simulation studies. When given QoS requirements of queries in terms of reliability and timeliness, our AFTQC design allows optimal "source" and "path" redundancies to be identified and applied dynamically in response to network dynamics such that not only query QoS requirements are satisfied, as long as adequate resources are available, but also the lifetime of the system is prolonged. / Ph. D. energy conservation redundancy query processing timeliness reliability quality of service fault tolerance wireless sensor networks mean time to failure
