Global ETD Search

1	Adaptive Energy-Control for In-Memory Database Systems Kissinger, Thomas, Habich, Dirk, Lehner, Wolfgang 30 May 2022 The ever-increasing demand for scalable database systems is limited by their energy consumption, which is one of the major challenges in research today. While existing approaches have mainly focused on transaction-oriented disk-based database systems, we investigate and optimize the energy consumption and performance of data-oriented scale-up in-memory database systems, which make heavy use of the main power consumers: processors and main memory. We give an in-depth energy analysis of a current mainstream server system and show that modern processors provide a rich set of energy-control features but cannot control them appropriately because they lack application-specific knowledge. Thus, we propose the Energy-Control Loop (ECL) as a DBMS-integrated approach for adaptive energy control on scale-up in-memory database systems that obeys a query latency limit as a soft constraint and actively optimizes the energy efficiency and performance of the DBMS. The ECL relies on adaptive workload-dependent energy profiles that are continuously maintained at runtime. In our evaluation, we observed energy savings ranging from 20% to 40% for a real-world load profile. info:eu-repo/classification/ddc/004 ddc:004
2	High-Throughput BitPacking Compression Lisa, Nusrat Jahan, Nguyen, Tuan Duy Anh, Habich, Dirk, Kumar, Akash, Lehner, Wolfgang 03 July 2023 In-memory column-store database systems are the state of the art for efficiently supporting analytical applications from a data management perspective. In this kind of database system, lossless lightweight integer compression schemes are crucial to keep the memory footprint as low as possible and to speed up query processing. In this specific compression domain, BitPacking is one of the most frequently applied compression schemes. However, (de)compression should not incur any additional cost at run time; it should be provided transparently without compromising overall system performance. To achieve that, we focus on accelerating BitPacking using Field Programmable Gate Arrays (FPGAs) and outline several FPGA designs for BitPacking in this paper. As our evaluation shows, these designs provide the BitPacking compression scheme with high throughput. info:eu-repo/classification/ddc/004 ddc:004
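As an illustration of the BitPacking scheme named in record 2, the following is a minimal software sketch in Python. It is not the FPGA design from the paper; the function names `bit_width`, `pack`, and `unpack` are hypothetical, and the layout (little-endian, fixed width per value) is an assumption made for this example. The idea is to store each non-negative integer using only as many bits as the largest value in the block requires.

```python
def bit_width(values):
    """Smallest number of bits able to represent every value (at least 1)."""
    return max(1, max(values, default=0).bit_length())


def pack(values, width):
    """Concatenate the low `width` bits of each value into a byte string."""
    buf = 0
    for i, v in enumerate(values):
        if v < 0 or v >> width:
            raise ValueError(f"value {v} does not fit in {width} bits")
        # Place value i at bit offset i * width.
        buf |= v << (i * width)
    nbytes = (len(values) * width + 7) // 8  # round up to whole bytes
    return buf.to_bytes(nbytes, "little")


def unpack(data, width, count):
    """Recover `count` values of `width` bits each from a packed byte string."""
    buf = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(buf >> (i * width)) & mask for i in range(count)]


values = [3, 7, 1, 0, 5, 2]
width = bit_width(values)      # 3 bits suffice because the maximum is 7
packed = pack(values, width)   # 18 bits, stored in 3 bytes
restored = unpack(packed, width, len(values))
```

In this sketch six values that would occupy 24 bytes as 32-bit integers fit into 3 bytes; the width and the value count have to be stored alongside the packed data so that it can be decoded.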
3	Make Larger Vector Register Sizes New Challenges?: Lessons Learned from the Area of Vectorized Lightweight Compression Algorithms Habich, Dirk, Damme, Patrick, Ungethüm, Annett, Lehner, Wolfgang 15 September 2022 Exploiting data properties as well as hardware properties is a core aspect of efficient data management. This holds in particular for the field of in-memory data processing. Aside from increasing main memory capacities, in-memory data processing also benefits from novel processing concepts based on lightweight compressed data. To speed up compression as well as decompression, an active research field deals with specializing these algorithms to hardware features such as vectorization using SIMD instructions. Most of the vectorized implementations have been proposed for 128-bit vector registers. However, hardware vendors continue to increase vector register sizes, and a straightforward transformation to these wider vectors is possible in most cases. Thus, we systematically investigated the impact of different SIMD instruction set extensions with wider vector sizes on the behavior of straightforwardly transformed implementations. In this paper, we describe our evaluation methodology and present selected results of our exhaustive evaluation. In particular, we highlight some challenges and present first approaches to tackle them. info:eu-repo/classification/ddc/004 ddc:004
4	Density-Aware Linear Algebra in a Column-Oriented In-Memory Database System Kernert, David 20 September 2016 Linear algebra operations appear in nearly every application in advanced analytics, machine learning, and various scientific domains. To this day, many data analysts and scientists tend to use statistics software packages or hand-crafted solutions for their analysis. In the era of data deluge, however, external statistics packages and custom analysis programs, which often run on single workstations, are incapable of keeping up with the vast increase in data volume and size. In particular, there is an increasing demand among scientists for large-scale data manipulation, orchestration, and advanced data management capabilities. These are among the key features of a mature relational database management system (DBMS). With the rise of main memory database systems, it has now become feasible to also consider applications that build on linear algebra. This thesis presents a deep integration of linear algebra functionality into an in-memory column-oriented database system. In particular, this work shows that it has become feasible to execute linear algebra queries on large data sets directly in a DBMS-integrated engine (LAPEG), without the need to transfer data and without being restricted by hard disk latencies. From various application examples cited in this work, we deduce a number of requirements that are relevant for a database system that includes linear algebra functionality. Besides the deep integration of matrices and numerical algorithms, these include optimization of expressions, transparent matrix handling, scalability and data parallelism, and data manipulation capabilities. These requirements are addressed by our linear algebra engine. 
In particular, the core contributions of this thesis are as follows. First, we show that the columnar storage layer of an in-memory DBMS allows an easy adoption of efficient sparse matrix data types and algorithms. Furthermore, we show that the execution of linear algebra expressions benefits significantly from different techniques inspired by database technology. In a novel way, we implemented several of these optimization strategies in LAPEG’s optimizer (SpMachO), which uses an advanced density estimation method (SpProdest) to predict the matrix density of intermediate results. Moreover, we present an adaptive matrix data type, AT Matrix, which spares scientists the need to select appropriate matrix representations. The tiled substructure of AT Matrix is exploited by our matrix multiplication to saturate the different sockets of a multicore main-memory platform, reaching a speed-up of up to 6x compared to alternative approaches. Finally, a major part of this thesis is devoted to data manipulation, where we propose a matrix manipulation API and present different mutable matrix types to enable fast insertions and deletions. We conclude that our linear algebra engine is well suited to processing dynamic, large matrix workloads in an optimized way. In particular, the DBMS-integrated LAPEG fills the linear algebra gap and makes columnar in-memory DBMSs attractive as an efficient, scalable ad hoc analysis platform for scientists. in-memory database management systems column-oriented DBMS linear algebra implementation matrix data structures expression optimization ddc:004 rvk:ST 270
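To make the idea of predicting the density of an intermediate matrix product concrete, here is a minimal Python sketch. It is not SpProdest, whose details the abstract does not give; it is a common textbook-style estimate that assumes non-zero entries are placed uniformly and independently. The function name `estimate_product_density` and its parameters are hypothetical. For C = A·B with inner dimension k, an entry of C is zero only if all k elementwise products are zero, so the expected density is 1 - (1 - ρ_A·ρ_B)^k.

```python
def estimate_product_density(rho_a, rho_b, inner_dim):
    """Estimate the density of C = A @ B.

    Assumes (for this sketch only) that non-zeros in A and B are
    uniformly and independently distributed and that products of
    non-zeros never cancel to zero.

    rho_a, rho_b: fraction of non-zero entries in A and B (0.0 to 1.0)
    inner_dim:    shared dimension k (columns of A, rows of B)
    """
    if not (0.0 <= rho_a <= 1.0 and 0.0 <= rho_b <= 1.0):
        raise ValueError("densities must lie in [0, 1]")
    if inner_dim < 0:
        raise ValueError("inner_dim must be non-negative")
    # Probability that a single term A[i, j] * B[j, l] is non-zero.
    p_term = rho_a * rho_b
    # An entry of C is zero only if all inner_dim terms are zero.
    return 1.0 - (1.0 - p_term) ** inner_dim


# Two 1%-dense matrices with inner dimension 10,000.
rho_c = estimate_product_density(0.01, 0.01, 10_000)
```

An optimizer could use such an estimate to decide between a sparse and a dense representation for the intermediate result; real matrices with skewed non-zero distributions would need a more refined estimator than this uniform model.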
5	Überblick und Klassifikation leichtgewichtiger Kompressionsverfahren im Kontext hauptspeicherbasierter Datenbanksysteme [Overview and Classification of Lightweight Compression Methods in the Context of Main-Memory Database Systems] Hildebrandt, Juliana 22 July 2015 In the context of in-memory database systems, lightweight compression algorithms play a decisive role in enabling efficient storage and processing of large volumes of data in main memory. Compared with classical compression techniques such as Huffman coding, lightweight compression algorithms achieve comparable compression rates by incorporating context knowledge, and they allow faster compression and decompression. The variety of lightweight compression algorithms has grown in recent years because incorporating context knowledge offers large optimization potential. To cope with this variety, we investigated the modularization of lightweight compression algorithms and developed a general compression scheme. By exchanging individual modules, or even just input parameters, different algorithms can be realized easily. compression compression algorithms lightweight compression modularization modularisation in-memory database systems main memory database systems ddc:004 rvk:ST 284 compression modularity algorithm database system main memory
6	Density-Aware Linear Algebra in a Column-Oriented In-Memory Database System Kernert, David 20 September 2016 Linear algebra operations appear in nearly every application in advanced analytics, machine learning, and various scientific domains. To this day, many data analysts and scientists tend to use statistics software packages or hand-crafted solutions for their analysis. In the era of data deluge, however, external statistics packages and custom analysis programs, which often run on single workstations, are incapable of keeping up with the vast increase in data volume and size. In particular, there is an increasing demand among scientists for large-scale data manipulation, orchestration, and advanced data management capabilities. These are among the key features of a mature relational database management system (DBMS). With the rise of main memory database systems, it has now become feasible to also consider applications that build on linear algebra. This thesis presents a deep integration of linear algebra functionality into an in-memory column-oriented database system. In particular, this work shows that it has become feasible to execute linear algebra queries on large data sets directly in a DBMS-integrated engine (LAPEG), without the need to transfer data and without being restricted by hard disk latencies. From various application examples cited in this work, we deduce a number of requirements that are relevant for a database system that includes linear algebra functionality. Besides the deep integration of matrices and numerical algorithms, these include optimization of expressions, transparent matrix handling, scalability and data parallelism, and data manipulation capabilities. These requirements are addressed by our linear algebra engine. 
In particular, the core contributions of this thesis are as follows. First, we show that the columnar storage layer of an in-memory DBMS allows an easy adoption of efficient sparse matrix data types and algorithms. Furthermore, we show that the execution of linear algebra expressions benefits significantly from different techniques inspired by database technology. In a novel way, we implemented several of these optimization strategies in LAPEG’s optimizer (SpMachO), which uses an advanced density estimation method (SpProdest) to predict the matrix density of intermediate results. Moreover, we present an adaptive matrix data type, AT Matrix, which spares scientists the need to select appropriate matrix representations. The tiled substructure of AT Matrix is exploited by our matrix multiplication to saturate the different sockets of a multicore main-memory platform, reaching a speed-up of up to 6x compared to alternative approaches. Finally, a major part of this thesis is devoted to data manipulation, where we propose a matrix manipulation API and present different mutable matrix types to enable fast insertions and deletions. We conclude that our linear algebra engine is well suited to processing dynamic, large matrix workloads in an optimized way. In particular, the DBMS-integrated LAPEG fills the linear algebra gap and makes columnar in-memory DBMSs attractive as an efficient, scalable ad hoc analysis platform for scientists. info:eu-repo/classification/ddc/004 ddc:004
7	Überblick und Klassifikation leichtgewichtiger Kompressionsverfahren im Kontext hauptspeicherbasierter Datenbanksysteme [Overview and Classification of Lightweight Compression Methods in the Context of Main-Memory Database Systems] Hildebrandt, Juliana January 2015 In the context of in-memory database systems, lightweight compression algorithms play a decisive role in enabling efficient storage and processing of large volumes of data in main memory. Compared with classical compression techniques such as Huffman coding, lightweight compression algorithms achieve comparable compression rates by incorporating context knowledge, and they allow faster compression and decompression. The variety of lightweight compression algorithms has grown in recent years because incorporating context knowledge offers large optimization potential. To cope with this variety, we investigated the modularization of lightweight compression algorithms and developed a general compression scheme. By exchanging individual modules, or even just input parameters, different algorithms can be realized easily.
Table of contents:
1 Introduction
2 Modularization of Compression Methods
2.1 State of the Literature
2.2 A Simple Compression Scheme
2.3 Further Considerations
2.3.1 Split Module and Word Generator with Multiple Outputs
2.3.2 Hierarchical Data Organization
2.3.3 Repeated Invocation of the Scheme
2.4 Evaluation and Justification of the Modularization
2.5 Summary
3 Modularization for Different Compression Patterns
3.1 Frame of Reference (FOR)
3.2 Delta Encoding (DELTA)
3.3 Symbol Suppression
3.4 Run-Length Encoding (RLE)
3.5 Dictionary Compression (DICT)
3.6 Bit Vectors (BV)
3.7 Comparison of Different Patterns and Techniques
3.8 Summary
4 Concrete Algorithms
4.1 Binary Packing
4.2 FOR with Binary Packing
4.3 Adaptive FOR and VSEncoding
4.4 PFOR Algorithms
4.4.1 PFOR and PFOR2008
4.4.2 NewPFD and OptPFD
4.4.3 SimplePFOR and FastPFOR
4.4.4 Remarks on Delta-Encoded Data
4.5 Simple Algorithms
4.5.1 Simple-9
4.5.2 Simple-16
4.5.3 Relative-10 and Carryover-12
4.6 Byte-Oriented Encodings
4.6.1 Varint-SU and Varint-PU
4.6.2 Varint-GU
4.6.3 Varint-PB
4.6.4 Varint-GB
4.6.5 Comparison of the Modules of the Varint Algorithms
4.6.6 RLE VByte
4.7 Dictionary Algorithms
4.7.1 ZIL
4.7.2 Sigma-Encoded Inverted Files
4.8 Summary
5 Properties of Compression Methods
5.1 Adaptability
5.2 Number of Passes
5.3 Information Used
5.4 Type of Data and Types of Redundancy
5.5 Summary
6 Summary and Outlook
info:eu-repo/classification/ddc/004 ddc:004
