Global ETD Search

1	Überblick und Klassifikation leichtgewichtiger Kompressionsverfahren im Kontext hauptspeicherbasierter Datenbanksysteme [Overview and Classification of Lightweight Compression Techniques in the Context of Main-Memory Database Systems] Hildebrandt, Juliana 22 July 2015 In the context of in-memory database systems, lightweight compression algorithms play a decisive role in enabling efficient storage and processing of large volumes of data in main memory. Compared with classical compression techniques such as Huffman coding, lightweight compression algorithms achieve comparable compression rates by exploiting context knowledge, and they allow faster compression and decompression. The variety of lightweight compression algorithms has grown in recent years because incorporating context knowledge offers considerable optimization potential. To manage this variety, we studied the modularization of lightweight compression algorithms and developed a general compression scheme. Exchanging individual modules, or merely their input parameters, makes it easy to realize different algorithms. Keywords: compression; compression algorithms; lightweight compression; modularization; in-memory database systems; main memory database systems; modularity; algorithm; database system; main memory. ddc:004 rvk:ST 284
2	Überblick und Klassifikation leichtgewichtiger Kompressionsverfahren im Kontext hauptspeicherbasierter Datenbanksysteme [Overview and Classification of Lightweight Compression Techniques in the Context of Main-Memory Database Systems] Hildebrandt, Juliana January 2015 In the context of in-memory database systems, lightweight compression algorithms play a decisive role in enabling efficient storage and processing of large volumes of data in main memory. Compared with classical compression techniques such as Huffman coding, lightweight compression algorithms achieve comparable compression rates by exploiting context knowledge, and they allow faster compression and decompression. The variety of lightweight compression algorithms has grown in recent years because incorporating context knowledge offers considerable optimization potential. To manage this variety, we studied the modularization of lightweight compression algorithms and developed a general compression scheme. Exchanging individual modules, or merely their input parameters, makes it easy to realize different algorithms.
Table of contents:
1 Introduction
2 Modularization of Compression Methods
2.1 State of the Literature
2.2 A Simple Compression Scheme
2.3 Further Considerations
2.3.1 Split Module and Word Generator with Multiple Outputs
2.3.2 Hierarchical Data Organization
2.3.3 Repeated Invocation of the Scheme
2.4 Assessment and Justification of the Modularization
2.5 Summary
3 Modularization for Different Compression Patterns
3.1 Frame of Reference (FOR)
3.2 Delta Encoding (DELTA)
3.3 Symbol Suppression
3.4 Run-Length Encoding (RLE)
3.5 Dictionary Compression (DICT)
3.6 Bit Vectors (BV)
3.7 Comparison of Different Patterns and Techniques
3.8 Summary
4 Concrete Algorithms
4.1 Binary Packing
4.2 FOR with Binary Packing
4.3 Adaptive FOR and VSEncoding
4.4 PFOR Algorithms
4.4.1 PFOR and PFOR2008
4.4.2 NewPFD and OptPFD
4.4.3 SimplePFOR and FastPFOR
4.4.4 Notes on Delta-Encoded Data
4.5 Simple Algorithms
4.5.1 Simple-9
4.5.2 Simple-16
4.5.3 Relative-10 and Carryover-12
4.6 Byte-Oriented Encodings
4.6.1 Varint-SU and Varint-PU
4.6.2 Varint-GU
4.6.3 Varint-PB
4.6.4 Varint-GB
4.6.5 Comparison of the Modules of the Varint Algorithms
4.6.6 RLE VByte
4.7 Dictionary Algorithms
4.7.1 ZIL
4.7.2 Sigma-Encoded Inverted Files
4.8 Summary
5 Properties of Compression Methods
5.1 Adaptability
5.2 Number of Passes
5.3 Information Used
5.4 Type of Data and Types of Redundancy
5.5 Summary
6 Summary and Outlook
info:eu-repo/classification/ddc/004 ddc:004
3	Adaptive Energy-Control for In-Memory Database Systems Kissinger, Thomas, Habich, Dirk, Lehner, Wolfgang 30 May 2022 The ever-increasing demand for scalable database systems is limited by their energy consumption, which is one of the major challenges in research today. While existing approaches have mainly focused on transaction-oriented, disk-based database systems, we investigate and optimize the energy consumption and performance of data-oriented scale-up in-memory database systems, which make heavy use of the main power consumers: processors and main memory. We give an in-depth energy analysis of a current mainstream server system and show that modern processors provide a rich set of energy-control features but lack the capability of controlling them appropriately because they are missing application-specific knowledge. Thus, we propose the Energy-Control Loop (ECL) as a DBMS-integrated approach for adaptive energy control on scale-up in-memory database systems that obeys a query latency limit as a soft constraint and actively optimizes the energy efficiency and performance of the DBMS. The ECL relies on adaptive workload-dependent energy profiles that are continuously maintained at runtime. In our evaluation, we observed energy savings ranging from 20% to 40% for a real-world load profile. info:eu-repo/classification/ddc/004 ddc:004
4	High-Throughput BitPacking Compression Lisa, Nusrat Jahan, Nguyen, Tuan Duy Anh, Habich, Dirk, Kumar, Akash, Lehner, Wolfgang 03 July 2023 To efficiently support analytical applications from a data management perspective, in-memory column store database systems are state of the art. In this kind of database system, lossless lightweight integer compression schemes are crucial to keep the memory footprint as low as possible and to speed up query processing. In this specific compression domain, BitPacking is one of the most frequently applied compression schemes. However, (de)compression should not come with any additional cost at run time, but should be provided transparently without compromising overall system performance. To achieve that, we focus on accelerating BitPacking using Field Programmable Gate Arrays (FPGAs) and outline several FPGA designs for BitPacking in this paper. As we show in our evaluation, our designs provide the BitPacking compression scheme with high throughput. info:eu-repo/classification/ddc/004 ddc:004
5	A Benchmark Framework for Data Compression Techniques Damme, Patrick, Habich, Dirk, Lehner, Wolfgang 03 February 2023 Lightweight data compression is frequently applied in main memory database systems to improve query performance. The data processed by such systems is highly diverse. Moreover, there is a high number of existing lightweight compression techniques. Therefore, choosing the optimal technique for a given dataset is non-trivial. Existing approaches are based on simple rules, which do not suffice for such a complex decision. In contrast, our vision is a cost-based approach. However, this requires a detailed cost model, which can only be obtained from a systematic benchmarking of many compression algorithms on many different datasets. A naïve benchmark evaluates every algorithm under consideration separately. This yields many redundant steps and is thus inefficient. We propose an efficient and extensible benchmark framework for compression techniques. Given an ensemble of algorithms, it minimizes the overall run time of the evaluation. We experimentally show that our approach outperforms the naïve approach. info:eu-repo/classification/ddc/004 ddc:004
6	Make Larger Vector Register Sizes New Challenges?: Lessons Learned from the Area of Vectorized Lightweight Compression Algorithms Habich, Dirk, Damme, Patrick, Ungethüm, Annett, Lehner, Wolfgang 15 September 2022 The exploitation of data as well as hardware properties is a core aspect of efficient data management. This holds in particular for the field of in-memory data processing. Aside from increasing main memory capacities, in-memory data processing also benefits from novel processing concepts based on lightweight compressed data. To speed up compression as well as decompression, an active research field deals with the specialization of these algorithms to hardware features such as vectorization using SIMD instructions. Most of the vectorized implementations have been proposed for 128-bit vector registers. However, hardware vendors continue to increase vector register sizes, and a straightforward transformation to these wider vector sizes is possible in most cases. Thus, we systematically investigated the impact of different SIMD instruction set extensions with wider vector sizes on the behavior of straightforwardly transformed implementations. In this paper, we describe our evaluation methodology and present selected results of our exhaustive evaluation. In particular, we highlight some challenges and present first approaches to tackle them. info:eu-repo/classification/ddc/004 ddc:004
7	SOFORT: A Hybrid SCM-DRAM Storage Engine for Fast Data Recovery Oukid, Ismail, Booss, Daniel, Lehner, Wolfgang, Bumbulis, Peter, Willhalm, Thomas 19 September 2022 Storage Class Memory (SCM) has the potential to significantly improve database performance. This potential has been well documented for throughput [4] and response time [25, 22]. In this paper we show that SCM also has the potential to significantly improve restart performance, a shortcoming of traditional main memory database systems. We present SOFORT, a hybrid SCM-DRAM storage engine that leverages the full capabilities of SCM by doing away with a traditional log and updating the persisted data in place in small increments. We show that we can achieve restart times of a few seconds independent of instance size and transaction volume without significantly impacting transaction throughput. info:eu-repo/classification/ddc/004 ddc:004
8	Die Datenbankforschungsgruppe der Technischen Universität Dresden stellt sich vor [The Database Research Group at Technische Universität Dresden Introduces Itself] Lehner, Wolfgang 27 January 2023 In autumn 2012, the Database Chair at Technische Universität Dresden celebrates its 10th anniversary under the leadership of Wolfgang Lehner. During this period, its research focus on database support for analyzing large data sets was sharpened further and significantly extended at the system level. The research group around Wolfgang Lehner is visible internationally through publications and collaborations, and it is also active in regional research alliances, both to take part in Dresden's extremely young and agile software industry and, as far as a research group is able to, to support it. [From: Introduction] info:eu-repo/classification/ddc/004 ddc:004
