Global ETD Search

1	Decision Tree Model to Support the Successful Selection of a Database Engine for Novice Database Administrators
Monjaras, Alvaro; Bendezu, Enrique; Raymundo, Carlos. 09 May 2019
The full text of this work is not available in the UPC Academic Repository because of restrictions imposed by the publisher.
Several types of databases currently exist, and each manipulates data differently, which affects transaction performance on the stored information. Companies need to manage information quickly so that they do not lose operations to poor database performance, while still preserving the integrity of the information. Each database category is designed to serve one or more specific use cases quickly. In this paper, we study and analyze SQL, NoSQL, and in-memory databases to understand the use cases each one fits, and we run performance tests to build a decision tree that helps choose which database category to use to maintain good performance. The precision of the tests was 96.26% for relational databases, 91.83% for NoSQL databases, and 93.87% for in-memory database systems (IMDBS).
Keywords: component, database, decision tree, in memory database, nosql, performance, SQL
2	Main-memory database VS Traditional database
Rehn, Marcus; Sunesson, Emil. January 2013
There has been a surge of new databases in recent years, and applications today place higher demands on database performance than ever before. Main-memory databases, which store all of their data in primary memory, have reached the market quite recently and are now attracting interest from many directions. They provide a large performance increase for many applications. This work evaluates the difference in performance between two chosen candidates: VoltDB, representing main-memory databases, and MySQL, representing traditional databases. We performed several tests on these two databases and point out differences in functionality, performance, and design choices. We want to create a reference in which anyone considering a change from a traditional database to a main-memory database can find support for their decision: what are the advantages and disadvantages of using a main-memory database, and when should one switch from an old database to a newer technology?
Keywords: VoltDB, MySQL, databases, Main-memory database, primary memory, database performance, Computer Sciences, Software Engineering
3	Self maintenance of materialized XQuery views via query containment and re-writing
Nilekar, Shirish K. January 2006
Thesis (M.S.)--Worcester Polytechnic Institute. Includes bibliographical references (p. 108-111).
Keywords: XML, Query Re-Writing, View Maintenance, Query Containment
4	Überblick und Klassifikation leichtgewichtiger Kompressionsverfahren im Kontext hauptspeicherbasierter Datenbanksysteme [Overview and Classification of Lightweight Compression Methods in the Context of Main-Memory Database Systems]
Hildebrandt, Juliana. 22 July 2015
In the context of in-memory database systems, lightweight compression algorithms play a decisive role in storing and processing large amounts of data efficiently in main memory. Compared with classical compression techniques such as Huffman coding, lightweight compression algorithms achieve comparable compression rates by exploiting context knowledge, and they allow faster compression and decompression. The variety of lightweight compression algorithms has grown in recent years because incorporating context knowledge offers great optimization potential. To cope with this variety, we studied the modularization of lightweight compression algorithms and developed a general compression scheme. By exchanging individual modules, or even just input parameters, different algorithms can be realized easily.
Keywords: compression, compression algorithms, lightweight compression, modularization, in-memory database systems, main memory database systems, modularity, algorithm, database system, main memory, ddc:004, rvk:ST 284
5	Überblick und Klassifikation leichtgewichtiger Kompressionsverfahren im Kontext hauptspeicherbasierter Datenbanksysteme [Overview and Classification of Lightweight Compression Methods in the Context of Main-Memory Database Systems]
Hildebrandt, Juliana. January 2015
In the context of in-memory database systems, lightweight compression algorithms play a decisive role in storing and processing large amounts of data efficiently in main memory. Compared with classical compression techniques such as Huffman coding, lightweight compression algorithms achieve comparable compression rates by exploiting context knowledge, and they allow faster compression and decompression. The variety of lightweight compression algorithms has grown in recent years because incorporating context knowledge offers great optimization potential. To cope with this variety, we studied the modularization of lightweight compression algorithms and developed a general compression scheme. By exchanging individual modules, or even just input parameters, different algorithms can be realized easily.
Contents:
1 Introduction
2 Modularization of compression methods
  2.1 State of the literature
  2.2 A simple compression scheme
  2.3 Further considerations
    2.3.1 Split module and word generator with multiple outputs
    2.3.2 Hierarchical data organization
    2.3.3 Repeated invocation of the scheme
  2.4 Assessment and justification of the modularization
  2.5 Summary
3 Modularization for different compression patterns
  3.1 Frame of Reference (FOR)
  3.2 Delta coding (DELTA)
  3.3 Symbol suppression
  3.4 Run-length encoding (RLE)
  3.5 Dictionary compression (DICT)
  3.6 Bit vectors (BV)
  3.7 Comparison of different patterns and techniques
  3.8 Summary
4 Concrete algorithms
  4.1 Binary Packing
  4.2 FOR with Binary Packing
  4.3 Adaptive FOR and VSEncoding
  4.4 PFOR algorithms
    4.4.1 PFOR and PFOR2008
    4.4.2 NewPFD and OptPFD
    4.4.3 SimplePFOR and FastPFOR
    4.4.4 Notes on delta-coded data
  4.5 Simple algorithms
    4.5.1 Simple-9
    4.5.2 Simple-16
    4.5.3 Relative-10 and Carryover-12
  4.6 Byte-oriented encodings
    4.6.1 Varint-SU and Varint-PU
    4.6.2 Varint-GU
    4.6.3 Varint-PB
    4.6.4 Varint-GB
    4.6.5 Comparison of the modules of the Varint algorithms
    4.6.6 RLE VByte
  4.7 Dictionary algorithms
    4.7.1 ZIL
    4.7.2 Sigma-encoded inverted files
  4.8 Summary
5 Properties of compression methods
  5.1 Adaptability
  5.2 Number of passes
  5.3 Information used
  5.4 Type of data and types of redundancy
  5.5 Summary
6 Summary and outlook
Keywords: info:eu-repo/classification/ddc/004, ddc:004
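Three of the compression patterns named in the outline of entry 5, Frame of Reference (FOR), delta coding (DELTA), and run-length encoding (RLE), can be illustrated with a minimal Python sketch. It is an illustration under simplifying assumptions (plain integer lists, no bit packing), not the modular scheme developed in the thesis; all function names are hypothetical.

```python
from typing import List, Tuple


def for_encode(values: List[int]) -> Tuple[int, List[int]]:
    """Frame of Reference: store the minimum once, then small offsets.

    Assumes a non-empty list (min() raises ValueError otherwise).
    """
    reference = min(values)
    return reference, [v - reference for v in values]


def for_decode(reference: int, offsets: List[int]) -> List[int]:
    """Invert for_encode by adding the reference back to each offset."""
    return [reference + o for o in offsets]


def delta_encode(values: List[int]) -> List[int]:
    """Delta coding: keep the first value, then differences to the predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def rle_encode(values: List[int]) -> List[Tuple[int, int]]:
    """Run-length encoding: collapse each run into a (value, run_length) pair."""
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Invert rle_encode by repeating each value run_length times."""
    return [v for v, n in runs for _ in range(n)]
```

In each case the encoded form holds smaller or fewer numbers than the input, which is what a subsequent bit-packing step (not shown here) would exploit.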
6	ANNIS: A graph-based query system for deeply annotated text corpora
Krause, Thomas. 11 January 2019
This dissertation describes the design and implementation of an efficient system for linguistic corpus queries. The existing system ANNIS is based on a relational database, focuses on supporting corpora with very different kinds of annotations, and uses graphs as a unified representation of those annotations. For this dissertation, a main-memory, purely graph-based successor of ANNIS was developed. Corpora are partitioned into edge components, and different implementations for representing and searching these components are used for different types of subgraphs. Operations of the query language AQL (ANNIS Query Language) are implemented as combinations of reachability queries on the different components, and each component implementation has functions optimized for this type of query. This approach exploits the different structures of the different kinds of annotations without losing the common representation as a graph. Additional optimizations, such as parallel execution of parts of a query, were also implemented and evaluated. Because AQL has an existing implementation that is already offered to researchers as a web-based service, real-life AQL queries could be recorded and used as the basis for benchmarking the new implementation. More than 4000 queries over 18 corpora (most of which are available under an open-access license) were compiled into a realistic workload that includes very different types of corpora and queries with a wide range of complexity. The new graph-based implementation was compared against the existing one, which uses a relational database. It executes the workload roughly 10 times faster than the baseline, and the experiments show that the different graph storage implementations had a major effect on this improvement.
Keywords: In-memory database, Graph database, Corpus linguistics, Search engine, ST 306, ddc:004, 004 Data processing; Computer science
7	Cache conscious column organization in in-memory column stores
Schwalb, David; Krüger, Jens; Plattner, Hasso. January 2013
Cost models are an essential part of database systems, as they are the basis of query performance optimization. Based on predictions made by cost models, the fastest query execution plan can be chosen and executed, or algorithms can be tuned and optimised. In-memory databases shift the focus from disk to main memory accesses and CPU costs, whereas in disk-based systems input and output costs dominate the overall costs and other processing costs are often neglected. However, modelling memory accesses is fundamentally different, and the common models no longer apply. This work presents a detailed parameter evaluation for the plan operators scan with equality selection, scan with range selection, positional lookup, and insert in in-memory column stores. Based on this evaluation, a cost model based on cache misses is developed for estimating the runtime of the considered plan operators using different data structures. The structures considered are uncompressed columns, bit-compressed columns, and dictionary-encoded columns with sorted and unsorted dictionaries. Furthermore, tree indices on the columns and dictionaries are discussed. Finally, partitioned columns consisting of one partition with a sorted dictionary and one with an unsorted dictionary are investigated. New values are inserted into the unsorted dictionary partition and moved periodically by a merge process to the sorted partition. An efficient attribute merge algorithm is described, supporting the update performance required to run enterprise applications on read-optimised databases. Further, a memory-traffic-based cost model for the merge process is provided.
Keywords: In-Memory Database, Database Cost Model, Attribute Merge Process, Data processing, Computer science
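The partitioned column described in the abstract of entry 7 can be illustrated roughly as follows. This is a minimal sketch, assuming string values held in Python lists; the class `PartitionedColumn` and its method names are hypothetical, and the `merge` shown is a naive full rebuild, not the efficient attribute merge algorithm of the work.

```python
from bisect import bisect_left
from typing import Dict, List


class PartitionedColumn:
    """Dictionary-encoded column with a sorted main and an unsorted delta partition."""

    def __init__(self) -> None:
        self.main_dict: List[str] = []        # sorted, distinct values
        self.main_ids: List[int] = []         # per-row indices into main_dict
        self.delta_dict: List[str] = []       # distinct values in insertion order
        self.delta_index: Dict[str, int] = {} # value -> index into delta_dict
        self.delta_ids: List[int] = []        # per-row indices into delta_dict

    def insert(self, value: str) -> None:
        """New values go to the unsorted (write-optimised) partition only."""
        vid = self.delta_index.get(value)
        if vid is None:
            vid = len(self.delta_dict)
            self.delta_dict.append(value)
            self.delta_index[value] = vid
        self.delta_ids.append(vid)

    def get(self, position: int) -> str:
        """Positional lookup across both partitions (main rows come first)."""
        if position < len(self.main_ids):
            return self.main_dict[self.main_ids[position]]
        return self.delta_dict[self.delta_ids[position - len(self.main_ids)]]

    def scan_equal(self, value: str) -> List[int]:
        """Scan with equality selection; returns matching row positions."""
        hits: List[int] = []
        i = bisect_left(self.main_dict, value)  # binary search in sorted dictionary
        if i < len(self.main_dict) and self.main_dict[i] == value:
            hits += [p for p, vid in enumerate(self.main_ids) if vid == i]
        vid = self.delta_index.get(value)
        if vid is not None:
            offset = len(self.main_ids)
            hits += [offset + p for p, d in enumerate(self.delta_ids) if d == vid]
        return hits

    def merge(self) -> None:
        """Naive merge (assumption): rebuild one sorted dictionary and re-encode all rows."""
        new_dict = sorted(set(self.main_dict) | set(self.delta_dict))
        new_id = {v: i for i, v in enumerate(new_dict)}
        self.main_ids = (
            [new_id[self.main_dict[i]] for i in self.main_ids]
            + [new_id[self.delta_dict[i]] for i in self.delta_ids]
        )
        self.main_dict = new_dict
        self.delta_dict, self.delta_index, self.delta_ids = [], {}, []
```

The sorted dictionary allows a binary search before the scan, whereas the unsorted one makes inserts cheap because existing value ids never change; the periodic merge is what moves rows from the second regime into the first.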
8	SAP reporting s využitím in-memory databáze / SAP Reporting with the Use of In-memory Database
Smejkal, Václav. January 2018
The master's thesis deals with the in-memory database SAP HANA, which keeps all data directly in main memory using a column-oriented data layout. The practical part of the thesis consists of developing an application in the SAP R/3 environment that compares the performance of the in-memory database SAP HANA with that of the traditional database MaxDB, including the influence of the data layout used. The results show that the in-memory database is advantageous especially for analytical operations based on aggregate functions.
9	Adaptive Energy-Control for In-Memory Database Systems
Kissinger, Thomas; Habich, Dirk; Lehner, Wolfgang. 30 May 2022
The ever-increasing demand for scalable database systems is limited by their energy consumption, which is one of the major challenges in research today. While existing approaches mainly focused on transaction-oriented disk-based database systems, we investigate and optimize the energy consumption and performance of data-oriented scale-up in-memory database systems, which make heavy use of the main power consumers: processors and main memory. We give an in-depth energy analysis of a current mainstream server system and show that modern processors provide a rich set of energy-control features but cannot control them appropriately because they lack application-specific knowledge. Thus, we propose the Energy-Control Loop (ECL) as a DBMS-integrated approach for adaptive energy control on scale-up in-memory database systems; it obeys a query latency limit as a soft constraint and actively optimizes the energy efficiency and performance of the DBMS. The ECL relies on adaptive workload-dependent energy profiles that are continuously maintained at runtime. In our evaluation, we observed energy savings ranging from 20% to 40% for a real-world load profile.
Keywords: info:eu-repo/classification/ddc/004, ddc:004
10	Efficient Compute Node-Local Replication Mechanisms for NVRAM-Centric Data Structures
Zarubin, Mikhail; Kissinger, Thomas; Habich, Dirk; Lehner, Wolfgang. 11 July 2022
Non-volatile random-access memory (NVRAM) is about to hit the market and will require significant changes to the architecture of in-memory database systems. Since such hybrid DRAM-NVRAM database systems will keep the primary data solely persistent in the NVRAM, efficient replication mechanisms need to be considered to prevent data losses and to guarantee high availability in case of NVDIMM failures. In this paper, we argue for a software-based replication approach and present compute node-local mechanisms to provide the building blocks for an efficient NVRAM replication with a low latency and throughput penalty. Within our evaluation, we measured up to 10x less overhead for our optimized replication mechanisms compared to the basic replication mechanism of the Intel persistent memory development kit (PMDK).
Keywords: info:eu-repo/classification/ddc/004, ddc:004
