121 |
EDEN: an epigraphic web database of ancient inscriptions / Scholz, Martin (January 2016)
No description available.
|
122 |
Trismegistos Places: a geographical index for all Latin inscriptions / Verreth, Herbert (January 2016)
The Trismegistos database has recently created a geographical index for all Latin inscriptions. At the moment we have 67,884 geographical references attested in Latin documentary texts, but this rough starting material still has to be refined. This paper describes how we undertook this task, the problems we encountered along the way, and the choices we made for the presentation of the material.
|
123 |
Implementation of an Efficient Database-Backed SPARQL Processor and Extensions to SPARQL / Weiske, Christian (21 November 2017)
The vision of the Internet as a world-spanning information network has become reality. To realize the idea of a worldwide web of data, the Resource Description Framework (RDF) was developed. The data feeding this web is largely stored in databases. One query language for RDF data is the SPARQL Protocol And Query Language, which is currently in the World Wide Web Consortium's standardization process. The RDF API for PHP (RAP) is a library written in the scripting language PHP for working with data in the RDF format. The goal of this thesis is to create an efficient, database-backed implementation of SPARQL on top of the RAP library. After covering the foundations of the Semantic Web, SPARQL and the RDF API for PHP are introduced, along with competing libraries that support SPARQL. The existing memory-based SPARQL processor of the RAP framework is analyzed and its problems are identified. The requirements and the implementation of the new database-backed processor are then described in detail. Finally, the speed of the new processor is evaluated against other SPARQL-capable data systems. Daily work with SPARQL revealed that, owing to several shortcomings, the protocol is not yet fully suited for practical use. For this reason, extensions to SPARQL that have proven helpful in practice are presented at the end. The SPARQL processor implemented as part of this diploma thesis proves to be robust and fast. Despite being written in a scripting language, it performs in the same league as other state-of-the-art implementations and frequently even outperforms them. It is already used productively in several applications and is available as an open-source project to an international community of users and developers.
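To make the setting concrete, here is a minimal, illustrative Python sketch of the SPARQL protocol the abstract refers to: a SELECT query is posted to an HTTP endpoint and the JSON result bindings are read back. The endpoint URL is a placeholder, and the snippet stands in for any SPARQL-capable processor rather than for RAP's actual API.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical SPARQL endpoint; the URL and setup are illustrative only.
ENDPOINT = "http://example.org/sparql"

query = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE { ?person foaf:name ?name } LIMIT 10
"""

# The SPARQL protocol sends the query as a form parameter and can
# negotiate a JSON result serialization via the Accept header.
data = urllib.parse.urlencode({"query": query}).encode("utf-8")
request = urllib.request.Request(
    ENDPOINT, data=data,
    headers={"Accept": "application/sparql-results+json"},
)

with urllib.request.urlopen(request) as response:
    results = json.load(response)

# Each binding maps variable names to typed values.
for binding in results["results"]["bindings"]:
    print(binding["name"]["value"])
```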
|
124 |
Secure Distribution of Configuration Data and a Migration Strategy for Separating Services from the Database / Wehrmann, Sebastian (01 August 2006)
For historical reasons, the CSN database and the services accessing it have always resided on the same machine, partly for lack of money and partly because distributing the configuration and access control for the database was an unsolved problem. The task of this thesis is the physical and logical separation of the firewall (and the traffic shaper) from the database. To this end, a service must be created that provides the configuration information for the firewall and, potentially, other applications. Access to this information must be protected from third parties. Furthermore, a migration strategy is to be designed for how the transition to the outlined solution can be accomplished.
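As a rough illustration of the kind of service the thesis calls for, the following Python sketch fetches configuration over mutually authenticated TLS, so that only a host holding a valid client certificate can read it. All host names and file paths are hypothetical, and mutual TLS is just one plausible mechanism; the abstract does not prescribe it.

```python
import ssl
import urllib.request

# Hypothetical config service endpoint for the firewall host.
CONFIG_URL = "https://config.csn.example:8443/firewall.conf"

# Verify the server against a private CA (path is an assumption).
context = ssl.create_default_context(cafile="/etc/csn/ca.pem")
# The client certificate authenticates the firewall host to the
# service, keeping the configuration out of third-party hands.
context.load_cert_chain(certfile="/etc/csn/firewall.crt",
                        keyfile="/etc/csn/firewall.key")

with urllib.request.urlopen(CONFIG_URL, context=context) as response:
    config = response.read().decode("utf-8")

print(config)
```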
|
125 |
Allocation Strategies for Data-Oriented Architectures / Kiefer, Tim (09 October 2015)
Data orientation is a common design principle in distributed data management systems. In contrast to process-oriented or transaction-oriented system designs, data-oriented architectures are based on data locality and function shipping. The tight coupling of data and processing thereon is implemented in different systems in a variety of application scenarios such as data analysis, database-as-a-service, and data management on multiprocessor systems. Data-oriented systems, i.e., systems that implement a data-oriented architecture, bundle data and operations together in tasks which are processed locally on the nodes of the distributed system. Allocation strategies, i.e., methods that decide the mapping from tasks to nodes, are core components in data-oriented systems. Good allocation strategies can lead to balanced systems while bad allocation strategies cause skew in the load and therefore suboptimal application performance and infrastructure utilization. Optimal allocation strategies are hard to find given the complexity of the systems, the complicated interactions of tasks, and the huge solution space. To ensure the scalability of data-oriented systems and to keep them manageable with hundreds of thousands of tasks, thousands of nodes, and dynamic workloads, fast and reliable allocation strategies are mandatory. In this thesis, we develop novel allocation strategies for data-oriented systems based on graph partitioning algorithms.
To this end, we show that systems from different application scenarios with different abstraction levels can be generalized to generic infrastructure and workload descriptions. We use weighted graph representations to model infrastructures with bounded and unbounded, i.e., overcommitted, resources and possibly non-linear performance characteristics. Based on our generalized infrastructure and workload model, we formalize the allocation problem, which seeks valid and balanced allocations that minimize communication. Our allocation strategies partition the workload graph using solution heuristics that work with single and multiple vertex weights. Novel extensions to these solution heuristics can be used to balance penalized and secondary graph partition weights. These extensions enable the allocation strategies to handle infrastructures with non-linear performance behavior. On top of the basic algorithms, we propose methods to incorporate heterogeneous infrastructures and to react to changing workloads and infrastructures by incrementally updating the partitioning. We evaluate all components of our allocation strategy algorithms and show their applicability and scalability with synthetic workload graphs. In end-to-end performance experiments in two actual data-oriented systems, a database-as-a-service system and a database management system for multiprocessor systems, we prove that our allocation strategies outperform alternative state-of-the-art methods.
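The following toy Python sketch illustrates the flavor of such an allocation strategy: tasks with resource weights are greedily mapped onto capacity-bounded nodes, preferring placements that keep communicating tasks together. It is a stand-in for the graph-partitioning heuristics developed in the thesis, not the actual algorithms.

```python
from collections import defaultdict

def allocate(vertices, edges, capacities):
    """Greedy allocation of weighted tasks to nodes.

    vertices:   {task: weight}, resource demand per task
    edges:      {(task_a, task_b): weight}, communication volume
    capacities: {node: capacity}, resource supply per node
    """
    load = {node: 0.0 for node in capacities}
    assignment = {}
    neighbors = defaultdict(dict)
    for (a, b), w in edges.items():
        neighbors[a][b] = w
        neighbors[b][a] = w

    # Place heavy tasks first; each task goes to the node with the best
    # trade-off between saved communication and remaining headroom.
    for task in sorted(vertices, key=vertices.get, reverse=True):
        def score(node):
            saved = sum(w for nb, w in neighbors[task].items()
                        if assignment.get(nb) == node)
            headroom = capacities[node] - load[node] - vertices[task]
            return (saved, headroom)  # cut size first, balance second

        node = max((n for n in capacities
                    if load[n] + vertices[task] <= capacities[n]),
                   key=score, default=None)
        if node is None:
            raise ValueError("no valid allocation exists")
        assignment[task] = node
        load[node] += vertices[task]
    return assignment
```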
|
126 |
Heterogeneity-Aware Placement Strategies for Query Optimization / Karnagel, Tomas (23 May 2017)
Computing hardware is changing from systems with homogeneous CPUs to systems with heterogeneous computing units like GPUs, Many Integrated Cores, or FPGAs. This trend is caused by the scaling problems of homogeneous systems, where heat dissipation and energy consumption limit further growth in compute performance. Heterogeneous systems provide differently optimized computing hardware, which allows each operation to be computed on the most appropriate computing unit, resulting in faster execution and lower energy consumption.
For database systems, this is a new opportunity to accelerate query processing, allowing faster and more interactive querying of large amounts of data. However, the current hardware trend is also a challenge, as most database systems do not support heterogeneous computing resources and it is not clear how to support these systems best. In the past, mainly individual operators were ported to different computing units with great results, but a system-wide application was missing. To support heterogeneous systems efficiently, a systems approach to query processing and query optimization is needed.
In this thesis, we tackle the optimization challenge in detail. As a starting point, we evaluate three different approaches on isolated use cases to assess their advantages and limitations. First, we evaluate a fork-join approach of intra-operator parallelism, where the same operator is executed on multiple computing units at the same time, each execution on a different data partition. Second, we evaluate statically using one computing unit to accelerate one operator, which offers high code-optimization potential because the usage of hardware and software is fixed and known in advance. Third, we evaluate dynamically placing operators onto computing units, depending on the operator, the available computing hardware, and the given data sizes. We argue that the first and second approaches suffer from multiple overheads or high implementation costs. The third approach, dynamic placement, shows good performance while being highly extensible to different computing units and different operator implementations.
To automate this dynamic approach, we first propose general placement optimization for query processing. This general approach includes runtime estimation of operators on different computing units as well as two approaches for deciding the actual operator placement according to the estimated runtimes. The two placement approaches are local optimization, which decides the placement locally at run-time, and global optimization, where the placement is decided at compile-time, allowing a global view for enhanced data sharing. The main limitation of the latter is its high dependency on cardinality estimation of intermediate results, as estimation errors for the cardinalities propagate into the operator runtime estimation and the placement optimization. Therefore, we propose adaptive placement optimization, which makes the placement optimization fully independent of cardinality estimation, effectively eliminating the main source of inaccuracy for runtime estimation and placement optimization. Finally, we define an adaptive placement sequence incorporating all our proposed placement-optimization techniques. We implement this sequence as a virtualization layer between the database system and the heterogeneous hardware. Our implementation is based on preexisting interfaces to the database system and the hardware, allowing non-intrusive integration into existing database systems. We evaluate our techniques using two different database systems and two different OLAP benchmarks, accelerating query processing through heterogeneous execution.
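As a concrete illustration of local placement optimization, the Python sketch below picks, for each operator in a pipeline, the computing unit with the lowest estimated total cost (execution plus data transfer). The devices, cost constants, and cost model are invented for illustration and are not the thesis's actual estimator.

```python
UNITS = ["cpu", "gpu"]

# Hypothetical runtime estimates: seconds per input tuple per unit.
COST_PER_TUPLE = {("scan", "cpu"): 2e-9, ("scan", "gpu"): 1e-9,
                  ("join", "cpu"): 8e-9, ("join", "gpu"): 3e-9}
TRANSFER_PER_TUPLE = 1.5e-9  # cost of moving a tuple between units

def place(operators, cardinalities):
    """operators: operator names in pipeline order.
    cardinalities: estimated input size of each operator."""
    placement, previous = [], None
    for op, n in zip(operators, cardinalities):
        def total_cost(unit):
            run = COST_PER_TUPLE[(op, unit)] * n
            # Pay a transfer penalty if the data must switch devices.
            move = TRANSFER_PER_TUPLE * n if previous not in (None, unit) else 0.0
            return run + move
        unit = min(UNITS, key=total_cost)
        placement.append((op, unit))
        previous = unit
    return placement

print(place(["scan", "join"], [10_000_000, 2_000_000]))
```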
|
127 |
Data compilation and evaluation for U(IV) and U(VI) for the Thermodynamic Reference Database THEREDA / Richter, Anke; Bok, Frank; Brendler, Vinzenz (January 2015)
THEREDA (Thermodynamic Reference Database) is a collaborative project that addresses this challenge. The partners are Helmholtz-Zentrum Dresden-Rossendorf, Karlsruhe Institute of Technology (KIT-INE), Gesellschaft für Anlagen- und Reaktorsicherheit Braunschweig mbH (GRS), TU Bergakademie Freiberg (TUBAF), and AF-Consult Switzerland AG (Baden, Switzerland). The aim of the project is the establishment of a consistent and quality-assured database for all safety-relevant elements and the relevant temperature and pressure ranges, with a focus on saline systems. This implies the use of the Pitzer approach to compute activity coefficients suitable for such conditions. Data access is possible via commonly available internet browsers at http://www.thereda.de.
One part of the project, the data collection and evaluation for uranium, was a task of the Helmholtz-Zentrum Dresden-Rossendorf. The aquatic chemistry and thermodynamics of U(VI) and U(IV) are of great importance for geochemical modelling in repository-relevant systems. The OECD/NEA Thermochemical Database (NEA TDB) compilation is the major source of thermodynamic data for aqueous and solid uranium species, even though this data selection does not use the Pitzer model for ionic-strength corrections. As a result of its very stringent quality demands, the NEA TDB is rather restrictive and therefore incomplete for extensive modelling calculations of real systems. The THEREDA compilation therefore includes additional thermodynamic data for solid secondary phases formed in the waste material, the backfill, and the host rock, even though these data fall into quality assessment (QA) categories of lower accuracy. The data review process prefers log K values from solubility experiments (if available) over those calculated from thermochemical data.
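For readers unfamiliar with how log K values and thermochemical data are interconverted, the standard relation is ΔrG° = -RT ln K, so a log10 K can always be derived from a standard reaction Gibbs energy. A minimal Python illustration, using an arbitrary example value rather than any THEREDA datum:

```python
import math

R = 8.314462618  # J/(mol*K), molar gas constant

def log10_k(delta_r_g, temperature=298.15):
    """log10 of an equilibrium constant from the standard reaction
    Gibbs energy in J/mol, via delta_r_G = -R * T * ln K."""
    return -delta_r_g / (R * temperature * math.log(10))

# Illustrative value only: delta_r_G of -50 kJ/mol gives log K ~ 8.76.
print(log10_k(-50_000.0))
```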
|
128 |
Lit4School – Reading, Teaching, Learning: Project Concept / Weise, Simon Paul (19 June 2019)
The project "Lit4School" (literature for school) pushes forward the establishment of a user-friendly, web-based database of recommendations of authentic literary texts for English language teaching, organized by school type, grade level, and text type or genre. As a universally applicable digital tool, Lit4School is intended to contribute to the systematic development of university teaching practice while at the same time supporting teachers at different school types in teaching authentic literature.
|
129 |
KISS-Tree: Smart Latch-Free In-Memory Indexing on Modern Architectures / Kissinger, Thomas; Schlegel, Benjamin; Habich, Dirk; Lehner, Wolfgang (04 June 2012)
Growing main-memory capacities and an increasing number of hardware threads in modern server systems have led to fundamental changes in database architectures. Most importantly, query processing is nowadays performed on data that is often completely stored in main memory. Despite high main-memory scan performance, index structures are still important components, but they have to be designed from scratch to cope with the specific characteristics of main memory and to exploit the high degree of parallelism. Current research has mainly focused on adapting block-optimized B+-Trees, but these data structures were designed for secondary storage and involve comprehensive structural maintenance for updates.
In this paper, we present the KISS-Tree, a latch-free in-memory index that is optimized for a minimum number of memory accesses and a high number of concurrent updates. More specifically, we aim for the same performance as modern hash-based algorithms while keeping the order-preserving nature of trees. We achieve this by using a prefix tree that incorporates virtual-memory management functionality and compression schemes. In our experiments, we evaluate the KISS-Tree on different workloads and hardware platforms and compare the results to existing in-memory indexes. The KISS-Tree offers the highest reported read performance on current architectures, a balanced read/write performance, and a low memory footprint.
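The Python sketch below shows a toy prefix tree over 32-bit keys with a fixed bit split per level, just to illustrate the structure the abstract describes. The actual KISS-Tree additionally exploits virtual-memory management (lazily materializing the first level through the MMU), compressed nodes, and atomic compare-and-swap updates instead of latches; none of that is reproduced here, and the 16/10/6 split below is only an assumption.

```python
BITS = (16, 10, 6)  # bits consumed per level; sums to 32

class PrefixTree:
    def __init__(self):
        self.root = {}

    def _fragments(self, key):
        # Decompose the key into per-level fragments, most significant
        # bits first, so the tree stays order-preserving.
        shift = 32
        for bits in BITS:
            shift -= bits
            yield (key >> shift) & ((1 << bits) - 1)

    def insert(self, key, value):
        node = self.root
        *inner, leaf = self._fragments(key)
        for frag in inner:
            node = node.setdefault(frag, {})
        node[leaf] = value  # leaf level stores the payload

    def lookup(self, key):
        node = self.root
        *inner, leaf = self._fragments(key)
        for frag in inner:
            node = node.get(frag)
            if node is None:
                return None
        return node.get(leaf)

tree = PrefixTree()
tree.insert(0xDEADBEEF, "payload")
print(tree.lookup(0xDEADBEEF))  # -> "payload"
```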
|
130 |
Leveraging Non-Volatile Memory in Modern Storage Management Architectures / Lersch, Lucas (14 May 2021)
Non-volatile memory technologies (NVM) introduce a novel class of devices that combine characteristics of both storage and main memory. Like storage, NVM is not only persistent but also denser and cheaper than DRAM. Like DRAM, NVM is byte-addressable and has lower access latency. In recent years, NVM has gained a lot of attention both in academia and in the data management industry, with views ranging from skepticism to over-excitement. Some critics claim that NVM is neither cheap enough to replace flash-based SSDs nor fast enough to replace DRAM, while others see it simply as a storage device. Supporters of NVM have observed that its low latency and byte-addressability require radical changes and a complete rewrite of storage management architectures.
This thesis takes a moderate stance between these two views. We consider that, while NVM might not replace flash-based SSDs or DRAM in the near future, it has the potential to reduce the gap between them. Furthermore, treating NVM as a regular storage medium does not fully leverage its byte-addressability and low latency. On the other hand, completely redesigning systems to be NVM-centric is impractical. Proposals that attempt to leverage NVM to simplify storage management result in completely new architectures that face the same challenges that are already well understood and addressed by the traditional architectures. Therefore, we take three common storage management architectures as a starting point and propose incremental changes to enable them to better leverage NVM. First, in the context of log-structured merge-trees, we investigate the impact of storing data in NVM and devise methods to enable small-granularity accesses and NVM-aware caching policies. Second, in the context of B+Trees, we propose to extend the buffer pool and describe a technique based on the concept of optimistic consistency to handle corrupted pages in NVM. Third, we employ NVM to enable larger capacity and reduced costs in an index+log key-value store, and combine it with other techniques to build a system that achieves low tail latency. This thesis describes and evaluates these techniques in order to enable storage management architectures to leverage NVM and achieve increased performance and lower costs, without major architectural changes. A small illustrative sketch of the optimistic-consistency idea follows the thesis outline below.

1 Introduction
1.1 Non-Volatile Memory
1.2 Challenges
1.3 Non-Volatile Memory & Database Systems
1.4 Contributions and Outline
2 Background
2.1 Non-Volatile Memory
2.1.1 Types of NVM
2.1.2 Access Modes
2.1.3 Byte-addressability and Persistency
2.1.4 Performance
2.2 Related Work
2.3 Case Study: Persistent Tree Structures
2.3.1 Persistent Trees
2.3.2 Evaluation
3 Log-Structured Merge-Trees
3.1 LSM and NVM
3.2 LSM Architecture
3.2.1 LevelDB
3.3 Persistent Memory Environment
3.4 2Q Cache Policy for NVM
3.5 Evaluation
3.5.1 Write Performance
3.5.2 Read Performance
3.5.3 Mixed Workloads
3.6 Additional Case Study: RocksDB
3.6.1 Evaluation
4 B+Trees
4.1 B+Tree and NVM
4.1.1 Category #1: Buffer Extension
4.1.2 Category #2: DRAM Buffered Access
4.1.3 Category #3: Persistent Trees
4.2 Persistent Buffer Pool with Optimistic Consistency
4.2.1 Architecture and Assumptions
4.2.2 Embracing Corruption
4.3 Detecting Corruption
4.3.1 Embracing Corruption
4.4 Repairing Corruptions
4.5 Performance Evaluation and Expectations
4.5.1 Checksums Overhead
4.5.2 Runtime and Recovery
4.6 Discussion
5 Index+Log Key-Value Stores
5.1 The Case for Tail Latency
5.2 Goals and Overview
5.3 Execution Model
5.3.1 Reactive Systems and Actor Model
5.3.2 Message-Passing Communication
5.3.3 Cooperative Multitasking
5.4 Log-Structured Storage
5.5 Networking
5.6 Implementation Details
5.6.1 NVM Allocation on RStore
5.6.2 Log-Structured Storage and Indexing
5.6.3 Garbage Collection
5.6.4 Logging and Recovery
5.7 Systems Operations
5.8 Evaluation
5.8.1 Methodology
5.8.2 Environment
5.8.3 Other Systems
5.8.4 Throughput Scalability
5.8.5 Tail Latency
5.8.6 Scans
5.8.7 Memory Consumption
5.9 Related Work
6 Conclusion
Bibliography
A PiBench
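To make the optimistic-consistency idea from Chapter 4 more tangible, here is a heavily simplified Python sketch: pages carry checksums, corruption (e.g., from a torn write) is detected lazily on read, and the page is repaired from a consistent copy. The names and the repair path are illustrative assumptions, not the thesis's actual design.

```python
import zlib

def write_page(nvm, page_id, payload: bytes):
    # Store a checksum alongside the payload; a torn or partial write
    # will later fail verification instead of being silently trusted.
    nvm[page_id] = (zlib.crc32(payload), payload)

def read_page(nvm, page_id, fetch_consistent_copy):
    checksum, payload = nvm[page_id]
    if zlib.crc32(payload) != checksum:
        # "Embrace corruption": detect it on access and repair the page
        # from a consistent source (e.g., the log or a storage copy).
        payload = fetch_consistent_copy(page_id)
        write_page(nvm, page_id, payload)
    return payload

nvm = {}
write_page(nvm, 1, b"tuple data")
nvm[1] = (nvm[1][0], b"tuple dama")                   # simulate a torn write
print(read_page(nvm, 1, lambda pid: b"tuple data"))   # detected and repaired
```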
|