Dynamically tuning LSM tree based databases

Log-Structured Merge (LSM) trees are a popular choice of data structure for key-value database systems due to their high ingestion rate and fast reads. They achieve this by appending new writes and updates sequentially and buffering changes in memory before flushing them to disk in sorted order. The behavior of an LSM tree can be altered dynamically through a large set of tunable parameters to accommodate a wide range of workloads. While these parameters provide flexibility, identifying the values that maximize system performance is a known hard problem. Offline tuning approaches can provide optimal configurations, but they require a priori knowledge of the workload, the evaluation of hundreds of configurations, or both, so they lack the flexibility to adapt to evolving conditions. In the online setting, evaluating the performance impact of a particular tuning knob can be expensive.
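
As an illustration of the write path described above, the following is a minimal in-memory sketch, not code from the thesis or from RocksDB. The class name `TinyLSM` and the `buffer_capacity` knob are hypothetical, and the sketch omits compaction, persistence, and deletes; it only shows writes being buffered in memory and flushed as sorted runs, with reads consulting the newest data first.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: an in-memory buffer plus a list of sorted, immutable runs."""

    def __init__(self, buffer_capacity=4):
        # buffer_capacity is a hypothetical tuning knob (entries held before a flush).
        self.buffer_capacity = buffer_capacity
        self.memtable = {}   # newest values, unsorted
        self.runs = []       # flushed runs, oldest first; each is a sorted list of (key, value)

    def put(self, key, value):
        # Writes and updates only touch the in-memory buffer.
        self.memtable[key] = value
        if len(self.memtable) >= self.buffer_capacity:
            self.flush()

    def flush(self):
        # Emit the buffer as one sorted run (stands in for a sequential disk write).
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Check the buffer first, then the runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


if __name__ == "__main__":
    db = TinyLSM(buffer_capacity=2)
    db.put("b", 1)
    db.put("a", 2)   # buffer is full, so it is flushed as a sorted run
    db.put("a", 3)   # newer value shadows the flushed one
    print(db.get("a"), db.get("b"), db.runs)
```

Even in this toy version, a single knob such as the buffer size changes how often flushes happen and how many runs a read may have to search, which is the kind of trade-off the tunable parameters control.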

To this end, we propose Onix, a tuning framework that tackles the online setting of the tuning problem, specifically dynamically tuning LSM trees using Bayesian Optimization (BO). BO constructs a probabilistic model to navigate the space of tuning knobs, striking a careful balance between exploring uncharted parameter configurations and exploiting areas already identified as promising. We leverage BO’s efficient convergence to minimize the number of configurations deployed for exploration. Onix integrates Microsoft’s BO-based system tuning framework, MLOS, with Meta’s state-of-the-art LSM tree implementation, RocksDB. As workloads are executed on RocksDB, Onix propagates the appropriate information to MLOS, which in turn recommends a configuration for the current workload. This process is repeated periodically, so that Onix re-evaluates whether a new tuning suggestion (i.e., configuration) can provide better performance. In the best-case scenario, Onix achieves up to 2× better performance (in terms of average read latency) than the default configuration, while in the worst case it performs at least as well as the default tuning, guaranteeing no performance regression.
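
The explore/exploit loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Onix or MLOS implementation: it tunes a single knob normalized to [0, 1], uses a small hand-written Gaussian-process model with an RBF kernel and a lower-confidence-bound rule, and replaces RocksDB with a hypothetical synthetic function `simulated_latency`. All function names and constants are illustrative.

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two knob values."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def posterior(xs, ys, x, noise=1e-4):
    """GP posterior mean and standard deviation at x (prior mean = mean of ys)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x) for a in xs]
    mean_y = sum(ys) / len(ys)
    alpha = solve(K, [y - mean_y for y in ys])
    mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve(K, k)
    var = max(rbf(x, x) - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mu, math.sqrt(var)


def suggest(xs, ys, candidates, kappa=2.0):
    """Pick the candidate with the lowest confidence bound (latency is minimized)."""
    def lcb(x):
        mu, sigma = posterior(xs, ys, x)
        return mu - kappa * sigma   # low mean = exploit, high sigma = explore
    return min(candidates, key=lcb)


def tune(measure, default, rounds=10):
    """Periodically try a suggested knob value; keep the best configuration seen."""
    xs, ys = [default], [measure(default)]
    best_x, best_y = xs[0], ys[0]
    grid = [i / 20 for i in range(21)]
    for _ in range(rounds):
        candidates = [c for c in grid if c not in xs]   # never re-test a value
        x = suggest(xs, ys, candidates)
        y = measure(x)
        xs.append(x)
        ys.append(y)
        if y < best_y:               # adopt a suggestion only if it improves latency
            best_x, best_y = x, y
    return best_x, best_y


def simulated_latency(x):
    """Hypothetical stand-in for measuring average read latency at knob value x."""
    return (x - 0.7) ** 2 + 1.0


if __name__ == "__main__":
    print(tune(simulated_latency, default=0.0))
```

Because the loop starts from the default configuration and only replaces the incumbent when a measured latency is lower, the returned configuration is never worse than the default under this synthetic measurement, which mirrors the no-regression property claimed in the abstract.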

https://hdl.handle.net/2144/49065

Computer science

Databases

Identifier	oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/49065
Date	02 July 2024
Creators	Sharma, Sakshi
Contributors	Athanassoulis, Manos
Source Sets	Boston University
Language	en_US
Detected Language	English
Type	Thesis/Dissertation
Rights	Attribution-NonCommercial-ShareAlike 4.0 International, http://creativecommons.org/licenses/by-nc-sa/4.0/
