Cardinality estimation is a fundamental task in database query processing and optimization. As shown in recent papers, machine learning (ML)-based approaches can deliver more accurate cardinality estimates than traditional approaches. However, a large number of training queries has to be executed during the model training phase to learn a data-dependent ML model, which makes this phase very time-consuming. Many of these training or example queries use the same base data, share the same query structure, and differ only in their selection predicates. To speed up the model training phase, our core idea is to determine a predicate-independent pre-aggregation of the base data and to execute the example queries over this pre-aggregated data. Based on this idea, we present a specific aggregate-based training phase for ML-based cardinality estimation approaches in this paper. As we show with different workloads in our evaluation, we achieve an average speedup of 90 with our aggregate-based training phase and thus outperform indexes.
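To illustrate the core idea described in the abstract, the following is a minimal sketch in a pandas-style workflow: the base data is aggregated once over the columns that training predicates may touch, and the true cardinality of each example query is then derived from this much smaller aggregate rather than from the base table. The names base, agg, and cardinality are illustrative and not taken from the paper.

```python
import pandas as pd

# Hypothetical base table with two predicate columns (illustrative data only).
base = pd.DataFrame({
    "a": [1, 1, 2, 2, 3, 3, 3],
    "b": [10, 10, 10, 20, 20, 20, 30],
})

# Predicate-independent pre-aggregation: group once over all columns that
# training predicates may reference and store the tuple count per group.
agg = base.groupby(["a", "b"]).size().reset_index(name="cnt")

def cardinality(predicate):
    """Sum the counts of the aggregate groups matching the predicate.

    Each training query differs only in its predicate, so its true
    cardinality (the training label) can be computed from the aggregate
    instead of scanning the base data again.
    """
    return int(agg.loc[predicate(agg), "cnt"].sum())

# Example training query: SELECT COUNT(*) FROM base WHERE a = 3 AND b <= 20
label = cardinality(lambda t: (t["a"] == 3) & (t["b"] <= 20))
print(label)  # 2
```

In this sketch, every additional training query reuses the same aggregate, which is where the speedup of the aggregate-based training phase comes from in principle; the paper's actual construction and evaluation are given in the referenced article.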
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:89177 |
Date | 22 April 2024 |
Creators | Woltmann, Lucas, Hartmann, Claudio, Lehner, Wolfgang, Habich, Dirk |
Publisher | Springer |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/publishedVersion, doc-type:article, info:eu-repo/semantics/article, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |
Relation | 1610-1995, 10.1007/s13222-021-00400-z |