Storing and processing data at different locations using a heterogeneous set of formats and data management systems is common practice in many organizations. However, data analyses can often provide better insight when data from several sources is integrated into a combined perspective. In this paper, we present an overview of our data integration system DataCalc. DataCalc is an extensible integration platform that executes ad hoc analytical queries on a set of heterogeneous data processors. Our novel platform uses an expressive function shipping interface that promotes local computation and reduces data movement between processors. We discuss the overall architecture and the main components of DataCalc. Moreover, we discuss the cost of integrating additional processors and evaluate the overall performance of the platform.
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:86499 |
Date | 19 July 2023 |
Creators | Luong, Johannes; Habich, Dirk; Lehner, Wolfgang |
Publisher | IEEE |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/acceptedVersion, doc-type:conferenceObject, info:eu-repo/semantics/conferenceObject, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |
Relation | 978-1-7281-0858-2, 10.1109/BigData47090.2019.9006252 |