Today, data warehouses are used to store large amounts of data. This thesis investigates the impact of different database schema designs on query execution time within the cloud platform Azure Data Explorer. Because Azure Data Explorer is a relatively new platform, limited research exists on designing database schemas for it. The design of the database schema has a direct impact on query execution times, and it should also align with the use case of the data warehouse. This thesis conducts a requirements analysis, determines the use case, and designs three database schemas, which are then implemented and evaluated through a performance test. Schema 1 is designed to utilize results tables from stored functions, while schema 2 utilizes sub-functions divided by department or product to minimize the data accessed per query. Schema 3 uses results tables from the sub-functions found in schema 2. The results of the performance tests show that schema 3 achieves the best overall improvement in query execution time compared to the other designs and the original design. The findings emphasize the critical role of database schema design in influencing query performance. Additionally, the thesis concludes that combining more than one approach to enhance query performance increases the potential performance gain.
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-203823 |
Date | January 2024 |
Creators | Petersson, Linn; Ferlin, Angelica |
Publisher | Linköpings universitet, Institutionen för datavetenskap |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |