Global ETD Search

1	Smart Cube Predictions for Online Analytic Query Processing in Data Warehouses
Belcin, Andrei, 01 April 2021
A data warehouse (DW) integrates many sources of transactional data into a single non-volatile, time-variant collection that provides decision support to managerial roles within an organization. For this application, the database server needs to process multiple users’ queries by joining various datasets and loading the result into main memory to begin calculations. In current systems, this process is reactive to users’ input and can be undesirably slow. Previous studies showed that building a personalization scheme from a single user’s query patterns and loading the resulting smaller subset into main memory significantly shortened the query response time. The LPCDA framework developed in this research handles multiple users’ query demands, where the query patterns are subject to change (so-called concept drift) and noise. To this end, the LPCDA framework detects changes in user behaviour and dynamically adapts the personalized smart cube definition for the group of users. Numerous data marts (DMs), as components of the DW, are subject to intense aggregations to assist analytics at the request of automated systems and human users’ queries. Consequently, there is a growing need to properly manage the supply of data into main memory, which is in closest proximity to the CPU that computes the query, in order to reduce the response time from the moment a query arrives at the DW server. As a result, this thesis proposes an end-to-end adaptive learning ensemble for resource allocation of cuboids within a DM to achieve a relevant and timely constructed smart cube before it is needed, as a way of adopting the just-in-time inventory management strategy applied in other real-world scenarios.
The algorithms comprising the ensemble involve predictive methodologies from Bayesian statistics, data mining, and machine learning that reflect the changes in the data-generating process using a number of change detection algorithms. Therefore, given different operational constraints and data-specific considerations, the ensemble can, to an effective degree, determine the cuboids in the lattice of a DM to pre-construct into a smart cube ahead of users submitting their queries, thereby benefiting from a quicker response than static schema views or no action at all.
Machine Learning, Concept Drift, Data Warehouse, Smart Cube, OLAP, Predictive Modelling
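The idea of predicting cuboids ahead of time while reacting to concept drift can be illustrated with a small sketch. This is not the LPCDA framework or the thesis's ensemble: the class name `CuboidPredictor`, the window sizes, the threshold, and the use of total variation distance as the change detector are all assumptions chosen only to make the idea concrete.

```python
from collections import Counter, deque


def total_variation(p: Counter, q: Counter) -> float:
    """Total variation distance between two empirical count distributions."""
    n_p, n_q = sum(p.values()), sum(q.values())
    if n_p == 0 or n_q == 0:
        return 0.0
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p[k] / n_p - q[k] / n_q) for k in keys)


class CuboidPredictor:
    """Hypothetical predictor: tracks which cuboids recent queries touched."""

    def __init__(self, window=50, recent=10, threshold=0.5):
        self.history = deque(maxlen=window)  # sliding window of observed cuboids
        self.recent = recent                 # size of the "new behaviour" sample
        self.threshold = threshold           # assumed drift threshold

    def observe(self, cuboid) -> bool:
        """Record the cuboid a query needed; return True if drift was detected."""
        self.history.append(cuboid)
        if len(self.history) < 2 * self.recent:
            return False
        items = list(self.history)
        old = Counter(items[:-self.recent])
        new = Counter(items[-self.recent:])
        if total_variation(old, new) > self.threshold:
            # Drift: forget the stale part of the window, keep recent behaviour.
            self.history = deque(items[-self.recent:], maxlen=self.history.maxlen)
            return True
        return False

    def predict(self, k=2):
        """Return the k cuboids most worth pre-constructing, by frequency."""
        return [c for c, _ in Counter(self.history).most_common(k)]
```

In this sketch a cuboid is any hashable value, for example a tuple of dimension names such as `("region", "month")`. Calling `observe` for each incoming query and `predict` before the next batch gives the set of cuboids to load into main memory in advance.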
2	A Personalized Smart Cube for Faster and Reliable Access to Data
Antwi, Daniel K., 02 December 2013
Organizations own data sources that contain millions, billions or even trillions of rows, and these data are usually highly dimensional in nature. Typically, these raw repositories comprise numerous independent data sources that are too big to be copied or joined, with the consequence that aggregations become highly problematic. Data cubes play an essential role in facilitating fast Online Analytical Processing (OLAP) in many multi-dimensional data warehouses. Current data cube computation techniques have had some success in addressing the above-mentioned aggregation problem. However, the combined problem of reducing data cube size for very large and highly dimensional databases, while guaranteeing fast query response times, has received less attention. Another issue is that most OLAP tools often cause users to be lost in the ocean of data while performing data analysis. Often, most users are interested in only a subset of the data. For example, consider a business manager who wants to answer the crucial location-related business question, "Why are my sales declining at location X?" This manager wants fast, unambiguous, location-aware answers to his queries. He requires access to only the relevant filtered information, as found from the attributes that are directly correlated with his current needs. Therefore, it is important to determine and to extract only that small data subset that is highly relevant from a particular user's location and perspective. In this thesis, we present the Personalized Smart Cube approach to address the above-mentioned scenario. Our approach consists of two main parts. Firstly, we combine vertical partitioning, partial materialization and dynamic computation to drastically reduce the size of the computed data cube while guaranteeing fast query response times.
Secondly, our personalization algorithm dynamically monitors user query patterns and creates a personalized data cube for each user. This ensures that users utilize only that small subset of data that is most relevant to them. Our experimental evaluation of the Personalized Smart Cube approach showed that our work compared favorably with other state-of-the-art methods. We evaluated our work focusing on three main criteria, namely the storage space used, the query response time and the cost savings ratio of using a personalized cube. The results showed that our algorithm materializes a relatively smaller number of views than other techniques, and it also compared favourably in terms of query response time. Further, our personalization algorithm is superior to the state-of-the-art Virtual Cube algorithm when evaluated in terms of the number of user queries that were successfully answered when using a personalized cube instead of the base cube.
Data Cube, Dynamic Data Cube, Personalized Cube, Smart Cube, Smart Data Cube, Data Cube partitioning, Small Data cube
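The personalization step and the "queries answered from the personalized cube" criterion can be sketched as follows. This is only an illustration, not the thesis's algorithm: the function names `personalize`, `answerable` and `hit_ratio`, the frequency-based selection, and the rule that a query is answerable from any materialized cuboid whose dimensions are a superset of the query's (which assumes roll-up with distributive aggregates such as SUM or COUNT) are all assumptions.

```python
from collections import Counter


def personalize(query_log, budget):
    """Pick the `budget` cuboids a user requested most often (assumed policy).

    Each query is an iterable of dimension names; a cuboid is a frozenset.
    """
    counts = Counter(frozenset(q) for q in query_log)
    return [cuboid for cuboid, _ in counts.most_common(budget)]


def answerable(query, cube):
    """A query can be rolled up from any cuboid containing all its dimensions."""
    dims = frozenset(query)
    return any(dims <= cuboid for cuboid in cube)


def hit_ratio(queries, cube):
    """Fraction of queries answerable without going back to the base cube."""
    queries = list(queries)
    if not queries:
        return 0.0
    return sum(answerable(q, cube) for q in queries) / len(queries)


# Hypothetical query log for one user.
log = (
    [("location", "month")] * 3
    + [("location",)] * 2
    + [("product", "month")]
)
cube = personalize(log, budget=1)
print(cube, hit_ratio(log, cube))
```

With a budget of one cuboid, the sketch materializes only `{location, month}`, yet still answers the coarser `location` queries by roll-up; only the `product` query would fall through to the base cube.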
3	A Personalized Smart Cube for Faster and Reliable Access to Data
Antwi, Daniel K., January 2013
Abstract and keywords identical to entry 2.

