1 |
Smart Cube Predictions for Online Analytic Query Processing in Data Warehouses
Belcin, Andrei 01 April 2021
A data warehouse (DW) is a transformation of many sources of transactional data integrated into a single collection that is non-volatile and time-variant, and that can provide decision support to managerial roles within an organization. For this application, the database server needs to process multiple users’ queries by joining various datasets and loading the result into main memory to begin calculations. In current systems, this process is reactive to users’ input and can be undesirably slow. Previous studies showed that personalizing to a single user’s query patterns, and loading the resulting smaller subset into main memory, significantly shortened the query response time. The LPCDA framework developed in this research handles multiple users’ query demands, where the query patterns are subject to change (so-called concept drift) and noise. To this end, the LPCDA framework detects changes in user behaviour and dynamically adapts the personalized smart cube definition for the group of users.
Numerous data marts (DMs), as components of the DW, are subject to intense aggregations to assist analytics at the request of automated systems and human users’ queries. Consequently, there is a growing need to properly manage the supply of data into main memory, which is in closest proximity to the CPU that computes the query, in order to reduce the response time from the moment a query arrives at the DW server. As a result, this thesis proposes an end-to-end adaptive learning ensemble for resource allocation of cuboids within a DM, so that a relevant smart cube is constructed before it is needed, as a way of adopting the just-in-time inventory management strategy applied in other real-world scenarios.
The algorithms comprising the ensemble involve predictive methodologies from Bayesian statistics, data mining, and machine learning that reflect the changes in the data-generating process using a number of change detection algorithms. Therefore, given different operational constraints and data-specific considerations, the ensemble can, to an effective degree, determine the cuboids in the lattice of a DM to pre-construct into a smart cube ahead of users submitting their queries, thereby benefiting from a quicker response than static schema views or no action at all.
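To make the "lattice of cuboids" concrete, the following is a minimal sketch and not the LPCDA implementation. It assumes a data mart with three hypothetical dimensions (time, product, location), enumerates every cuboid as a subset of those dimensions, and checks whether a pre-constructed cuboid can answer a query by further aggregation. The function names are illustrative assumptions.

```python
from itertools import combinations


def cuboid_lattice(dimensions):
    """Return every cuboid (subset of dimensions), from the apex () to the base cuboid."""
    dims = tuple(dimensions)
    return [combo for k in range(len(dims) + 1) for combo in combinations(dims, k)]


def can_answer(materialized, query):
    """A materialized cuboid can answer a query whose group-by
    dimensions are a subset of the cuboid's own dimensions."""
    return set(query) <= set(materialized)


# Hypothetical dimensions, chosen only for illustration.
lattice = cuboid_lattice(["time", "product", "location"])
print(len(lattice))  # 8 cuboids for 3 dimensions (2 ** 3)
```

With d dimensions the lattice has 2^d cuboids, which is why choosing a small subset to pre-construct matters as d grows.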
|
2 |
A Personalized Smart Cube for Faster and Reliable Access to Data
Antwi, Daniel K. 02 December 2013
Organizations own data sources that contain millions, billions or even trillions of rows
and these data are usually highly dimensional in nature. Typically, these raw repositories
are comprised of numerous independent data sources that are too big to be copied or
joined, with the consequence that aggregations become highly problematic. Data cubes
play an essential role in facilitating fast Online Analytical Processing (OLAP) in many
multi-dimensional data warehouses. Current data cube computation techniques have
had some success in addressing the above-mentioned aggregation problem. However,
the combined problem of reducing data cube size for very large and highly dimensional
databases, while guaranteeing fast query response times, has received less attention.
Another issue is that most OLAP tools often cause users to be lost in the ocean of
data while performing data analysis. Often, most users are interested in only a subset
of the data. For example, consider a business manager who wants
to answer the crucial location-related business question, "Why are my sales declining
at location X?" This manager wants fast, unambiguous location-aware answers to his
queries. He requires access to only the relevant filtered information, as found from the
attributes that are directly correlated with his current needs. Therefore, it is important
to determine and to extract only that small data subset that is highly relevant from a
particular user's location and perspective.
In this thesis, we present the Personalized Smart Cube approach to address the above-mentioned scenario. Our approach consists of two main parts. Firstly, we combine
vertical partitioning, partial materialization and dynamic computation to drastically
reduce the size of the computed data cube while guaranteeing fast query response times.
Secondly, our personalization algorithm dynamically monitors user query patterns and
creates a personalized data cube for each user. This ensures that users utilize only that
small subset of data that is most relevant to them.
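As a rough illustration of monitoring a user's query pattern, the sketch below keeps a sliding window of the dimension sets a user has recently queried and reports the most frequent ones as candidates for a personalized cube. This is a simplified frequency-count stand-in, not the thesis's personalization algorithm; the class name, window size, and top_k parameter are assumptions made for the example.

```python
from collections import Counter, deque


class QueryPatternMonitor:
    """Track one user's recent queries and suggest cuboids to keep ready."""

    def __init__(self, window=100, top_k=2):
        # Only the last `window` queries are kept, so old habits age out.
        self.recent = deque(maxlen=window)
        self.top_k = top_k

    def record(self, dims):
        """Record the group-by dimensions of one incoming query."""
        self.recent.append(frozenset(dims))

    def personalized_cuboids(self):
        """Return the top_k most frequently queried dimension sets."""
        counts = Counter(self.recent)
        return [set(c) for c, _ in counts.most_common(self.top_k)]


monitor = QueryPatternMonitor(window=50, top_k=1)
for _ in range(3):
    monitor.record(["location", "time"])
monitor.record(["product"])
print(monitor.personalized_cuboids())
```

The bounded window is one simple way to let the suggested cuboids follow a user whose interests shift over time.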
Our experimental evaluation of our Personalized Smart Cube approach showed that
our work compared favorably with other state-of-the-art methods. We evaluated our
work focusing on three main criteria, namely the storage space used, query response
time and the cost savings ratio of using a personalized cube. The results showed that our
algorithm materializes a smaller number of views than other techniques and it
also compared favourably in terms of query response time. Further, our personalization
algorithm is superior to the state-of-the-art Virtual Cube algorithm, when evaluated
in terms of the number of user queries that were successfully answered when using a
personalized cube, instead of the base cube.
|
3 |
A Personalized Smart Cube for Faster and Reliable Access to Data
Antwi, Daniel K. January 2013
|