• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A Recursive Relative Prefix Sum Approach to Range Queries in Data Warehouses

Wu¡@, Fa-Jung 07 July 2002 (has links)
Data warehouses contain data consolidated from several operational databases and provide the historical, and summarized data which is more appropriate for analysis than detail, individual records. On-Line Analytical Processing (OLAP) provides advanced analysis tools to extract information from data stored in a Data Warehouse. OLAP is designed to provide aggregate information that can be used to analyze the contents of databases and data warehouses. A range query applies an aggregation operation over all selected cells of an OLAP data cube where the selection is specified by providing ranges of values for numeric dimensions. Range sum queries are very useful in finding trends and in discovering relationships between attributes in the database. There is a method, prefix sum method, promises that any range sum query on a data cube can be answered in constant time by precomputing some auxiliary information. However, it is hampered by its update cost. For today's applications, interactive data analysis applications which provide current or "near current" information will require fast response time and have reasonable update time. Since the size of a data cube is exponential in the number of its dimensions, rebuilding the entire data cube can be very costly and is not realistic. To cope with this dynamic data cube problem, several strategies have been proposed. They all use specific data structures, which require extra storage cost, to response range sum query fast. For example, the double relative prefix sum method makes use of three components: a block prefix array, a relative overlay array and a relative prefix array to store auxiliary information. Although the double relative prefix sum method improves the update cost, it increases the query time. In the thesis, we present a method, called the recursive relative prefix sum method, which tries to provide a compromise between query and update cost. In the recursive relative prefix sum method with k levels, we use a relative prefix array and k relative overlay arrays. From our performance study, we show that the update cost of our method is always less than that of the prefix sum method. In most of cases, the update cost of our method is less than that of the relative prefix sum method. Moreover, in most of cases, the query cost of our method is less than that of the double relative prefix sum method. Compared with the dynamic data cube method, our method has lower storage cost and shorter query time. Consequently, our recursive relative prefix sum method has a reasonable response time for ad hoc range queries on the data cube, while at the same time, greatly reduces the update cost. In some applications, however, updating in some regions may happen more frequently than others. We also provide a solution, called the weighted relative prefix sum} method, for this situation. Therefore, this method can also provide a compromise between the range sum query cost and the update cost, when the update probabilities of different regions are considered.

Page generated in 0.169 seconds