
Probabilistic skylines on uncertain data

Skyline analysis is important for multi-criteria decision-making applications. The data in some of these applications are inherently uncertain due to various factors. Although a considerable amount of research has been dedicated separately to efficient skyline computation, and to modeling uncertain data and answering some types of queries on uncertain data, how to conduct skyline analysis on uncertain data remains largely an open problem. In this thesis, we tackle the problem of skyline analysis on uncertain data. We propose a novel probabilistic skyline model in which an uncertain object has a probability of being in the skyline, and a p-skyline contains all the objects whose skyline probabilities are at least p. Computing probabilistic skylines on large uncertain data sets is challenging. An uncertain object is conceptually described by a probability density function (PDF) in the continuous case, or, in the discrete case, by a set of instances (points) such that each instance has a probability of appearing. We develop two efficient algorithms, the bottom-up and top-down algorithms, for computing the p-skyline of a set of uncertain objects in the discrete case. We also discuss how our techniques can be applied to the continuous case. The bottom-up algorithm computes the skyline probabilities of some selected instances of uncertain objects, and uses those instances to prune other instances and uncertain objects effectively. The top-down algorithm recursively partitions the instances of uncertain objects into subsets, and prunes subsets and objects aggressively. Our experimental results on both the real NBA player data set and the benchmark synthetic data sets show that probabilistic skylines are interesting and useful, and that our two algorithms are efficient on large data sets and complementary to each other in performance.
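To make the discrete-case model concrete, the following is a minimal brute-force sketch of a p-skyline computation. It is not the bottom-up or top-down algorithm from the thesis, only a naive baseline with no pruning. It rests on assumptions the abstract does not state: uncertain objects are independent of one another, smaller values are preferred in every dimension, and each object's instance probabilities sum to 1. All names (`dominates`, `skyline_probability`, `p_skyline`) and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]
# An uncertain object in the discrete case: (instance, probability) pairs.
# Assumption: the probabilities of one object's instances sum to 1.
UncertainObject = List[Tuple[Point, float]]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline_probability(name: str, objects: Dict[str, UncertainObject]) -> float:
    """Probability that the named object is in the skyline.

    Assumption: objects are independent, so the chance that an instance u is
    not dominated is the product, over every other object, of the probability
    mass of that object's instances that do not dominate u.
    """
    total = 0.0
    for u, p_u in objects[name]:
        not_dominated = 1.0
        for other, instances in objects.items():
            if other == name:
                continue
            dominating_mass = sum(p_v for v, p_v in instances if dominates(v, u))
            not_dominated *= 1.0 - dominating_mass
        total += p_u * not_dominated
    return total


def p_skyline(objects: Dict[str, UncertainObject], p: float) -> List[str]:
    """Names of all objects whose skyline probability is at least p."""
    return [name for name in objects if skyline_probability(name, objects) >= p]


# Hypothetical two-dimensional data, for illustration only.
objects = {
    "A": [((1.0, 1.0), 0.5), ((3.0, 3.0), 0.5)],
    "B": [((2.0, 2.0), 1.0)],
    "C": [((0.5, 4.0), 1.0)],
}
print(p_skyline(objects, 0.6))
```

In this sample, A is in the skyline only when it appears as (1, 1), and B only when A appears as (3, 3), so each has skyline probability 0.5, while C is never dominated. The brute-force approach compares every instance against every instance of every other object, which is the cost that pruning-based algorithms such as those described above aim to avoid.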

Identifier: oai:union.ndltd.org:ADTP/282693
Date: January 2007
Creators: Jiang, Bin, Computer Science & Engineering, Faculty of Engineering, UNSW
Source Sets: Australasian Digital Theses Program
Language: English
Detected Language: English
Rights: http://unsworks.unsw.edu.au/copyright
