Nearest Neighbor (NN in short) queries are important in emerging applications,
such as wireless networks, location-based services, and data stream applications,
where the data obtained are often imprecise. The imprecision or imperfection of
the data sources is modeled by uncertain data in recent research works. Handling
uncertainty is important because this issue affects the quality of query answers.
Although queries on uncertain data are useful, evaluating the queries on them can
be costly, in terms of I/O or computational efficiency. In this thesis, we study how
to efficiently evaluate NN queries on uncertain data.
Given a query point q and a set of uncertain objects O, the possible nearest neighbor query returns a set of candidates which have non-zero probabilities to be the
query answer. It is also interesting to ask \which region has the same set of possible nearest neighbors", and \which region has one specific object as its possible
nearest neighbor". To reveal the relationship between the query space and nearest
neighbor answers, we propose the UV-diagram, where the query space is split into
disjoint partitions, such that each partition is associated with a set of objects. If a
query point is located inside the partition, its possible nearest neighbors could be
directly retrieved. However, the number of such partitions is exponential and the
construction effort can be expensive. To tackle this problem, we propose an alternative concept, called UV-cell, and efficient algorithms for constructing it. The UV-cell has an irregular shape, which incurs difficulties in storage, maintenance,
and query evaluation. We design an index structure, called UV-index, which is
an approximated version of the UV-diagram. Extensive experiments show that
the UV-index could efficiently answer different variants of NN queries, such as
Probabilistic Nearest Neighbor Queries, Continuous Probabilistic Nearest Neighbor
Queries.
Another problem studied in this thesis is the trajectory nearest neighbor query.
Here the query point is restricted to a pre-known trajectory. In applications (e.g.
monitoring potential threats along a flight/vessel's trajectory), it is useful to derive
nearest neighbors for all points on the query trajectory. Simple solutions, such as
sampling or approximating the locations of uncertain objects as points, fails to
achieve a good query quality. To handle this problem, we design efficient algorithms
and optimization methods for this query. Experiments show that our solution can
efficiently and accurately answer this query. Our solution is also scalable to large
datasets and long trajectories. / published_or_final_version / Computer Science / Doctoral / Doctor of Philosophy
Identifer | oai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/174512 |
Date | January 2012 |
Creators | Xie, Xike., 谢希科. |
Contributors | Cheng, CK |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Source Sets | Hong Kong University Theses |
Language | English |
Detected Language | English |
Type | PG_Thesis |
Source | http://hub.hku.hk/bib/B4784954X |
Rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works., Creative Commons: Attribution 3.0 Hong Kong License |
Relation | HKU Theses Online (HKUTO) |
Page generated in 0.0014 seconds