Return to search

Evaluating nearest neighbor queries over uncertain databases

Nearest Neighbor (NN in short) queries are important in emerging applications,

such as wireless networks, location-based services, and data stream applications,

where the data obtained are often imprecise. The imprecision or imperfection of

the data sources is modeled by uncertain data in recent research works. Handling

uncertainty is important because this issue affects the quality of query answers.

Although queries on uncertain data are useful, evaluating the queries on them can

be costly, in terms of I/O or computational efficiency. In this thesis, we study how

to efficiently evaluate NN queries on uncertain data.



Given a query point q and a set of uncertain objects O, the possible nearest neighbor query returns a set of candidates which have non-zero probabilities to be the

query answer. It is also interesting to ask \which region has the same set of possible nearest neighbors", and \which region has one specific object as its possible

nearest neighbor". To reveal the relationship between the query space and nearest

neighbor answers, we propose the UV-diagram, where the query space is split into

disjoint partitions, such that each partition is associated with a set of objects. If a

query point is located inside the partition, its possible nearest neighbors could be

directly retrieved. However, the number of such partitions is exponential and the

construction effort can be expensive. To tackle this problem, we propose an alternative concept, called UV-cell, and efficient algorithms for constructing it. The UV-cell has an irregular shape, which incurs difficulties in storage, maintenance,

and query evaluation. We design an index structure, called UV-index, which is

an approximated version of the UV-diagram. Extensive experiments show that

the UV-index could efficiently answer different variants of NN queries, such as

Probabilistic Nearest Neighbor Queries, Continuous Probabilistic Nearest Neighbor

Queries.



Another problem studied in this thesis is the trajectory nearest neighbor query.

Here the query point is restricted to a pre-known trajectory. In applications (e.g.

monitoring potential threats along a flight/vessel's trajectory), it is useful to derive

nearest neighbors for all points on the query trajectory. Simple solutions, such as

sampling or approximating the locations of uncertain objects as points, fails to

achieve a good query quality. To handle this problem, we design efficient algorithms

and optimization methods for this query. Experiments show that our solution can

efficiently and accurately answer this query. Our solution is also scalable to large

datasets and long trajectories. / published_or_final_version / Computer Science / Doctoral / Doctor of Philosophy

  1. 10.5353/th_b4784954
  2. b4784954
Identiferoai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/174512
Date January 2012
CreatorsXie, Xike., 谢希科.
ContributorsCheng, CK
PublisherThe University of Hong Kong (Pokfulam, Hong Kong)
Source SetsHong Kong University Theses
LanguageEnglish
Detected LanguageEnglish
TypePG_Thesis
Sourcehttp://hub.hku.hk/bib/B4784954X
RightsThe author retains all proprietary rights, (such as patent rights) and the right to use in future works., Creative Commons: Attribution 3.0 Hong Kong License
RelationHKU Theses Online (HKUTO)

Page generated in 0.0016 seconds