Advanced rank-aware queries and recommendation with novel types of data

We live in an era of rich data, rich not only in volume but also in the variety of its sources and content. Efficient search, management, and exploitation of data have been a major direction of database research for decades. In this thesis, three challenging problems are studied, targeting (i) time series data, (ii) user preference data, and (iii) location-based social network data, respectively, and efficient solutions are provided for the corresponding real-life applications.

First, durability queries, which identify objects that have durable quality over time, are studied in historical time series databases. For example, a sociologist may be interested in the top 10 web search terms during the period of some historical event; the police may seek vehicles that stay close to a suspect 70% of the time during a certain period. Such durable top-k (DTop-k) and durable k-nearest neighbor (DkNN) queries can be viewed as natural extensions of the standard snapshot top-k and NN queries to timestamped sequences of values or locations. Although their snapshot counterparts have been studied extensively, there is little prior work that addresses this new class of durability queries. Efficient and scalable algorithms are proposed based on novel indexing techniques.
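To make the query semantics concrete, here is a minimal brute-force sketch of a durable top-k query. It assumes that an object qualifies if it appears in the snapshot top-k for at least a given fraction of the timestamps in the query interval, which matches the "70% of the time" example above but is an interpretation, not a definition quoted from the thesis. The function name `durable_top_k`, its parameters, and the toy data are hypothetical, and this linear scan is not the index-based algorithm the thesis proposes.

```python
from collections import Counter

def durable_top_k(series, k, t_start, t_end, ratio):
    """Return objects that are in the snapshot top-k for at least
    `ratio` of the timestamps in the half-open interval [t_start, t_end).

    `series` maps an object name to a list of values, one per timestamp.
    Assumption: larger values rank higher; ties are broken by name.
    """
    counts = Counter()
    for t in range(t_start, t_end):
        # Snapshot top-k at timestamp t.
        ranked = sorted(series, key=lambda name: (-series[name][t], name))
        counts.update(ranked[:k])
    needed = ratio * (t_end - t_start)
    return sorted(name for name, c in counts.items() if c >= needed)

# Hypothetical toy data: three objects observed at four timestamps.
toy_series = {
    "a": [5, 5, 5, 1],
    "b": [4, 4, 1, 5],
    "c": [1, 1, 4, 4],
}
durable = durable_top_k(toy_series, k=2, t_start=0, t_end=4, ratio=0.75)
```

The scan costs one sort per timestamp, which is the kind of overhead that specialised indexing would aim to avoid on long histories.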

Next, an efficient solution to k-nearest neighbor search over top-m lists is investigated. A top-m list is a ranking of m items, typically representing some user’s preference over these items. For example, a user may have a list of her 10 most favourite books; the result from a search engine is typically a list of webpages ranked according to their relevance to some keywords. The search problem aims at extracting the k top-m lists from the database that are the “closest” to some query list, where closeness is evaluated with common measures such as Fagin’s intersection metric, Spearman’s footrule, and Kendall’s tau. Despite the importance of such queries, there is little prior work suggesting any efficient solution. In this thesis, a unified framework is proposed to answer such queries efficiently.
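As an illustration of the search problem, the sketch below computes a footrule-style distance between two top-m lists and answers a k-nearest-neighbor query by linear scan. It assumes one common adaptation of Spearman's footrule to top-m lists, in which an item missing from a list is treated as if it sat at position m + 1; the thesis may define its measures differently. The names `footrule_top_m` and `knn_lists` and the toy data are hypothetical, and the scan is a baseline, not the unified framework the thesis proposes.

```python
import heapq

def footrule_top_m(list_a, list_b):
    """Footrule-style distance between two ranked lists.

    Assumption: an item absent from a list is placed at position m + 1,
    where m is the longer list's length.
    """
    m = max(len(list_a), len(list_b))
    missing = m + 1
    pos_a = {item: i for i, item in enumerate(list_a, start=1)}
    pos_b = {item: i for i, item in enumerate(list_b, start=1)}
    items = pos_a.keys() | pos_b.keys()
    return sum(abs(pos_a.get(x, missing) - pos_b.get(x, missing)) for x in items)

def knn_lists(database, query, k):
    """Return the k lists in `database` closest to `query` (linear scan)."""
    return heapq.nsmallest(k, database, key=lambda lst: footrule_top_m(lst, query))

# Hypothetical toy database of top-3 lists.
toy_db = [["x", "y", "z"], ["b", "a", "c"], ["a", "b", "c"]]
nearest = knn_lists(toy_db, ["a", "b", "c"], k=2)
```

Every query here touches every list in the database, so the cost grows linearly with the database size.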

Finally, the problem of top-N venue recommendation in location-based social networks (LBSNs), which recommends new venues to users, is studied. As a growing number of users take part in LBSNs, the recommendation problem in this setting has attracted significant attention in research and in practical applications. The detailed information about past user behavior that is recorded by the LBSN differentiates the problem significantly from its traditional settings. The spatial nature of past user behavior, together with information about users' social interactions with other users, provides a richer background for building a more accurate and expressive recommendation model. Although there have been extensive studies on recommender systems working with user-item ratings, GPS trajectories, and other types of data, very few approaches exploit the unique properties of LBSN user check-in data. In this thesis, effective and efficient recommendation algorithms are proposed based on such properties. / published_or_final_version / Computer Science / Doctoral / Doctor of Philosophy

Identifier: oai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/206672
Date: January 2014
Creators: Wang, Hao, 王皓
Contributors: Mamoulis, N, Cheung, DWL
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Source Sets: Hong Kong University Theses
Language: English
Detected Language: English
Type: PG_Thesis
Rights: Creative Commons: Attribution 3.0 Hong Kong License. The author retains all proprietary rights (such as patent rights) and the right to use in future works.
Relation: HKU Theses Online (HKUTO)
