Global ETD Search


Mining multi-faceted data

Multi-faceted data contain objects of different types and the relationships between them. With the rapid growth of web-based services, multi-faceted data are increasingly common (e.g. Flickr, Yago, IMDB), offering richer information from which to infer users’ preferences and provide them with better services. In this study, we look at two types of multi-faceted data, social tagging systems and heterogeneous information networks, and at how to improve services such as resource retrieval and classification on them.

In social tagging systems, resources such as images and videos are annotated with descriptive words called tags. It has been shown that tag-based resource searching and retrieval is much more effective than content-based retrieval. With the advances in mobile technology, many resources are also geo-tagged with location information. We observe that a traditional tag (word) can carry different semantics at different locations. We study how location information can be used to help distinguish the different semantics of a resource’s tags and thus to improve retrieval accuracy. Given a search query, we propose a location-partitioning method that partitions all locations into regions such that the user query carries distinguishing semantics in each region. Based on the identified regions, we utilize location information in estimating the ranking scores of resources for the given query. These ranking scores are learned using the Bayesian Personalized Ranking (BPR) framework. We propose two algorithms, LTD and LPITF, which model the ranking score tensor using Tucker Decomposition and Pairwise Interaction Tensor Factorization, respectively. Through experiments on real datasets, we show that LTD and LPITF outperform other tag-based resource retrieval methods.
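To make the ranking idea concrete, the following is a minimal sketch of a BPR-style pairwise update over a pairwise-interaction score. It is not the thesis's LPITF algorithm: the score form (a tag-resource term plus a region-resource term), the class name `PairwiseScorer`, the method `bpr_step`, and all hyperparameters are assumptions chosen for illustration, and region ids are assumed to come from some earlier location-partitioning step.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class PairwiseScorer:
    """Hypothetical scorer: score = <tag, res_tag> + <region, res_region>."""

    def __init__(self, n_tags, n_regions, n_resources, dim=4, seed=0):
        rng = random.Random(seed)

        def init(n):
            return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n)]

        self.tag = init(n_tags)
        self.region = init(n_regions)
        self.res_tag = init(n_resources)      # resource factors paired with tags
        self.res_region = init(n_resources)   # resource factors paired with regions

    def score(self, tag, region, res):
        return (dot(self.tag[tag], self.res_tag[res])
                + dot(self.region[region], self.res_region[res]))

    def bpr_step(self, tag, region, pos, neg, lr=0.05, reg=0.01):
        """One SGD step on -ln sigmoid(score(pos) - score(neg)); assumes pos != neg."""
        diff = self.score(tag, region, pos) - self.score(tag, region, neg)
        g = 1.0 - sigmoid(diff)  # derivative of ln sigmoid(diff) w.r.t. diff
        for f in range(len(self.tag[tag])):
            # Read all old values first so every update uses the same snapshot.
            t, rg = self.tag[tag][f], self.region[region][f]
            pt, nt = self.res_tag[pos][f], self.res_tag[neg][f]
            pr, nr = self.res_region[pos][f], self.res_region[neg][f]
            self.tag[tag][f] += lr * (g * (pt - nt) - reg * t)
            self.region[region][f] += lr * (g * (pr - nr) - reg * rg)
            self.res_tag[pos][f] += lr * (g * t - reg * pt)
            self.res_tag[neg][f] += lr * (-g * t - reg * nt)
            self.res_region[pos][f] += lr * (g * rg - reg * pr)
            self.res_region[neg][f] += lr * (-g * rg - reg * nr)
        return -math.log(sigmoid(diff))


# Toy data (made up): the same tag 0 prefers resource 0 in region 0
# and resource 1 in region 1, i.e. its meaning depends on location.
model = PairwiseScorer(n_tags=1, n_regions=2, n_resources=2)
for _ in range(1000):
    model.bpr_step(tag=0, region=0, pos=0, neg=1)
    model.bpr_step(tag=0, region=1, pos=1, neg=0)
```

In this toy setup the tag factors alone cannot separate the two preferences, so the region factors have to carry the location-dependent part of the score, which is the intuition behind using regions in the ranking model.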

A heterogeneous information network (HIN) is used to model objects of different types and their relationships. Meta-paths are sequences of object types. They are used to represent complex relationships between objects beyond what links in a homogeneous network capture. We study the problem of classifying objects in an HIN. We propose class-level meta-paths and study how they can be used to (1) build more accurate classifiers and (2) improve active learning in identifying objects for which training labels should be obtained. We show that class-level meta-paths and object classification exhibit interesting synergy. Our experimental results show that the use of class-level meta-paths results in very effective active learning and good classification performance in HINs.

published_or_final_version / Computer Science / Master / Master of Philosophy
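As an illustration of what an ordinary meta-path expresses, the sketch below counts path instances that follow a given sequence of object types in a small typed graph. It covers plain meta-paths only, not the class-level meta-paths proposed in the thesis; the function name `count_meta_path_instances`, the undirected-edge assumption, and the toy bibliographic data are all hypothetical.

```python
from collections import defaultdict


def count_meta_path_instances(node_type, edges, meta_path, source):
    """Return {end_node: number of path instances} for paths that start at
    `source` and visit object types in the order given by `meta_path`.
    Edges are treated as undirected (an assumption of this sketch)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    if node_type[source] != meta_path[0]:
        return {}

    counts = {source: 1}
    for wanted_type in meta_path[1:]:
        next_counts = defaultdict(int)
        for node, n_paths in counts.items():
            for nb in neighbors[node]:
                if node_type[nb] == wanted_type:
                    next_counts[nb] += n_paths
        counts = dict(next_counts)
    return counts


# Toy bibliographic network (made-up data).
node_type = {
    "alice": "author", "bob": "author", "carol": "author",
    "p1": "paper", "p2": "paper", "kdd": "venue",
}
edges = [
    ("alice", "p1"), ("bob", "p1"), ("bob", "p2"), ("carol", "p2"),
    ("p1", "kdd"), ("p2", "kdd"),
]

# Author-Paper-Author: co-authorship.
apa = count_meta_path_instances(
    node_type, edges, ["author", "paper", "author"], "alice")

# Author-Paper-Venue-Paper-Author: authors publishing in the same venue.
apvpa = count_meta_path_instances(
    node_type, edges, ["author", "paper", "venue", "paper", "author"], "alice")
```

Here the longer meta-path links alice to carol even though they share no paper, which is the kind of relationship a single link type in a homogeneous network would not capture.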

Data mining

Identifier	oai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/197527
Date	January 2013
Creators	Wan, Chang, 萬暢
Contributors	Cheung, DWL, Kao, CM
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Source Sets	Hong Kong University Theses
Language	English
Detected Language	English
Type	PG_Thesis
Rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works; Creative Commons: Attribution 3.0 Hong Kong License
Relation	HKU Theses Online (HKUTO)
