
Personalized web search re-ranking and content recommendation

In this thesis, I propose a method for building a personalized recommendation system that re-ranks web search results and recommends web content. The method is based on personal reading interest, which can be reflected by the user's dwell time on each document or webpage. I acquire document-level dwell times via a customized web browser or a mobile device. To obtain better precision, I also explore the possibility of tracking gaze position and facial expression, from which I can determine the attractiveness of different parts of a document.
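As a rough illustration of the document-level signal, the sketch below accumulates per-URL dwell time from page-shown and page-hidden events. All names here (DwellTracker, on_page_shown, on_page_hidden) are hypothetical stand-ins, not the thesis's actual browser instrumentation.

```python
import time
from collections import defaultdict

class DwellTracker:
    """Hypothetical sketch: accumulate per-URL dwell time from
    page-shown / page-hidden events raised by an instrumented browser."""

    def __init__(self):
        self.dwell = defaultdict(float)   # url -> total seconds viewed
        self._opened_at = {}              # url -> timestamp of last "shown" event

    def on_page_shown(self, url: str) -> None:
        self._opened_at[url] = time.monotonic()

    def on_page_hidden(self, url: str) -> None:
        start = self._opened_at.pop(url, None)
        if start is not None:
            self.dwell[url] += time.monotonic() - start
```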

Inspired by the idea of the Google Knowledge Graph, I also establish a graph-based ontology to maintain a user profile that describes the user's personal reading interest. Each node in the graph is a concept and represents the user's potential interest in that concept. I also use dwell time to measure concept-level interest, which can be inferred from the document-level dwell times. The graph is generated from Wikipedia. Based on the estimated concept-level interest, my algorithm can estimate a user's potential dwell time on a new document, from which personalized webpage re-ranking can be carried out. I compare the rankings produced by my algorithm with those generated by popular commercial search engines and a recently proposed personalized ranking algorithm; the results clearly show the superiority of my method.
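A minimal sketch of how document-level dwell times might be propagated to concepts and then used to score unseen documents for re-ranking. The abstract does not give the exact formulas, so the uniform propagation rule, the additive scoring, and the concepts_of() helper that maps a document to its Wikipedia concepts are all illustrative assumptions.

```python
from collections import defaultdict

def infer_concept_interest(doc_dwell, concepts_of):
    """Spread each document's observed dwell time evenly over the
    Wikipedia concepts it mentions (assumed propagation rule)."""
    interest = defaultdict(float)
    for doc, seconds in doc_dwell.items():
        concepts = concepts_of(doc)
        if concepts:
            share = seconds / len(concepts)
            for c in concepts:
                interest[c] += share
    return interest

def predict_dwell(doc, interest, concepts_of):
    """Estimate potential dwell time on a new document as the sum of
    the user's interest in the concepts it covers (assumed scoring)."""
    return sum(interest.get(c, 0.0) for c in concepts_of(doc))

def rerank(results, interest, concepts_of):
    """Re-rank search results by predicted dwell time, descending."""
    return sorted(results,
                  key=lambda d: predict_dwell(d, interest, concepts_of),
                  reverse=True)
```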

I also apply my personalized recommendation framework to other applications. A good example is personalized document summarization: the same knowledge graph is employed to estimate the weight of every word in a document, and by combining these weights with a traditional text-mining document summarization algorithm, I can generate a personalized summary that emphasizes the user's interests in the document (a rough sketch of this weighting appears below).

To deal with images and videos, I present a new image search and ranking algorithm for retrieving unannotated images by collaboratively mining online search results, which consist of online image and text search results. The online image search results are leveraged as reference examples to perform content-based image search over unannotated images. Because not all of the online image search results are closely related to the query, the online text search results are used to estimate each reference image's relevance to the search query. Overall, the key contribution of my method lies in its ability to deal with unreliable online image search results by jointly mining the visual and textual aspects of online search results. Through such collaborative mining, my algorithm infers the relevance of an online search result image to a text query. Once I have estimated a query relevance score for each online image search result, I can selectively use query-specific online search result images as reference examples for retrieving and ranking unannotated images. To evaluate my algorithm, I tested it on a standard public image dataset and on several modestly sized personal photo collections, and compared its performance with that of two peer methods. The results are very positive and indicate that my algorithm is superior to existing content-based image search algorithms for retrieving and ranking unannotated images. Overall, the main advantage of my algorithm comes from its collaborative mining of online search results in both the visual and the textual domains.
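For the summarization application, one plausible reading is that each word's weight from a standard extractive summarizer is boosted by the user's interest in the concept the word maps to. The linear blend below, including the word_to_concept mapping and the mixing weight alpha, is an illustrative assumption, not the thesis's published formula.

```python
def personalized_word_weight(word, base_weight, interest, word_to_concept,
                             alpha=0.5):
    """Blend a text-mining word weight with concept-level user interest
    (assumed linear combination; alpha controls personalization)."""
    concept = word_to_concept.get(word)
    personal = interest.get(concept, 0.0) if concept else 0.0
    return (1.0 - alpha) * base_weight + alpha * personal
```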
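The collaborative mining step for images can likewise be sketched as: estimate each online result image's query relevance from the text search results, then rank unannotated images by relevance-weighted visual similarity to those reference images. The text_relevance and visual_similarity functions below are stand-ins for whatever relevance estimator and visual feature model the thesis actually uses.

```python
def rank_unannotated(unannotated, online_results, text_relevance,
                     visual_similarity):
    """Hypothetical sketch: score each unannotated image by its visual
    similarity to online result images, weighted by each reference
    image's text-estimated relevance to the query."""
    # Weight each online result image by its text-estimated query relevance.
    weighted_refs = [(ref, text_relevance(ref)) for ref in online_results]

    def score(img):
        # Relevance-weighted visual similarity to the reference set.
        return sum(w * visual_similarity(img, ref) for ref, w in weighted_refs)

    return sorted(unannotated, key=score, reverse=True)
```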

Identifier: oai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/197548
Date: January 2013
Creators: Jiang, Hao, 江浩
Contributors: Lau, FCM
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Source Sets: Hong Kong University Theses
Language: English
Detected Language: English
Type: PG_Thesis
Degree: Doctor of Philosophy (Doctoral), Computer Science
Version: published_or_final_version
Rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works; Creative Commons: Attribution 3.0 Hong Kong License
Relation: HKU Theses Online (HKUTO)
