Due to the information explosion on the Internet, effective search techniques are required to retrieve the desired information from the Web. Based on analysis of users' search intentions and the varied forms of Web content, we find that both the query and the indexed Web content are often associated with context information that provides essential signals of ranking relevance in Web search. This dissertation seeks to develop new search algorithms and techniques that take advantage of this rich context information to improve search quality. It consists of two major parts.
In the first part, we study the context of the query in terms of the different ranking objectives of different queries. To improve ranking relevance, we propose to incorporate such query context information into the ranking model, and we introduce two general approaches. The first incorporates query differences into ranking by introducing query-dependent loss functions; optimizing these loss functions yields a better ranking model. The second applies a divide-and-conquer framework for ranking specialization.
The second part of this dissertation investigates how to extract the context of specific Web content and exploit it to build more effective search systems. This study is based on newly emerging social media content. Unlike traditional Web content, social media content is inherently associated with new kinds of context information, including content semantics and quality, user reputation, and user interactions, all of which are useful for acquiring knowledge from social media. In this dissertation, we seek to develop algorithms and techniques for effective knowledge acquisition from collaborative social media environments by using this dynamic context information. We first propose a new general framework for searching social media content that integrates both content features and user interactions. We then propose a semi-supervised framework that explicitly computes content quality and user reputation, which are incorporated into the search framework to improve search quality. Finally, this dissertation investigates techniques for extracting the structured semantics of social media content as new context information, which is essential for content retrieval and organization.
Identifier | oai:union.ndltd.org:GATECH/oai:smartech.gatech.edu:1853/37246 |
Date | 29 September 2010 |
Creators | Bian, Jiang |
Publisher | Georgia Institute of Technology |
Source Sets | Georgia Tech Electronic Thesis and Dissertation Archive |
Detected Language | English |
Type | Dissertation |