Global ETD Search

Application of MapReduce to Ranking SVM for Large-Scale Datasets

Search engines increasingly rely on machine learning to build models for ranking web pages, using past user queries and clicks as training data. Among the several learning-to-rank methods for information retrieval, the ranking support vector machine (Ranking SVM) has attracted considerable attention in the information retrieval community. One difficulty with Ranking SVM is the high computational cost of constructing a ranking model: when the training dataset is large, the number of training data pairs becomes huge. We adopt the MapReduce programming model to address this difficulty. MapReduce is a distributed computing framework introduced by Google and commonly adopted in cloud computing centers. It can easily handle large-scale datasets using a large number of computers. Moreover, it hides the messy details of parallelization, fault tolerance, data distribution, and load balancing from programmers, allowing them to focus only on the underlying problem to be solved. In this paper, we apply MapReduce to Ranking SVM for processing large-scale datasets. We specify the Map function to solve the dual subproblems involved in Ranking SVM, and the Reduce function to aggregate all outputs that share the same intermediate key from the Map functions on the distributed machines. Experimental results show that the proposed approach improves the efficiency of Ranking SVM.
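The Map/Reduce split described in the abstract can be illustrated with a small single-process sketch. It is not the thesis's implementation, and several parts are assumptions made for illustration: the function names (`make_pairs`, `map_solve`, `reduce_average`, `train`) are hypothetical, each Map call solves its dual subproblem with a generic dual coordinate descent solver for a linear SVM without bias, and the Reduce step simply averages the weight vectors that share a key. The abstract only states that Reduce aggregates outputs with the same intermediate key, so the averaging rule is a stand-in.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_pairs(docs):
    """Turn (query_id, features, relevance) rows into pairwise difference
    vectors z = x_hi - x_lo for documents of the same query."""
    by_query = defaultdict(list)
    for qid, x, rel in docs:
        by_query[qid].append((x, rel))
    pairs = []
    for items in by_query.values():
        for (xa, ra), (xb, rb) in combinations(items, 2):
            if ra == rb:
                continue  # equal relevance carries no preference
            hi, lo = (xa, xb) if ra > rb else (xb, xa)
            pairs.append([h - l for h, l in zip(hi, lo)])
    return pairs


def map_solve(key, pairs, C=1.0, epochs=50, seed=0):
    """Map (assumed form): solve the dual subproblem on one partition,
    min 0.5 * ||sum_i a_i z_i||^2 - sum_i a_i  with 0 <= a_i <= C,
    by dual coordinate descent. Emits (key, weight vector)."""
    rng = random.Random(seed)
    w = [0.0] * len(pairs[0])
    alpha = [0.0] * len(pairs)
    order = list(range(len(pairs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            z = pairs[i]
            q = sum(v * v for v in z)
            if q == 0.0:
                continue
            grad = sum(wi * zi for wi, zi in zip(w, z)) - 1.0
            new_alpha = min(max(alpha[i] - grad / q, 0.0), C)
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                alpha[i] = new_alpha
                w = [wi + delta * zi for wi, zi in zip(w, z)]
    return key, w


def reduce_average(key, weight_vectors):
    """Reduce (assumed form): average all weight vectors sharing a key."""
    n = len(weight_vectors)
    return key, [sum(col) / n for col in zip(*weight_vectors)]


def train(docs, n_partitions=2):
    """Simulate the MapReduce flow in one process: partition the pairs,
    run Map on each partition, group by key, then run Reduce."""
    pairs = make_pairs(docs)
    parts = [pairs[i::n_partitions] for i in range(n_partitions)]
    grouped = defaultdict(list)
    for part in parts:
        if part:
            key, w = map_solve("w", part)
            grouped[key].append(w)
    return reduce_average("w", grouped["w"])[1]
```

Calling `train(docs)` on a list of `(query_id, features, relevance)` tuples returns one weight vector; documents are then ranked by the dot product of that vector with their features. In a real deployment each `map_solve` call would run on a separate machine under a MapReduce framework, and averaging is only one of several possible aggregation rules.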

http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0810110-175653

Identifier	oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0810110-175653
Date	10 August 2010
Creators	Hu, Su-Hsien
Contributors	Chih-Hung Wu, Chih-Chin Lai, Hsien-Leing Tsai, Chen-Sen Ouyang, Shie-Jue Lee
Publisher	NSYSU
Source Sets	NSYSU Electronic Thesis and Dissertation Archive
Language	Cholon
Detected Language	English
Type	text
Format	application/pdf
Source	http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0810110-175653
Rights	restricted, Copyright information available at source archive
