Some topics in modeling ranking data

Applications of ranking data analysis arise in many fields of study, such as psychology, economics, and politics. Over the past decade, many ranking data models have been proposed. AdaBoost has proved to be a very successful technique for generating a stronger classifier from weak ones; it can be viewed as forward stagewise additive modeling using the exponential loss function. Motivated by this, a new AdaBoost algorithm is developed for ranking data. Taking into consideration the ordinal structure of ranking data, I propose measures based on the Spearman/Kendall distance to evaluate classifiers instead of the usual misclassification rate. The new algorithm is tested on several ranking datasets, and the results show that it outperforms traditional algorithms.
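As an illustration of the distances mentioned above, the following minimal sketch computes the Kendall and Spearman distances between two rankings. It assumes that a ranking is stored as a sequence whose i-th entry is the rank given to item i, that the Kendall distance counts discordantly ordered item pairs, and that the Spearman distance is the sum of squared rank differences; the function names are hypothetical and are not taken from the thesis.

```python
from itertools import combinations


def kendall_distance(r1, r2):
    """Count the item pairs that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(range(len(r1)), 2)
        if (r1[i] - r1[j]) * (r2[i] - r2[j]) < 0
    )


def spearman_distance(r1, r2):
    """Sum of squared rank differences (assumed definition)."""
    return sum((a - b) ** 2 for a, b in zip(r1, r2))
```

Unlike a 0/1 misclassification indicator, either distance gives a predicted ranking partial credit when it is close to the observed one.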

The distance-based model assumes that the probability of observing a ranking depends on the distance between the ranking and its central ranking. Predictions for ranking data can be made by combining the distance-based model with the well-known k-nearest-neighbor (kNN) method. This model can be improved by assigning weights to the neighbors according to their distances to the central ranking and assigning weights to the features according to their relative importance. For the feature weighting part, a revised version of the traditional ReliefF algorithm is proposed. The experimental results show that the new algorithm is more suitable for ranking data problems.
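The distance-based model can be sketched as follows. This is a minimal example under stated assumptions: the probability is taken to decay exponentially with the Kendall distance from the central ranking (a Mallows-type form, which the abstract does not specify), and the dispersion parameter `lam` and the function names are hypothetical. All rankings are enumerated, so it is only practical for a handful of items.

```python
from itertools import combinations, permutations
from math import exp


def kendall_distance(r1, r2):
    """Count the item pairs that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(range(len(r1)), 2)
        if (r1[i] - r1[j]) * (r2[i] - r2[j]) < 0
    )


def distance_based_probs(center, lam):
    """Probability of every ranking, assuming P(r) is proportional to
    exp(-lam * d(r, center)) with d the Kendall distance."""
    ranks = range(1, len(center) + 1)
    weights = {
        r: exp(-lam * kendall_distance(r, center)) for r in permutations(ranks)
    }
    total = sum(weights.values())  # normalizing constant
    return {r: w / total for r, w in weights.items()}
```

With `lam = 0` every ranking is equally likely; as `lam` grows, the probability mass concentrates on the central ranking.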

Error-correcting output codes (ECOC) are widely used to solve multi-class learning problems by decomposing the multi-class problem into several binary classification problems. Several ECOCs for ranking data are proposed and tested. By combining these ECOCs with traditional binary classifiers, a highly accurate predictive model for ranking data can be built.
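To make the decomposition idea concrete, here is a minimal sketch of one possible encoding. It is a hypothetical pairwise-comparison code chosen for illustration, not necessarily any of the ECOCs proposed in the thesis. Each bit answers one binary question ("is item i ranked ahead of item j?"), and a vector of predicted bits is decoded to the ranking whose codeword is nearest in Hamming distance.

```python
from itertools import combinations, permutations


def pairwise_code(ranking):
    """One bit per item pair (i, j): 1 if item i is ranked ahead of item j."""
    n = len(ranking)
    return tuple(int(ranking[i] < ranking[j]) for i, j in combinations(range(n), 2))


def decode(bits, n_items):
    """Return the ranking whose codeword is closest to `bits` in Hamming distance."""
    candidates = permutations(range(1, n_items + 1))
    return min(
        candidates,
        key=lambda r: sum(a != b for a, b in zip(pairwise_code(r), bits)),
    )
```

In a full pipeline, each bit would be predicted by its own binary classifier, and decoding would then map possibly inconsistent bit predictions back to a valid ranking.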

While the mixture of factor analyzers (MFA) is a useful tool for analyzing heterogeneous data, it cannot be directly used for ranking data because of the special discrete ordinal structure of rankings. I fill this gap by extending MFA to accommodate complete and incomplete/partial ranking data. Both simulated and real examples are studied to illustrate the effectiveness of the proposed MFA methods.

published_or_final_version / Statistics and Actuarial Science / Doctoral / Doctor of Philosophy

Identifier: oai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/209210
Date: January 2014
Creators: Qi, Fang, 齊放
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Source Sets: Hong Kong University Theses
Language: English
Detected Language: English
Type: PG_Thesis
Rights: The author retains all proprietary rights (such as patent rights) and the right to use in future works. Creative Commons: Attribution 3.0 Hong Kong License
Relation: HKU Theses Online (HKUTO)
