Finally, we study how to construct an appropriate graph for spectral clustering. Given a local similarity matrix (a graph), we propose an iterative regularization procedure to iteratively enhance its cluster structure, leading to a global similarity matrix. Significant improvement of clustering performance is observed when the new graph is used for spectral clustering. / In this thesis, we consider the general problem of classifying a data set into a number of subsets, which has been one of the most fundamental problems in machine learning. Specifically, we mainly address the following four common learning problems in three active research fields: semi-supervised classification, semi-supervised clustering, and unsupervised clustering. The first problem we consider is semi-supervised classification from both unlabeled data and pairwise constraints. The pairwise constraints specify which two objects belong to the same class or not. Our aim is to propagate the pairwise constraints to the entire data set. We formulate the propagation model as a semidefinite programming (SDP) problem, which can be globally solved reliably. Our approach is applicable to multi-class problems and handles class labels, pairwise constraints, or a mixture of them in a unified framework. / The second problem is semi-supervised clustering with pairwise constraints. We present a principled framework for learning a data-driven and constraint-consistent nonlinear mapping to reshape the data in a feature space. We formulate the problem as a small-scale SDP problem, whose size is independent of the numbers of the objects and the constraints. Thus it can be globally solved efficiently. Our framework has several attractive features. First, it can effectively propagate pairwise constraints, when available, to the entire data set. Second, it scales well to large-scale problems. Third, it can effectively handle noisy constraints. Fourth, in the absence of constraints, it becomes a novel kernel-based clustering algorithm that can discover linearly non-separable clusters. / Third, we deal with noise robust clustering. Many clustering algorithms, including spectral clustering, often fail on noisy data. We propose a data warping model to map the data into a new space. During the warping, each object spreads its spatial information smoothly over the data graph to other objects. After the warping, hopefully each cluster becomes compact and different clusters become well-separated, including the noise cluster that is formed by the noise objects. The proposed clustering algorithm can handle significantly noisy data, and can find the number of clusters automatically. / Li, Zhenguo. / Adviser: Liu Jianzhuang. / Source: Dissertation Abstracts International, Volume: 70-06, Section: B, page: 3604. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (leaves 121-131). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
Identifer | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_344294 |
Date | January 2008 |
Contributors | Li, Zhenguo., Chinese University of Hong Kong Graduate School. Division of Information Engineering. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, theses |
Format | electronic resource, microform, microfiche, 1 online resource (xiv, 131 leaves : ill.) |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |
Page generated in 0.0019 seconds