Lin Dahua. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2006. / Includes bibliographical references (leaves 233-250). / Abstracts in English and Chinese. / Abstract --- p.i / Acknowledgement --- p.iii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- The Problem We are Facing --- p.1 / Chapter 1.2 --- Generative vs. Discriminative Models --- p.2 / Chapter 1.3 --- Statistical Feature Extraction: Success and Challenge --- p.3 / Chapter 1.4 --- Overview of Our Works --- p.5 / Chapter 1.4.1 --- New Linear Discriminant Methods: Generalized LDA Formulation and Performance-Driven Subspace Learning --- p.5 / Chapter 1.4.2 --- Coupled Learning Models: Coupled Space Learning and Inter-Modality Recognition --- p.6 / Chapter 1.4.3 --- Informative Learning Approaches: Conditional Infomax Learning and Information Channel Model --- p.6 / Chapter 1.5 --- Organization of the Thesis --- p.8 / Chapter I --- History and Background --- p.10 / Chapter 2 --- Statistical Pattern Recognition --- p.11 / Chapter 2.1 --- Patterns and Classifiers --- p.11 / Chapter 2.2 --- Bayes Theory --- p.12 / Chapter 2.3 --- Statistical Modeling --- p.14 / Chapter 2.3.1 --- Maximum Likelihood Estimation --- p.14 / Chapter 2.3.2 --- Gaussian Model --- p.15 / Chapter 2.3.3 --- Expectation-Maximization --- p.17 / Chapter 2.3.4 --- Finite Mixture Model --- p.18 / Chapter 2.3.5 --- A Nonparametric Technique: Parzen Windows --- p.21 / Chapter 3 --- Statistical Learning Theory --- p.24 / Chapter 3.1 --- Formulation of Learning Model --- p.24 / Chapter 3.1.1 --- Learning: Functional Estimation Model --- p.24 / Chapter 3.1.2 --- Representative Learning Problems --- p.25 / Chapter 3.1.3 --- Empirical Risk Minimization --- p.26 / Chapter 3.2 --- Consistency and Convergence of Learning --- p.27 / Chapter 3.2.1 --- Concept of Consistency --- p.27 / Chapter 3.2.2 --- The Key Theorem of Learning Theory --- p.28 / Chapter 3.2.3 --- VC Entropy --- p.29 / Chapter 3.2.4 --- Bounds on Convergence --- p.30 /
Chapter 3.2.5 --- VC Dimension --- p.35 / Chapter 4 --- History of Statistical Feature Extraction --- p.38 / Chapter 4.1 --- Linear Feature Extraction --- p.38 / Chapter 4.1.1 --- Principal Component Analysis (PCA) --- p.38 / Chapter 4.1.2 --- Linear Discriminant Analysis (LDA) --- p.41 / Chapter 4.1.3 --- Other Linear Feature Extraction Methods --- p.46 / Chapter 4.1.4 --- Comparison of Different Methods --- p.48 / Chapter 4.2 --- Enhanced Models --- p.49 / Chapter 4.2.1 --- Stochastic Discrimination and Random Subspace --- p.49 / Chapter 4.2.2 --- Hierarchical Feature Extraction --- p.51 / Chapter 4.2.3 --- Multilinear Analysis and Tensor-based Representation --- p.52 / Chapter 4.3 --- Nonlinear Feature Extraction --- p.54 / Chapter 4.3.1 --- Kernelization --- p.54 / Chapter 4.3.2 --- Dimension reduction by Manifold Embedding --- p.56 / Chapter 5 --- Related Works in Feature Extraction --- p.59 / Chapter 5.1 --- Dimension Reduction --- p.59 / Chapter 5.1.1 --- Feature Selection --- p.60 / Chapter 5.1.2 --- Feature Extraction --- p.60 / Chapter 5.2 --- Kernel Learning --- p.61 / Chapter 5.2.1 --- Basic Concepts of Kernel --- p.61 / Chapter 5.2.2 --- The Reproducing Kernel Map --- p.62 / Chapter 5.2.3 --- The Mercer Kernel Map --- p.64 / Chapter 5.2.4 --- The Empirical Kernel Map --- p.65 / Chapter 5.2.5 --- Kernel Trick and Kernelized Feature Extraction --- p.66 / Chapter 5.3 --- Subspace Analysis --- p.68 / Chapter 5.3.1 --- Basis and Subspace --- p.68 / Chapter 5.3.2 --- Orthogonal Projection --- p.69 / Chapter 5.3.3 --- Orthonormal Basis --- p.70 / Chapter 5.3.4 --- Subspace Decomposition --- p.70 / Chapter 5.4 --- Principal Component Analysis --- p.73 / Chapter 5.4.1 --- PCA Formulation --- p.73 / Chapter 5.4.2 --- Solution to PCA --- p.75 / Chapter 5.4.3 --- Energy Structure of PCA --- p.76 / Chapter 5.4.4 --- Probabilistic Principal Component Analysis --- p.78 / Chapter 5.4.5 --- Kernel Principal Component Analysis --- p.81 / Chapter 5.5 --- Independent 
Component Analysis --- p.83 / Chapter 5.5.1 --- ICA Formulation --- p.83 / Chapter 5.5.2 --- Measurement of Statistical Independence --- p.84 / Chapter 5.6 --- Linear Discriminant Analysis --- p.85 / Chapter 5.6.1 --- Fisher's Linear Discriminant Analysis --- p.85 / Chapter 5.6.2 --- Improved Algorithms for Small Sample Size Problem --- p.89 / Chapter 5.6.3 --- Kernel Discriminant Analysis --- p.92 / Chapter II --- Improvement in Linear Discriminant Analysis --- p.100 / Chapter 6 --- Generalized LDA --- p.101 / Chapter 6.1 --- Regularized LDA --- p.101 / Chapter 6.1.1 --- Generalized LDA Implementation Procedure --- p.101 / Chapter 6.1.2 --- Optimal Nonsingular Approximation --- p.103 / Chapter 6.1.3 --- Regularized LDA algorithm --- p.104 / Chapter 6.2 --- A Statistical View: When is LDA optimal? --- p.105 / Chapter 6.2.1 --- Two-class Gaussian Case --- p.106 / Chapter 6.2.2 --- Multi-class Cases --- p.107 / Chapter 6.3 --- Generalized LDA Formulation --- p.108 / Chapter 6.3.1 --- Mathematical Preparation --- p.108 / Chapter 6.3.2 --- Generalized Formulation --- p.110 / Chapter 7 --- Dynamic Feedback Generalized LDA --- p.112 / Chapter 7.1 --- Basic Principle --- p.112 / Chapter 7.2 --- Dynamic Feedback Framework --- p.113 / Chapter 7.2.1 --- Initialization: K-Nearest Construction --- p.113 / Chapter 7.2.2 --- Dynamic Procedure --- p.115 / Chapter 7.3 --- Experiments --- p.115 / Chapter 7.3.1 --- Performance in Training Stage --- p.116 / Chapter 7.3.2 --- Performance on Testing set --- p.118 / Chapter 8 --- Performance-Driven Subspace Learning --- p.119 / Chapter 8.1 --- Motivation and Principle --- p.119 / Chapter 8.2 --- Performance-Based Criteria --- p.121 / Chapter 8.2.1 --- The Verification Problem and Generalized Average Margin --- p.122 / Chapter 8.2.2 --- Performance Driven Criteria based on Generalized Average Margin --- p.123 / Chapter 8.3 --- Optimal Subspace Pursuit --- p.125 / Chapter 8.3.1 --- Optimal threshold --- p.125 / Chapter 8.3.2 --- Optimal
projection matrix --- p.125 / Chapter 8.3.3 --- Overall procedure --- p.129 / Chapter 8.3.4 --- Discussion of the Algorithm --- p.129 / Chapter 8.4 --- Optimal Classifier Fusion --- p.130 / Chapter 8.5 --- Experiments --- p.131 / Chapter 8.5.1 --- Performance Measurement --- p.131 / Chapter 8.5.2 --- Experiment Setting --- p.131 / Chapter 8.5.3 --- Experiment Results --- p.133 / Chapter 8.5.4 --- Discussion --- p.139 / Chapter III --- Coupled Learning of Feature Transforms --- p.140 / Chapter 9 --- Coupled Space Learning --- p.141 / Chapter 9.1 --- Introduction --- p.142 / Chapter 9.1.1 --- What is Image Style Transform --- p.142 / Chapter 9.1.2 --- Overview of our Framework --- p.143 / Chapter 9.2 --- Coupled Space Learning --- p.143 / Chapter 9.2.1 --- Framework of Coupled Modelling --- p.143 / Chapter 9.2.2 --- Correlative Component Analysis --- p.145 / Chapter 9.2.3 --- Coupled Bidirectional Transform --- p.148 / Chapter 9.2.4 --- Procedure of Coupled Space Learning --- p.151 / Chapter 9.3 --- Generalization to Mixture Model --- p.152 / Chapter 9.3.1 --- Coupled Gaussian Mixture Model --- p.152 / Chapter 9.3.2 --- Optimization by EM Algorithm --- p.152 / Chapter 9.4 --- Integrated Framework for Image Style Transform --- p.154 / Chapter 9.5 --- Experiments --- p.156 / Chapter 9.5.1 --- Face Super-resolution --- p.156 / Chapter 9.5.2 --- Portrait Style Transforms --- p.157 / Chapter 10 --- Inter-Modality Recognition --- p.162 / Chapter 10.1 --- Introduction to the Inter-Modality Recognition Problem --- p.163 / Chapter 10.1.1 --- What is Inter-Modality Recognition --- p.163 / Chapter 10.1.2 --- Overview of Our Feature Extraction Framework --- p.163 /
Chapter 10.2 --- Common Discriminant Feature Extraction --- p.165 / Chapter 10.2.1 --- Formulation of the Learning Problem --- p.165 / Chapter 10.2.2 --- Matrix-Form of the Objective --- p.168 / Chapter 10.2.3 --- Solving the Linear Transforms --- p.169 / Chapter 10.3 --- Kernelized Common Discriminant Feature Extraction --- p.170 / Chapter 10.4 --- Multi-Mode Framework --- p.172 / Chapter 10.4.1 --- Multi-Mode Formulation --- p.172 / Chapter 10.4.2 --- Optimization Scheme --- p.174 / Chapter 10.5 --- Experiments --- p.176 / Chapter 10.5.1 --- Experiment Settings --- p.176 / Chapter 10.5.2 --- Experiment Results --- p.177 / Chapter IV --- A New Perspective: Informative Learning --- p.180 / Chapter 11 --- Toward Information Theory --- p.181 / Chapter 11.1 --- Entropy and Mutual Information --- p.181 / Chapter 11.1.1 --- Entropy --- p.182 / Chapter 11.1.2 --- Relative Entropy (Kullback Leibler Divergence) --- p.184 / Chapter 11.2 --- Mutual Information --- p.184 / Chapter 11.2.1 --- Definition of Mutual Information --- p.184 / Chapter 11.2.2 --- Chain rules --- p.186 / Chapter 11.2.3 --- Information in Data Processing --- p.188 / Chapter 11.3 --- Differential Entropy --- p.189 / Chapter 11.3.1 --- Differential Entropy of Continuous Random Variable --- p.189 / Chapter 11.3.2 --- Mutual Information of Continuous Random Variable --- p.190 /
Chapter 12 --- Conditional Infomax Learning --- p.191 / Chapter 12.1 --- An Overview --- p.192 / Chapter 12.2 --- Conditional Informative Feature Extraction --- p.193 / Chapter 12.2.1 --- Problem Formulation and Features --- p.193 / Chapter 12.2.2 --- The Information Maximization Principle --- p.194 / Chapter 12.2.3 --- The Information Decomposition and the Conditional Objective --- p.195 / Chapter 12.3 --- The Efficient Optimization --- p.197 / Chapter 12.3.1 --- Discrete Approximation Based on AEP --- p.197 / Chapter 12.3.2 --- Analysis of Terms and Their Derivatives --- p.198 / Chapter 12.3.3 --- Local Active Region Method --- p.200 / Chapter 12.4 --- Bayesian Feature Fusion with Sparse Prior --- p.201 / Chapter 12.5 --- The Integrated Framework for Feature Learning --- p.202 / Chapter 12.6 --- Experiments --- p.203 / Chapter 12.6.1 --- A Toy Problem --- p.203 / Chapter 12.6.2 --- Face Recognition --- p.204 / Chapter 13 --- Channel-based Maximum Effective Information --- p.209 / Chapter 13.1 --- Motivation and Overview --- p.209 / Chapter 13.2 --- Maximizing Effective Information --- p.211 / Chapter 13.2.1 --- Relation between Mutual Information and Classification --- p.211 / Chapter 13.2.2 --- Linear Projection and Metric --- p.212 / Chapter 13.2.3 --- Channel Model and Effective Information --- p.213 / Chapter 13.2.4 --- Parzen Window Approximation --- p.216 / Chapter 13.3 --- Parameter Optimization on Grassmann Manifold --- p.217 / Chapter 13.3.1 --- Grassmann Manifold --- p.217 / Chapter 13.3.2 --- Conjugate Gradient Optimization on Grassmann Manifold --- p.219 / Chapter 13.3.3 --- Computation of Gradient --- p.221 / Chapter 13.4 --- Experiments --- p.222 / Chapter 13.4.1 --- A Toy Problem --- p.222 / Chapter 13.4.2 --- Face Recognition --- p.223 / Chapter 14 --- Conclusion --- p.230
Identifier | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_325641 |
Date | January 2006 |
Contributors | Lin, Dahua., Chinese University of Hong Kong Graduate School. Division of Information Engineering. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, bibliography |
Format | print, xv, 250 leaves : ill. ; 30 cm. |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |