  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
321

Exploring attributes and instances for customized learning based on support patterns. / CUHK electronic theses & dissertations collection

January 2005
Both the learning model and the learning process of CSPL are customized to different query instances. CSPL can exploit the characteristics of the query instance to explore a focused hypothesis space effectively during classification. Unlike many existing learning methods, CSPL conducts learning from specific to general, effectively avoiding the horizon effect. Empirical investigation demonstrates that learning from specific to general can discover more useful patterns for learning. Experimental results on benchmark data sets and real-world problems demonstrate that our CSPL framework delivers prominent learning performance in comparison with existing learning methods. / CSPL integrates the attributes and instances in a query matrix model under a customized learning framework. Within this query matrix model, it can be demonstrated that attributes and instances have a useful symmetry property for learning. This symmetry property leads to a solution for counteracting the negative factor of sparse instances with the abundance of attribute information, which was previously viewed as a curse of dimensionality for common learning methods. Given this symmetry property, we propose to use support patterns as the basic learning unit of CSPL, i.e., the patterns to be explored. Generally, a support pattern can be viewed as a sub-matrix of the query matrix, comprising its associated support instances and attribute values. CSPL discovers useful support patterns and combines their statistics for classifying unseen instances. / The development of machine learning techniques still faces a number of challenges. Real-world problems often require a more flexible and dynamic learning method, one customized to the learning scenario and user demand. For example, real-world applications often demand a critical decision based on only limited data but a huge number of potentially relevant attributes.
Therefore, we propose a novel customized learning framework called Customized Support Pattern Learner (CSPL), which exploits a tradeoff between instance-based learning and attribute-based learning. / Han Yiqiu. / "October 2005." / Adviser: Wai Lam. / Source: Dissertation Abstracts International, Volume: 67-07, Section: B, page: 3898. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (p. 99-104). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.
322

Learning from data locally and globally. / CUHK electronic theses & dissertations collection / Digital dissertation consortium

January 2004
Huang Kaizhu. / "July 2004." / Thesis (Ph.D.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (p. 176-194) / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. Ann Arbor, MI : ProQuest Information and Learning Company, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Mode of access: World Wide Web. / Abstracts in English and Chinese.
323

Learning with unlabeled data. / 在未標記的數據中的機器學習 / CUHK electronic theses & dissertations collection / Zai wei biao ji de shu ju zhong de ji qi xue xi

January 2009
In the first part, we deal with unlabeled data that are of good quality and follow the conditions of semi-supervised learning. Firstly, we present a novel method for the Transductive Support Vector Machine (TSVM) that relaxes the unknown labels to continuous variables and reduces the non-convex optimization problem to a convex semi-definite programming problem. In contrast to the previous relaxation method, which involves O(n²) free parameters in the semi-definite matrix, our method reduces the number of free parameters to O(n), so that we can solve the optimization problem more efficiently. In addition, the proposed approach provides a tighter convex relaxation for the optimization problem in TSVM. Empirical studies on benchmark data sets demonstrate that the proposed method is more efficient than the previous semi-definite relaxation method and achieves promising classification results compared with state-of-the-art methods. Our second contribution is an extended level method proposed to efficiently solve multiple kernel learning (MKL) problems. In particular, the level method overcomes the drawbacks of both the Semi-Infinite Linear Programming (SILP) method and the Subgradient Descent (SD) method for multiple kernel learning. Our experimental results show that the level method is able to greatly reduce the computational time of MKL relative to both the SD method and the SILP method. Thirdly, we discuss the connection between two fundamental assumptions in semi-supervised learning. More specifically, we show that the loss on the unlabeled data used by TSVM can essentially be viewed as an additional regularizer for the decision boundary. We further show that this additional regularizer induced by the TSVM is closely related to the regularizer introduced by manifold regularization. Both can be viewed as instances of a unified regularization framework for semi-supervised learning.
/ In the second part, we discuss how to employ unlabeled data for building reliable classification systems in three scenarios: (1) only poorly related unlabeled data are available; (2) good-quality unlabeled data are mixed with irrelevant data and there is no prior knowledge of their composition; and (3) no unlabeled data are available, but they can be retrieved from the Internet for text categorization. We build several frameworks to deal with these cases. Firstly, we present a study on how to deal with weakly related unlabeled data, called the Supervised Self-taught Learning framework, which can transfer knowledge from the unlabeled data actively. The proposed model is able to select those discriminative features or representations that are more appropriate for classification. Secondly, we propose a novel framework that can learn from a mixture of unlabeled data, where good-quality unlabeled data are mixed with unlabeled irrelevant samples. Moreover, we do not need prior knowledge of which data samples are relevant or irrelevant. Consequently, it differs significantly from the recent framework of semi-supervised learning with universum and from the Universum Support Vector Machine. As an important contribution, we have successfully formulated this new learning approach as a semi-definite programming problem, which can be solved in polynomial time. A series of experiments demonstrates that this novel framework has advantages over semi-supervised learning on both synthetic and real data in many respects. Finally, for the third scenario, we present a general framework for semi-supervised text categorization that collects unlabeled documents via Web search engines and utilizes them to improve the accuracy of supervised text categorization. Extensive experiments have demonstrated that the proposed semi-supervised text categorization framework can significantly improve classification accuracy.
Specifically, the classification error is reduced by 30% averaged over the nine data sets when using Google as the search engine. / We consider the problem of learning from both labeled and unlabeled data through an analysis of the quality of the unlabeled data. Usually, learning from both labeled and unlabeled data is regarded as semi-supervised learning, where the unlabeled data and the labeled data are assumed to be generated from the same distribution. When this assumption is not satisfied, new learning paradigms are needed in order to effectively exploit the information underlying the unlabeled data. This thesis consists of two parts: the first part analyzes the fundamental assumptions of semi-supervised learning and proposes a few efficient semi-supervised learning models; the second part discusses three learning frameworks that deal with the case where unlabeled data do not satisfy the conditions of semi-supervised learning. / Xu, Zenglin. / Advisers: Irwin King; Michael R. Lyu. / Source: Dissertation Abstracts International, Volume: 70-09, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2009. / Includes bibliographical references (leaves 158-179). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
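The abstract's observation that the TSVM loss on unlabeled data acts as an additional regularizer on the decision boundary can be sketched with a toy example (the data, function names, and one-dimensional setup below are illustrative assumptions, not material from the thesis): the symmetric hinge loss max(0, 1 - |f(x)|) is small only when the boundary keeps unlabeled points out of the margin, so it favors boundaries passing through low-density regions.

```python
# Hypothetical 1-D illustration: the TSVM-style loss on unlabeled points
# penalizes decision boundaries that cut through dense regions of data.

def unlabeled_hinge_loss(xs, b):
    """Sum of symmetric hinge losses for f(x) = x - b over unlabeled points."""
    return sum(max(0.0, 1.0 - abs(x - b)) for x in xs)

# Two dense clusters of unlabeled points, around -2 and +2.
xs = [-2.5, -2.2, -1.8, 1.8, 2.2, 2.5]

loss_gap = unlabeled_hinge_loss(xs, 0.0)      # boundary in the low-density gap
loss_cluster = unlabeled_hinge_loss(xs, 2.0)  # boundary through a cluster

print(loss_gap, loss_cluster)  # the gap boundary incurs far less loss
```

Minimizing this term alongside the labeled loss pushes the boundary into low-density regions, which is the same effect the manifold regularizer achieves, consistent with the unified view described above.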
324

Generalized regularized learning. / 廣義正則化學習 / CUHK electronic theses & dissertations collection / Guang yi zheng ze hua xue xi

January 2007
A classical algorithm in classification is the support vector machine (SVM). Based on Vapnik's statistical learning theory, it tries to find a linear boundary with maximum margin to separate the given data into different classes. In the non-separable case, SVM uses a kernel trick to map the data onto a feature space and finds a linear boundary in the new space. / Different algorithms are derived from the framework. When the empirical error is defined by a quadratic loss, we have the generalized regularized least-squares learning algorithm. When the idea is applied to SVM, we obtain the semi-parametric SVM algorithm. We also derive a third algorithm, which generalizes the kernel logistic regression algorithm. / How should non-regularized features be chosen? We present some empirical studies: we use dimensionality reduction techniques in text categorization, extract non-regularized intrinsic features for the high-dimensional data, and report improved results. / Instead of interpreting SVM's behavior via Vapnik's theory, our work follows the regularized learning viewpoint. In regularized learning, one tries to find a solution from a function space that has small empirical error in explaining the input-output relationship for the training data, while keeping the solution simple. / To enforce simplicity, the complexity of the solution is penalized, involving all features in the function space. An equal penalty, as in standard regularized learning, is reasonable when the significance of individual features is unknown. But what if we have prior knowledge that some features are more important than others? Instead of penalizing all features, we study a generalized regularized learning framework in which part of the function space is not penalized, and derive its corresponding solution. / Two of the generalized algorithms need to solve positive definite linear systems to obtain their parameters. How can a large-scale linear system be solved efficiently?
Unlike previous work in machine learning, which generally resorts to the conjugate gradient method, our work proposes a domain decomposition approach. New interpretations and improved results are reported accordingly. / Li, Wenye. / "September 2007." / Advisers: Kwong-Sak Leung; Kin-Hong Lee. / Source: Dissertation Abstracts International, Volume: 69-08, Section: B, page: 4850. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (p. 101-109). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
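The generalized regularized least-squares idea described above, penalizing only part of the feature set, admits a compact sketch (a minimal illustration under assumed notation, not the thesis's actual formulation): with a diagonal indicator matrix D marking the penalized features, the solution satisfies the normal equations (XᵀX + λD)w = Xᵀy, so features with a zero entry in D are left unregularized.

```python
import numpy as np

def generalized_ridge(X, y, lam, penalized):
    """Least squares with a ridge penalty applied only to selected coefficients.

    Solves min_w ||X w - y||^2 + lam * sum_{j penalized} w_j^2 via the
    normal equations (X'X + lam*D) w = X'y, where D is diagonal with
    1 for penalized features and 0 for non-regularized ones.
    """
    D = np.diag(np.asarray(penalized, dtype=float))
    return np.linalg.solve(X.T @ X + lam * D, X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=50)

# Leave the first (prior-knowledge) feature non-regularized; penalize the rest.
w = generalized_ridge(X, y, lam=10.0, penalized=[0, 1, 1])
```

Setting every entry of `penalized` to 1 recovers ordinary ridge regression, so the unpenalized subspace is the only new ingredient.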
325

Fast Graph Laplacian regularized kernel learning via semidefinite-quadratic-linear programming.

January 2011
Wu, Xiaoming. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2011. / Includes bibliographical references (p. 30-34). / Abstracts in English and Chinese. / Abstract --- p.i / Acknowledgement --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 2 --- Preliminaries --- p.4 / Chapter 2.1 --- Kernel Learning Theory --- p.4 / Chapter 2.1.1 --- Positive Semidefinite Kernel --- p.4 / Chapter 2.1.2 --- The Reproducing Kernel Map --- p.6 / Chapter 2.1.3 --- Kernel Tricks --- p.7 / Chapter 2.2 --- Spectral Graph Theory --- p.8 / Chapter 2.2.1 --- Graph Laplacian --- p.8 / Chapter 2.2.2 --- Eigenvectors of Graph Laplacian --- p.9 / Chapter 2.3 --- Convex Optimization --- p.10 / Chapter 2.3.1 --- From Linear to Conic Programming --- p.11 / Chapter 2.3.2 --- Second-Order Cone Programming --- p.12 / Chapter 2.3.3 --- Semidefinite Programming --- p.12 / Chapter 3 --- Fast Graph Laplacian Regularized Kernel Learning --- p.14 / Chapter 3.1 --- The Problems --- p.14 / Chapter 3.1.1 --- MVU --- p.16 / Chapter 3.1.2 --- PCP --- p.17 / Chapter 3.1.3 --- Low-Rank Approximation: from SDP to QSDP --- p.18 / Chapter 3.2 --- Previous Approach: from QSDP to SDP --- p.20 / Chapter 3.3 --- Our Formulation: from QSDP to SQLP --- p.21 / Chapter 3.4 --- Experimental Results --- p.23 / Chapter 3.4.1 --- The Results --- p.25 / Chapter 4 --- Conclusion --- p.28 / Bibliography --- p.30
326

Distributionally Robust Optimization and its Applications in Machine Learning

Kang, Yang January 2017
The goal of Distributionally Robust Optimization (DRO) is to minimize the cost of running a stochastic system, under the assumption that an adversary can replace the underlying baseline stochastic model by another model within a family known as the distributional uncertainty region. This dissertation focuses on a class of DRO problems which are data-driven, meaning that the baseline stochastic model corresponds to the empirical distribution of a given sample. One of the main contributions of this dissertation is to show that the class of data-driven DRO problems that we study unifies many successful machine learning algorithms, including square-root Lasso, support vector machines, and generalized logistic regression, among others. A key distinctive feature of the class of DRO problems that we consider here is that our distributional uncertainty region is based on optimal transport costs. In contrast, most of the DRO formulations that exist to date rely on a likelihood-based formulation (such as the Kullback-Leibler divergence, among others). Optimal transport costs include as a special case the so-called Wasserstein distance, which is popular in various statistical applications. The use of optimal transport costs is advantageous relative to divergence-based formulations because the region of distributional uncertainty contains distributions which explore samples outside of the support of the empirical measure, thereby explaining why many machine learning algorithms have the ability to improve generalization. Moreover, the DRO representations that we use to unify the previously mentioned machine learning algorithms provide a clear interpretation of the so-called regularization parameter, which is known to play a crucial role in controlling generalization error. As we establish, the regularization parameter corresponds exactly to the size of the distributional uncertainty region.
Another contribution of this dissertation is the development of statistical methodology to study data-driven DRO formulations based on optimal transport costs. Using this theory, for example, we provide a sharp characterization of the optimal selection of regularization parameters in machine learning settings such as square-root Lasso and regularized logistic regression. Our statistical methodology relies on the construction of a key object which we call the robust Wasserstein profile function (RWP function). The RWP function is similar in spirit to the empirical likelihood profile function in the context of empirical likelihood (EL). But the asymptotic analysis of the RWP function is different because of a certain lack of smoothness which arises in a suitable Lagrangian formulation. Optimal transport costs have many advantages in terms of statistical modeling. For example, we show how to define a class of novel semi-supervised learning estimators which are natural companions of the standard supervised counterparts (such as square-root Lasso, support vector machines, and logistic regression). We also show how to define the distributional uncertainty region in a purely data-driven way. Precisely, the optimal transport formulation allows us to inform the shape of the distributional uncertainty, not only its center (which is given by the empirical distribution). This shape is informed by establishing connections to the metric learning literature. We develop a class of metric learning algorithms which are based on robust optimization. We use the robust-optimization-based metric learning algorithms to inform the distributional uncertainty region in our data-driven DRO problem. This means that we endow the adversary with additional structure, forcing him to spend effort on regions of importance, to further improve the generalization properties of machine learning algorithms.
In summary, we explain how the use of optimal transport costs allows us to construct what we call double-robust statistical procedures. We test all of the procedures proposed in this dissertation on various data sets, showing significant improvement in generalization ability over a wide range of state-of-the-art procedures. Finally, we also discuss a class of stochastic optimization algorithms of independent interest which are particularly useful for solving DRO problems, especially those which arise when the distributional uncertainty region is based on optimal transport costs.
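The square-root Lasso mentioned above, one of the estimators the dissertation recovers as a data-driven DRO problem, can be sketched directly as its regularized objective (a toy illustration; the data and the generic derivative-free solver below are assumptions, not the author's method). In the DRO reading, the L1 weight `lam` plays the role of the radius of the distributional uncertainty region.

```python
import numpy as np
from scipy.optimize import minimize

def sqrt_lasso_objective(beta, X, y, lam):
    """Square-root Lasso: RMSE plus an L1 penalty; in the DRO view,
    lam corresponds to the size of the uncertainty region."""
    beta = np.asarray(beta)
    rmse = np.sqrt(np.mean((y - X @ beta) ** 2))
    return rmse + lam * np.sum(np.abs(beta))

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
beta_true = np.array([1.5, 0.0, -2.0])
y = X @ beta_true + 0.1 * rng.normal(size=40)

beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]  # unregularized starting point
res = minimize(sqrt_lasso_objective, beta_ls,
               args=(X, y, 0.1), method="Powell")  # handles the nonsmooth L1 term
```

Dedicated solvers exploit the objective's conic structure; Powell is used here only to keep the sketch self-contained.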
327

Multiscale Modeling of Adsorbate Interactions on Transition Metal Alloy Surfaces

Boes, Jacob Russell 01 April 2017
Transition metals represent some of the first catalysts used in industrial processes and are still used today to produce many of the most needed chemicals. Following ancient metallurgical practice, the performance of these basic transition metals can be refined by adding multiple components. Since that time, improvements to these alloy catalysts have been mostly incremental, due to the difficulty of producing new catalysts experimentally and a lack of fundamental understanding of the underlying physics. More recently, computational chemistry has proven itself an increasingly effective means of identifying that underlying physics. Through the d-band model of adsorbate-surface interactions, basic adsorption characteristics can be predicted across transition metals with limited initial information. However, although these models function well as high-level screening tools, much work remains before optimal catalysts can be comfortably designed from properties which experimentalists can directly control. This remains particularly challenging for alloy modeling, primarily due to the large number of possible atomic configurations, even for two-metal systems. This work focuses on developing methods for modeling optimal reaction properties at the surface of a transition metal alloy. Based on thermodynamic equilibrium between the surface, bulk, and gas reservoir, segregation under vacuum and adsorbate conditions can be predicted. Furthermore, by relating strain in the bulk lattice constant to the adsorption energies of varying local active sites, the optimal surface compositions can be related to bulk composition, a feature which can easily be selected for. Although useful for identifying trends across bulk composition space, these methods are limited to a small subset of active-site configurations.
To capture the complexity of more sophisticated processes, such as segregation, longer-timescale methods are required. Traditional computational tools are often too expensive for these methods, so such simulations are usually carried out with less-accurate potentials. In this work, we demonstrate that machine learning techniques offer improved accuracy compared with physical potentials. We then demonstrate how this improved accuracy can lead to experimentally accurate predictions of segregation.
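The contrast drawn above between rigid physical potentials and more flexible machine-learned surrogates can be illustrated with a toy fit (everything here, including the Morse-like target and the choice of plain polynomial regression as the "learned" model, is an illustrative assumption, not the thesis's method): a harmonic form cannot capture the anharmonicity of a bond-energy curve, while a flexible surrogate fit to the same samples can.

```python
import numpy as np

# Toy bond-energy curve (Morse-like form; parameters are arbitrary).
def morse(r, D=1.0, a=1.5, r0=1.0):
    return D * (1.0 - np.exp(-a * (r - r0))) ** 2

r = np.linspace(0.6, 3.0, 200)
E = morse(r)

# "Physical" potential: a rigid harmonic (quadratic) form fit to the samples.
harmonic_fit = np.polyval(np.polyfit(r, E, 2), r)
# "Machine-learned" surrogate: a much more flexible model fit to the same data.
ml_fit = np.polyval(np.polyfit(r, E, 8), r)

rmse = lambda a, b: np.sqrt(np.mean((a - b) ** 2))
print(rmse(harmonic_fit, E), rmse(ml_fit, E))  # the surrogate fits far better
```

Real machine-learned potentials use far richer descriptors than a single bond length, but the trade-off sketched here, flexibility versus a fixed functional form, is the same.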
328

Kernel based learning methods for pattern and feature analysis

Wu, Zhili 01 January 2004
No description available.
329

On feature selection, kernel learning and pairwise constraints for clustering analysis

Zeng, Hong 01 January 2009
No description available.
330

Studying the ability of finding single and interaction effects with Random Forest, and its application in psychiatric genetics

Neira Gonzalez, Lara Andrea January 2018
Psychotic disorders such as schizophrenia and bipolar disorder have a strong genetic component. The aetiology of psychoses is known to be complex, including additive effects from multiple susceptibility genes, interactions between genes, environmental risk factors, and gene-by-environment interactions. With the development of new technologies such as genome-wide association studies and imputation of ungenotyped variants, the amount of genomic data has increased dramatically, necessitating the use of machine learning techniques. Random Forest has been widely used to study the underlying genetic factors of psychiatric disorders, such as epistasis and gene-gene interactions. Several authors have investigated the ability of this algorithm to find single and interaction effects, but have reported contradictory results. Therefore, in order to examine Random Forest's ability to detect single and interaction effects with different variable importance measures, I conducted a simulation study assessing whether the algorithm was able to detect single and interaction models under different correlation conditions. The results suggest that the optimal variable importance measure to use in real situations under correlation is the unconditional unscaled permutation variable importance. Several studies have shown bias in one of the most popular variable importance measures, the Gini importance. Hence, in a second simulation study, I examined whether the Gini variable importance is influenced by the variability of predictors, the precision of measuring them, and the variability of the error. Evidence of other biases in this variable importance measure was found. The results from the first simulation study were used to investigate whether genes related to 29 molecular biomarkers, which have been associated with schizophrenia, influence risk for schizophrenia in a case-control study of 26,476 cases and 31,804 controls from 39 different European-ancestry cohorts.
Single effects from the ACAT2 and TNC genes were found to contribute risk for schizophrenia. ACAT2 is a gene on chromosome 6 which is related to energy metabolism; transcriptional differences have been shown in schizophrenia brain-tissue studies. TNC is expressed in the brain, where it is involved in the migration of neurons and axons. In addition, we also used the simulation results to examine whether interactions between genes associated with abnormal emotion/affect behaviour influence risk for psychosis and cognition in humans, in a case-control study of 2,049 cases and 1,794 controls. Before correcting for multiple testing, significant interactions between CRHR1 and ESR1, between MAPT and ESR1, among CRHR1, ESR1 and TOM1L2, and among MAPT, ESR1 and TOM1L2 were observed in the abnormal fear/anxiety-related behaviour pathway. There was no evidence for epistasis after Bonferroni correction.
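The unconditional unscaled permutation importance that the first simulation study recommends can be sketched directly (a minimal illustration on synthetic data, not the thesis's pipeline): it is simply the drop in accuracy when one predictor's values are shuffled, breaking its association with the outcome, with no rescaling by the standard error.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: one informative predictor, one pure-noise predictor.
rng = np.random.default_rng(0)
n = 400
x_signal = rng.normal(size=n)
x_noise = rng.normal(size=n)
X = np.column_stack([x_signal, x_noise])
y = (x_signal > 0).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
baseline = forest.score(X, y)

# Unscaled permutation importance: accuracy drop after shuffling each column.
importances = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importances.append(baseline - forest.score(Xp, y))

print(importances)  # the informative predictor shows a much larger drop
```

For honest estimates the drop is usually evaluated on out-of-bag or held-out samples rather than the training set used here; the mechanics are unchanged.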
