Global ETD Search

1	Calibrating recurrent sliding window classifiers for sequential supervised learning Joshi, Saket Subhash 03 October 2003 Sequential supervised learning problems involve assigning a class label to each item in a sequence. Examples include part-of-speech tagging and text-to-speech mapping. A very general-purpose strategy for solving such problems is to construct a recurrent sliding window (RSW) classifier, which maps some window of the input sequence plus some number of previously-predicted items into a prediction for the next item in the sequence. This paper describes a general purpose implementation of RSW classifiers and discusses the highly practical issue of how to choose the size of the input window and the number of previous predictions to incorporate. Experiments on two real-world domains show that the optimal choices vary from one learning algorithm to another. They also depend on the evaluation criterion (number of correctly-predicted items versus number of correctly-predicted whole sequences). We conclude that window sizes must be chosen by cross-validation. The results have implications for the choice of window sizes for other models including hidden Markov models and conditional random fields. / Graduation date: 2004 Supervised learning (Machine learning)
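As a rough illustration of the recurrent sliding window idea described in the abstract above, the following Python sketch builds the feature vector for one position from a symmetric input window plus the most recent predictions, then labels a sequence left to right. The names `rsw_features` and `rsw_predict`, the padding symbol `PAD`, and the symmetric window shape are assumptions made for this example; they are not taken from the thesis, whose implementation may differ.

```python
from typing import Callable, List, Sequence, Tuple

PAD = "_"  # hypothetical padding symbol for positions outside the sequence


def rsw_features(
    xs: Sequence[str],
    prev_preds: Sequence[str],
    t: int,
    half_width: int,
    n_prev: int,
) -> Tuple[str, ...]:
    """Features for position t: input window followed by the last n_prev predictions."""
    # Input window covers positions t - half_width .. t + half_width, padded at the edges.
    window = [
        xs[i] if 0 <= i < len(xs) else PAD
        for i in range(t - half_width, t + half_width + 1)
    ]
    # Previously predicted labels for positions t - n_prev .. t - 1, oldest first.
    history = [
        prev_preds[t - k] if t - k >= 0 else PAD
        for k in range(n_prev, 0, -1)
    ]
    return tuple(window + history)


def rsw_predict(
    xs: Sequence[str],
    classify: Callable[[Tuple[str, ...]], str],
    half_width: int,
    n_prev: int,
) -> List[str]:
    """Label a sequence left to right, feeding earlier predictions back in."""
    preds: List[str] = []
    for t in range(len(xs)):
        preds.append(classify(rsw_features(xs, preds, t, half_width, n_prev)))
    return preds
```

Here `classify` stands in for any trained base learner. In this sketch, `half_width` and `n_prev` correspond to the two sizes that the abstract says must be chosen by cross-validation.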
2	Shrunken learning rates do not improve AdaBoost on benchmark datasets Forrest, Daniel L. K. 30 November 2001 Recent work has shown that AdaBoost can be viewed as an algorithm that maximizes the margin on the training data via functional gradient descent. Under this interpretation, the weight computed by AdaBoost, for each hypothesis generated, can be viewed as a step size parameter in a gradient descent search. Friedman has suggested that shrinking these step sizes could produce improved generalization and reduce overfitting. In a series of experiments, he showed that very small step sizes did indeed reduce overfitting and improve generalization for three variants of Gradient_Boost, his generic functional gradient descent algorithm. For this report, we tested whether reduced learning rates can also improve generalization in AdaBoost. We tested AdaBoost (applied to C4.5 decision trees) with reduced learning rates on 28 benchmark datasets. The results show that reduced learning rates provide no statistically significant improvement on these datasets. We conclude that reduced learning rates cannot be recommended for use with boosted decision trees on datasets similar to these benchmark datasets. / Graduation date: 2002 Supervised learning (Machine learning) Algorithms
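To make the step-size interpretation in the abstract above concrete, here is a minimal Python sketch of one common way to shrink the AdaBoost hypothesis weight: multiply the usual weight by a factor between 0 and 1 before reweighting the training examples. The function names, the `shrinkage` parameter, and the choice of the binary AdaBoost formulation with labels in {-1, +1} are assumptions for illustration; the report's exact procedure is not given in the abstract.

```python
import math
from typing import List, Sequence


def shrunken_alpha(weighted_error: float, shrinkage: float = 1.0) -> float:
    """Standard AdaBoost hypothesis weight, scaled by a shrinkage factor (assumed form)."""
    if not 0.0 < weighted_error < 1.0:
        raise ValueError("weighted_error must lie strictly between 0 and 1")
    alpha = 0.5 * math.log((1.0 - weighted_error) / weighted_error)
    return shrinkage * alpha


def reweight(
    weights: Sequence[float],
    labels: Sequence[int],
    predictions: Sequence[int],
    alpha: float,
) -> List[float]:
    """Exponential reweighting of examples; labels and predictions are -1 or +1."""
    updated = [
        w * math.exp(-alpha * y * h)
        for w, y, h in zip(weights, labels, predictions)
    ]
    total = sum(updated)
    # Normalize so the weights again form a distribution.
    return [w / total for w in updated]
```

With `shrinkage=1.0` this reduces to the ordinary update, in which the misclassified examples end up carrying half of the total weight. A smaller factor moves the distribution less per round, which is the "reduced learning rate" setting the report evaluates.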
3	Protein secondary structure prediction using conditional random fields and profiles / Shen, Rongkun. January 1900 Thesis (M.S.)--Oregon State University, 2006. / Printout. Includes bibliographical references (leaves 42-46). Also available on the World Wide Web.
4	Learning with unlabeled data. / 在未標記的數據中的機器學習 / CUHK electronic theses & dissertations collection / Zai wei biao ji de shu ju zhong de ji qi xue xi January 2009 In the first part, we deal with the unlabeled data that are of good quality and follow the conditions of semi-supervised learning. Firstly, we present a novel method for Transductive Support Vector Machine (TSVM) by relaxing the unknown labels to the continuous variables and reducing the non-convex optimization problem to a convex semi-definite programming problem. In contrast to the previous relaxation method which involves O(n^2) free parameters in the semi-definite matrix, our method takes advantage of reducing the number of free parameters to O(n), so that we can solve the optimization problem more efficiently. In addition, the proposed approach provides a tighter convex relaxation for the optimization problem in TSVM. Empirical studies on benchmark data sets demonstrate that the proposed method is more efficient than the previous semi-definite relaxation method and achieves promising classification results compared with the state-of-the-art methods. Our second contribution is an extended level method proposed to efficiently solve the multiple kernel learning (MKL) problems. In particular, the level method overcomes the drawbacks of both the Semi-Infinite Linear Programming (SILP) method and the Subgradient Descent (SD) method for multiple kernel learning. Our experimental results show that the level method is able to greatly reduce the computational time of MKL over both the SD method and the SILP method. Thirdly, we discuss the connection between two fundamental assumptions in semi-supervised learning. More specifically, we show that the loss on the unlabeled data used by TSVM can be essentially viewed as an additional regularizer for the decision boundary. 
We further show that this additional regularizer induced by the TSVM is closely related to the regularizer introduced by the manifold regularization. Both of them can be viewed as a unified regularization framework for semi-supervised learning. / In the second part, we discuss how to employ the unlabeled data for building reliable classification systems in three scenarios: (1) only poorly-related unlabeled data are available, (2) good quality unlabeled data are mixed with irrelevant data and there is no prior knowledge on their composition, and (3) no unlabeled data are available but can be obtained from the Internet for text categorization. We build several frameworks to deal with the above cases. Firstly, we present a study on how to deal with the weakly-related unlabeled data, called the Supervised Self-taught Learning framework, which can transfer knowledge from the unlabeled data actively. The proposed model is able to select those discriminative features or representations, which are more appropriate for classification. Secondly, we also propose a novel framework that can learn from a mixture of unlabeled data, where good quality unlabeled data are mixed with unlabeled irrelevant samples. Moreover, we do not need the prior knowledge on which data samples are relevant or irrelevant. Consequently it is significantly different from the recent framework of semi-supervised learning with universum and the framework of Universum Support Vector Machine. As an important contribution, we have successfully formulated this new learning approach as a Semi-definite Programming problem, which can be solved in polynomial time. A series of experiments demonstrate that this novel framework has advantages over the semi-supervised learning on both synthetic and real data in many facets. 
Finally, for the third scenario, we present a general framework for semi-supervised text categorization that collects the unlabeled documents via Web search engines and utilizes them to improve the accuracy of supervised text categorization. Extensive experiments have demonstrated that the proposed semi-supervised text categorization framework can significantly improve the classification accuracy. Specifically, the classification error is reduced by 30% on average across the nine data sets when using Google as the search engine. / We consider the problem of learning from both labeled and unlabeled data through the analysis on the quality of the unlabeled data. Usually, learning from both labeled and unlabeled data is regarded as semi-supervised learning, where the unlabeled data and the labeled data are assumed to be generated from the same distribution. When this assumption is not satisfied, new learning paradigms are needed in order to effectively explore the information underneath the unlabeled data. This thesis consists of two parts: the first part analyzes the fundamental assumptions of semi-supervised learning and proposes a few efficient semi-supervised learning models; the second part discusses three learning frameworks in order to deal with the case that unlabeled data do not satisfy the conditions of semi-supervised learning. / Xu, Zenglin. / Advisers: Irwin King; Michael R. Lyu. / Source: Dissertation Abstracts International, Volume: 70-09, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2009. / Includes bibliographical references (leaves 158-179). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307. 
Data mining Supervised learning (Machine learning)
5	Knowledge transfer techniques for dynamic environments Rajan, Suju 28 August 2008 Not available / text Supervised learning (Machine learning) Knowledge management
6	Sequential supervised learning and conditional random fields / Ashenfelter, Adam J. January 1900 Thesis (M.S.)--Oregon State University, 2004. / Typescript (photocopy). Includes bibliographical references (leaves 33-34). Also available on the World Wide Web.
7	Efficient training and feature induction in sequential supervised learning / Hao, Guohua. January 1900 Thesis (Ph. D.)--Oregon State University, 2010. / Printout. Includes bibliographical references (leaves 82-87). Also available on the World Wide Web.
8	Knowledge transfer techniques for dynamic environments Rajan, Suju, January 1900 Thesis (Ph. D.)--University of Texas at Austin, 2006. / Vita. Includes bibliographical references.
9	On surrogate supervision multi-view learning Jin, Gaole 03 December 2012 Data can be represented in multiple views. Traditional multi-view learning methods (e.g., co-training, multi-task learning) focus on improving learning performance using information from the auxiliary view, although information from the target view is sufficient for the learning task. However, this work addresses a semi-supervised case of multi-view learning, the surrogate supervision multi-view learning, where labels are available on limited views and a classifier is obtained on the target view where labels are missing. In surrogate multi-view learning, one cannot obtain a classifier without information from the auxiliary view. To solve this challenging problem, we propose discriminative and generative approaches. / Graduation date: 2013 multi-view learning semi-supervised learning Supervised learning (Machine learning)
10	Graph based semi-supervised learning in computer vision Huang, Ning, January 2009 Thesis (Ph. D.)--Rutgers University, 2009. / "Graduate Program in Biomedical Engineering." Includes bibliographical references (p. 54-55).
