1

[en] ALGORITHMS FOR PARTIAL LEAST SQUARES REGRESSION / [pt] ALGORITMOS PARA REGRESSÃO POR MÍNIMOS QUADRADOS PARCIAIS

RAUL PIERRE RENTERIA 08 January 2004 (has links)
[en] Many problems in machine learning aim to model the complex relationship in a system between input variables X and output variables Y when no theoretical model is available. Partial Least Squares (PLS) regression is a linear method for this kind of problem, suited to the case of many input variables relative to the number of samples. In this thesis we present variants of the classical PLS algorithm designed for large data sets while keeping good predictive power. Among the main results we highlight PPLS (Parallel PLS), an exact parallel version for the case of a single output variable, and DPLS (Direct PLS), a fast, approximate version for the case of more than one output variable. We also present variants that improve predictive quality through a non-linear formulation: LPLS (Lifted PLS), an algorithm for the case of a single output variable based on the theory of kernel functions; KDPLS, a kernel formulation of DPLS; and MKPLS, a multi-kernel algorithm that can yield a more compact model and better predictive power thanks to the use of several kernels in building the model.
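For orientation, the sketch below shows the classical single-output PLS (PLS1) iteration that the thesis's variants (PPLS, DPLS, LPLS, MKPLS) build on. It is an illustrative baseline only, not code from the thesis; the variable names and NumPy implementation are assumptions made here for clarity.

```python
import numpy as np

def pls1(X, y, n_components):
    """Classical single-output PLS (PLS1): extract components of maximal
    covariance with y, deflating X and y after each one."""
    X = X - X.mean(axis=0)              # center predictors
    y = y - y.mean()                    # center response
    W, P, q = [], [], []
    for _ in range(n_components):
        w = X.T @ y                     # weight: direction of max covariance with y
        w /= np.linalg.norm(w)
        t = X @ w                       # score vector
        tt = t @ t
        p = X.T @ t / tt                # X loading
        c = (y @ t) / tt                # y loading (scalar, single output)
        X = X - np.outer(t, p)          # deflate X
        y = y - t * c                   # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # regression coefficients expressed in the original (centered) X space
    return W @ np.linalg.inv(P.T @ W) @ q
```

Prediction then amounts to centering new data with the training means and multiplying by the returned coefficient vector; the thesis's PPLS parallelizes this iteration and DPLS approximates it for multiple outputs.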
2

Design and Analysis of Techniques for Multiple-Instance Learning in the Presence of Balanced and Skewed Class Distributions

Wang, Xiaoguang January 2015 (has links)
With the continuous expansion of data availability in many large-scale, complex, and networked systems, such as surveillance, security, the Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decision-making processes. Existing knowledge discovery and data analysis techniques have shown great success in many real-world applications, such as applying Automatic Target Recognition (ATR) methods to detect targets of interest in imagery, drug activity prediction, and computer vision recognition. Among these techniques, Multiple-Instance (MI) learning differs from standard classification in that its input is a set of bags, each containing many instances. The instances in each bag are not labeled; instead, the bags themselves are labeled. Much work has been done in this area, but some problems remain unaddressed. In this thesis, we focus on two topics in MI learning: (1) investigating the relationship between MI learning and other multiple-pattern learning methods, including multi-view learning, data fusion methods, and multi-kernel SVMs; and (2) dealing with the class imbalance problem in MI learning. For the first topic, three learning frameworks are presented for general MI learning: the first uses multiple-view approaches to deal with the MI problem, the second is a data fusion framework, and the third, an extension of the first, uses multiple-kernel SVMs. Experimental results show that the presented approaches work well on the MI problem. The second topic concerns the imbalanced MI problem. Here we investigate the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. For this problem, we propose three solution frameworks: a data re-sampling framework, a cost-sensitive boosting framework, and an adaptive instance-weighted boosting SVM (named IB_SVM) for MI learning. Experimental results on both benchmark and application datasets show that the proposed frameworks are effective solutions to the imbalance problem in MI learning.
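As a concrete illustration of the MI setting described above (labels attached to bags of instances rather than to individual instances), the sketch below turns each bag into a fixed-length feature vector and trains a standard SVM on the bag labels. This is a common simplification shown only for illustration, not one of the frameworks proposed in the thesis; the toy data, the mean/max summarization, and the use of class weighting for skewed bag labels are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def bag_features(bag):
    """Summarize a bag (n_instances x n_features) by per-feature mean and max,
    giving one fixed-length vector per bag."""
    bag = np.asarray(bag)
    return np.concatenate([bag.mean(axis=0), bag.max(axis=0)])

# Hypothetical toy data: four bags of varying size; labels attach to bags only.
rng = np.random.default_rng(0)
bags = [rng.normal(size=(rng.integers(3, 8), 5)) for _ in range(4)]
labels = np.array([0, 1, 0, 1])

X = np.vstack([bag_features(b) for b in bags])
clf = SVC(kernel="rbf", class_weight="balanced")  # class weighting as a simple nod to skewed bag labels
clf.fit(X, labels)
print(clf.predict(X))
```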
