Global ETD Search

1	[en] ALGORITHMS FOR PARTIAL LEAST SQUARES REGRESSION / [pt] ALGORITMOS PARA REGRESSÃO POR MÍNIMOS QUADRADOS PARCIAIS
RAUL PIERRE RENTERIA, 08 January 2004

Many problems in machine learning aim to model the complex relationship between the input variables X and the output variables Y of a system when no theoretical model is available. Partial Least Squares (PLS) regression is a linear method for this kind of problem, suited to the case where the number of input variables is large compared to the number of samples. In this thesis we present variants of the classical PLS algorithm designed for large data sets while keeping good predictive power. Among the main results we highlight PPLS (Parallel PLS), an exact parallel version for the case of a single output variable, and DPLS (Direct PLS), a fast, approximate version for the case of more than one output variable. We also present variants that improve prediction quality through a nonlinear formulation: LPLS (Lifted PLS), an algorithm for the case of a single output variable based on the theory of kernel functions; KDPLS, a kernel formulation of DPLS; and MKPLS, a multi-kernel algorithm that can yield a more compact model and better prediction quality thanks to the use of several kernels in building the model.

Keywords: NONLINEAR REGRESSION; PARALLELISM; PARTIAL LEAST SQUARES; MULTI-KERNEL
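
For readers unfamiliar with the classical algorithm that the thesis builds on, the following is a minimal sketch of single-output PLS regression (often called PLS1) using the standard deflation scheme. It is not the PPLS, DPLS, LPLS, or MKPLS algorithm from the thesis; the function name `pls1_fit` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit classical single-output PLS (PLS1); illustrative sketch only.

    Returns (coef, intercept) so that predictions are X @ coef + intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # centered working copies
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))  # weight vectors
    P = np.zeros((n_features, n_components))  # X loadings
    q = np.zeros(n_components)                # y loadings
    for k in range(n_components):
        w = Xk.T @ yk                         # direction of max covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                      # nothing left to explain: stop early
            W, P, q = W[:, :k], P[:, :k], q[:k]
            break
        w /= norm
        t = Xk @ w                            # score vector
        tt = t @ t
        p = Xk.T @ t / tt
        q[k] = yk @ t / tt
        Xk = Xk - np.outer(t, p)              # deflate X
        yk = yk - q[k] * t                    # deflate y
        W[:, k], P[:, k] = w, p
    # Map the latent-space model back to coefficients on the original inputs.
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept
```

With as many components as input variables and a full-rank X, this sketch reproduces ordinary least squares; the benefit of PLS comes from using fewer components when inputs outnumber samples.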
2	[en] TS-TARX: TREE STRUCTURED - THRESHOLD AUTOREGRESSION WITH EXTERNAL VARIABLES / [pt] TS-TARX: UM MODELO DE REGRESSÃO COM LIMIARES BASEADO EM ÁRVORE DE DECISÃO
CHRISTIAN NUNES ARANHA, 28 January 2002

This work proposes a new piecewise linear model for extracting knowledge rules from databases. The model is a heuristic based on regression tree analysis, as introduced by Friedman (1979) and discussed in detail by Breiman (1984). The motivation is to bring a new approach that combines statistical modeling techniques with an efficient split search algorithm. The split decision used in the search takes into account information from fitted linear equations and was inspired by the work of Tsay (1989), who suggests a model-building procedure for a nonlinear time series model called TAR (threshold autoregressive model), first proposed by Tong (1978) and discussed in detail by Tong and Lim (1980) and Tong (1983). The TAR model is a piecewise linear model whose main idea is to change the coefficients of a linear autoregressive model according to the value of an observed variable, called the threshold variable. In Tsay's work, the number and location of the potential thresholds were identified by analyzing graphs. The idea here is to make the whole process automatic. The resulting algorithm preserves the recursive least squares (RLS) regression method used in Tsay's work, which makes it possible to test every candidate split of the data. This is perhaps the main advantage of the methodology, given that Cooper (1998), in work on multiple-regime analysis, states that it is not possible to test each break.

Combining the decision tree methodology with the RLS regression technique yields the TS-TARX model (Tree Structured - Threshold AutoRegression with eXternal variables). The procedure searches a binary tree, computing the F statistic for variable selection and the BIC information criterion for model selection. At the end, the algorithm outputs a decision tree (expressed as rules) and the regression equations estimated for each regime of the partition defined by the tree. The main feature of this kind of output is that it is easy to interpret. The work concludes with applications to benchmark databases from the literature and to other data sets that help in understanding the implemented process.

Keywords: DATA MINING; REGRESSION; NONLINEAR REGRESSION; PIECEWISE MODEL; DECISION TREE; REGRESSION TREE; RECURSIVE LEAST SQUARES; EASY INTERPRETATION
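
The idea of testing every candidate split and choosing by BIC can be sketched for a single node as follows. This is a simplified illustration, not the TS-TARX algorithm itself: it assumes batch least squares (`np.linalg.lstsq`) instead of recursive least squares, omits the F-statistic variable selection, and counts the threshold as one extra parameter in the BIC. The names `best_threshold_split`, `sse_of_fit`, `bic`, and `min_leaf` are hypothetical.

```python
import numpy as np

def sse_of_fit(X, y):
    """Residual sum of squares of a least-squares line fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def bic(sse, n, k):
    """Gaussian BIC up to an additive constant; k is the parameter count."""
    return n * np.log(max(sse, 1e-12) / n) + k * np.log(n)

def best_threshold_split(X, y, z, min_leaf=5):
    """Try every midpoint of the threshold variable z as a split.

    Returns (threshold, bic_score); threshold is None when no split
    improves on the BIC of the single unsplit linear model.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    n, p = X.shape
    k = p + 1                                   # slopes plus intercept
    best_thr, best_bic = None, bic(sse_of_fit(X, y), n, k)
    zs = np.unique(z)
    for thr in (zs[:-1] + zs[1:]) / 2:          # midpoints between sorted values
        left = z <= thr
        n_left = int(left.sum())
        if n_left < min_leaf or n - n_left < min_leaf:
            continue                            # keep both regimes estimable
        sse = sse_of_fit(X[left], y[left]) + sse_of_fit(X[~left], y[~left])
        score = bic(sse, n, 2 * k + 1)          # assumption: +1 for the threshold
        if score < best_bic:
            best_thr, best_bic = float(thr), score
    return best_thr, best_bic
```

Applying the same search recursively to each resulting regime would grow a binary tree with one linear equation per leaf, which is the general shape of the output the abstract describes.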
