Global ETD Search

1	Multivariate analysis of multiproduct market research data Bolton, Richard John January 2000 No description available. 519.5 Projection pursuit; Cluster analysis
2	Redução dimensional de dados de alta dimensão e poucas amostras usando Projection Pursuit / Dimension reduction of datasets with large dimensionalities and few samples using Projection Pursuit Espezua Llerena, Soledad 30 July 2013 Reducing the dimension of datasets is an important step in pattern recognition and machine learning. Projection Pursuit (PP) has emerged as a relevant technique for this purpose: it searches for projections of the data onto low-dimensional spaces in which interesting structures are revealed. Despite the relative success of PP in many dimension reduction problems, the literature shows only limited application to datasets with many features and few samples, such as those generated in molecular biology. This thesis studies ways to exploit the potential of PP in high-dimensional, small-sample problems in order to facilitate the subsequent construction of classifiers. The main contributions are: i) Sequential Projection Pursuit Modified (SPPM), a sequential method for searching projection spaces based on a genetic algorithm and specialized crossover operators; and ii) Block Sequential Projection Pursuit Modified (Block-SPPM) and Whitened Sequential Projection Pursuit Modified (W-SPPM), two strategies for applying SPPM to problems with more attributes than samples. The first is based on partitioning the attributes, while the latter is based on a pre-compaction of the data followed by a projection search. Experimental evaluations on public gene-expression datasets showed that the proposals improve the accuracy of popular classification algorithms relative to several other dimension reduction methods, covering both feature selection and feature extraction, with W-SPPM offering the best compromise between accuracy and computational cost. Classification; Dimensionality reduction; Microarray data; Projection Pursuit
4	Daugiamačių Gauso skirstinių mišinio statistinė analizė, taikant duomenų projektavimą / The Projection-based Statistical Analysis of the Multivariate Gaussian Distribution Mixture Kavaliauskas, Mindaugas 21 January 2005 Problem of the dissertation. Gaussian random variables are very common in practice: if a random variable depends on many additive factors, then by the Central Limit Theorem (provided certain conditions are satisfied) their sum is approximately Gaussian. If the observed random variable belongs to one of several classes, it follows a Gaussian mixture model. Mixtures of Gaussian distributions arise in various fields: biology, medicine, astronomy, military science, and many others. The most important statistical problems are mixture identification and data clustering. For high-dimensional data, these tasks are not completely solved. New methods for parameter estimation of the multivariate Gaussian mixture model and for data clustering are proposed and analysed in the dissertation. Since these problems are much easier to solve in the univariate case, a projection-based approach is used. The aim of the dissertation. The aim of this work is the development of constructive algorithms for distribution analysis and clustering of data from the Gaussian mixture model. Mathematics; Clustering; EM algorithm; Gaussian distribution mixture; Projection pursuit; Projection
6	Simultaneous Adaptive Fractional Discriminant Analysis: Applications to the Face Recognition Problem Draper, John Daniel 19 June 2012 No description available. Statistics; linear discriminant analysis; LDA; fractional LDA; projection pursuit; face recognition
7	Statistical Modeling of High-Dimensional Nonlinear Systems: A Projection Pursuit Solution Swinson, Michael D. 28 November 2005 Despite recent advances in statistics, artificial neural network theory, and machine learning, nonlinear function estimation in high-dimensional space remains a nontrivial problem. As the response surface becomes more complicated and the dimensions of the input data increase, the dreaded "curse of dimensionality" takes hold, rendering the best of function approximation methods ineffective. This thesis takes a novel approach to solving the high-dimensional function estimation problem. In this work, we propose and develop two distinct parametric projection pursuit learning networks with wide-ranging applicability. Included in this work is a discussion of the choice of basis functions used as well as a description of the optimization schemes utilized to find the parameters that enable each network to best approximate a response surface. The essence of these new modeling methodologies is to approximate functions via the superposition of a series of piecewise one-dimensional models that are fit to specific directions, called projection directions. The key to the effectiveness of each model lies in its ability to find efficient projections for reducing the dimensionality of the input space to best fit an underlying response surface. Moreover, each method is capable of effectively selecting appropriate projections from the input data in the presence of relatively high levels of noise. This is accomplished by rigorously examining the theoretical conditions for approximating each solution space and taking full advantage of the principles of optimization to construct a pair of algorithms, each capable of effectively modeling high-dimensional nonlinear response surfaces to a higher degree of accuracy than previously possible.
Prediction Statistical modeling Function approximation High-dimensional modeling Regression Modeling PPLM PPLN Projection pursuit Data mining Nonlinear systems Statistical methods Approximation theory Mathematical optimization
8	Partial EM Procedure for Big-Data Linear Mixed Effects Model, and Generalized PPE for High-Dimensional Data in Julia Cho, Jang Ik 31 August 2018 No description available. Statistics Biostatistics Biomedical Research Health Mining
9	Análise de componentes independentes aplicada à separação de sinais de áudio. / Independent component analysis applied to separation of audio signals. Moreto, Fernando Alves de Lima 19 March 2008 This work studies the Independent Component Analysis (ICA) model for instantaneous mixtures, applied to audio signal (source) separation. Three separation algorithms for instantaneous mixtures are evaluated: FastICA, PP (Projection Pursuit), and PearsonICA. They share two basic principles: the sources must be statistically independent and non-Gaussian. To analyze the separation capability of each algorithm, two groups of experiments were carried out. In the first group, instantaneous mixtures were generated synthetically from predefined audio signals. In addition, instantaneous mixtures were generated from synthetic signals with specific characteristics, to evaluate the behavior of the algorithms in specific situations. In the second group, convolutive mixtures were generated in the acoustics laboratory of LPS at EPUSP. The PP algorithm, based on the Projection Pursuit technique commonly used in exploratory and clustering settings, is proposed for the separation of multiple sources as an alternative to conventional ICA. Although the proposed PP method can be used to separate sources, it cannot be considered an ICA method, and source extraction is not guaranteed. Finally, the experiments validate the studied algorithms. Blind source separation; Cocktail party; Higher order statistics; Independent component analysis; Projection Pursuit; Signal processing
