81 |
Geometric algorithms for component analysis with a view to gene expression data analysis
Journée, Michel 04 June 2009 (has links)
The research reported in this thesis addresses the problem of component analysis, which aims at reducing large data to lower dimensions, to reveal the essential structure of the data. This problem is encountered in almost all areas of science - from physics and biology to finance, economics and psychometrics - where large data sets need to be analyzed.
Several paradigms for component analysis are considered, e.g., principal component analysis, independent component analysis and sparse principal component analysis. Each is naturally formulated as an optimization problem subject to constraints that endow the problem with a well-characterized matrix manifold structure. Component analysis is thus cast in the realm of optimization on matrix manifolds, and algorithms for component analysis are subsequently derived that take advantage of the geometric structure of the problem.
When formalizing component analysis into an optimization framework, three main classes of problems are encountered, for which methods are proposed. We first consider the problem of optimizing a smooth function on the set of n-by-p real matrices with orthonormal columns. Then, a method is proposed to maximize a convex function on a compact manifold, which generalizes to this context the well-known power method that computes the dominant eigenvector of a matrix. Finally, we address the issue of solving problems defined in terms of large positive semidefinite matrices in a numerically efficient manner by using low-rank approximations of such matrices.
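The second class of problems above generalizes the classical power method from the unit sphere to general compact manifolds. As an illustration of the special case being generalized, here is a minimal sketch of the classical power iteration, viewed as maximizing the convex function x ↦ xᵀAx over the unit sphere (the tolerances and names are our own choices, not taken from the thesis):

```python
import numpy as np

def power_method(A, tol=1e-10, max_iter=1000):
    """Classical power iteration: maximize the convex function
    x -> x^T A x over the unit sphere, which converges to the
    dominant eigenvector of a symmetric positive definite A."""
    x = np.random.default_rng(0).standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(max_iter):
        y = A @ x                  # ascent direction for x^T A x
        y /= np.linalg.norm(y)     # retract back onto the sphere
        if np.linalg.norm(y - x) < tol:
            break
        x = y
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
v = power_method(A)                # dominant eigenvector (up to sign)
lam = v @ A @ v                    # Rayleigh quotient estimate
```

The thesis's contribution is precisely to replace the sphere here by other compact matrix manifolds; the sketch only shows the starting point of that generalization.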
The efficiency of the proposed algorithms for component analysis is evaluated on the analysis of gene expression data related to breast cancer, which encode the expression levels of thousands of genes measured in experiments on hundreds of cancerous cells. Such data provide a snapshot of the biological processes that occur in tumor cells and offer huge opportunities for an improved understanding of cancer. Thanks to an original framework for evaluating the biological significance of a set of components, both well-known and novel knowledge is inferred about the biological processes that underlie breast cancer.
Hence, to summarize the thesis in one sentence: we adopt a geometric point of view to propose optimization algorithms for component analysis which, applied to large gene expression data sets, reveal novel biological knowledge.
|
82 |
Holistic Face Recognition By Dimension Reduction
Gul, Ahmet Bahtiyar 01 January 2003 (links) (PDF)
Face recognition is a popular research area with many different approaches studied in the literature. In this thesis, a holistic Principal Component Analysis (PCA) based method, namely the Eigenface method, is studied in detail and three methods based on it are compared: Bayesian PCA, where a Bayesian classifier is applied after dimension reduction with PCA; Subspace Linear Discriminant Analysis (LDA), where LDA is applied after PCA; and Eigenface, where a Nearest Mean Classifier is applied after PCA. All three methods are implemented on the Olivetti Research Laboratory (ORL) face database, the Face Recognition Technology (FERET) database and the CNN-TURK Speakers face database. The results are compared with respect to the effects of changes in illumination, pose and aging. Simulation results show that Subspace LDA and Bayesian PCA perform slightly better than PCA under changes in pose; however, although they outperform PCA, even Subspace LDA and Bayesian PCA do not perform well under changes in illumination and aging.
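The Eigenface pipeline compared above (PCA dimension reduction followed by a Nearest Mean Classifier) can be sketched as follows. The toy 4-pixel "faces" and all names are illustrative assumptions, not the ORL/FERET setup of the thesis:

```python
import numpy as np

def pca_fit(X, k):
    """PCA via thin SVD: X is (n_samples, n_pixels); returns the mean face
    and the top-k principal directions (the 'eigenfaces')."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def nearest_mean_classify(z, class_means):
    """Assign projected sample z to the class whose mean projection is
    closest in Euclidean distance (the Nearest Mean Classifier)."""
    labels = list(class_means)
    d = [np.linalg.norm(z - class_means[c]) for c in labels]
    return labels[int(np.argmin(d))]

# toy data: two "identities" of 4-pixel faces (illustrative only)
rng = np.random.default_rng(1)
Xa = rng.normal(0.0, 0.1, (5, 4))
Xb = rng.normal(1.0, 0.1, (5, 4))
X = np.vstack([Xa, Xb])
mu, W = pca_fit(X, k=2)
proj = (X - mu) @ W.T                       # PCA coordinates of the gallery
means = {"a": proj[:5].mean(axis=0), "b": proj[5:].mean(axis=0)}
query = (rng.normal(1.0, 0.1, 4) - mu) @ W.T
```

A probe image is projected with the same `mu` and `W` and then labeled by `nearest_mean_classify`; Bayesian PCA and Subspace LDA differ from this sketch only in the classifier applied after the PCA step.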
|
83 |
Boolean factor analysis: a review of a novel method of matrix decomposition and neural network Boolean factor analysis
Upadrasta, Bharat. January 2009 (links)
Thesis (M.S.)--State University of New York at Binghamton, Thomas J. Watson School of Engineering and Applied Science, Department of Systems Science and Industrial Engineering, 2009. / Includes bibliographical references.
|
84 |
Acquisition et traitement d’images hyperspectrales pour l’aide à la visualisation peropératoire de tissus vitaux / Acquisition and processing of hyperspectral images for assisted intraoperative visualization of vital tissues
Nouri Kridiss, Dorra 26 May 2014 (links)
Hyperspectral imaging, originally developed for remote sensing, is becoming a new medical imaging modality that can assist the diagnosis of several diseases through the detection of the tumoral margins of cancers or the measurement of tissue oxygenation. The originality of this work is to provide the surgeon, during surgery, with an improved view of the operative field via an RGB image displayed on screen, obtained by processing hyperspectral cubes in the visible, near-infrared and mid-infrared ranges (400-1700 nm). Our application allows the detection of vital tissues that are hard to see, such as the ureter. Two hyperspectral imaging prototypes using liquid crystal tunable filters were developed, calibrated and deployed in many preclinical experimental campaigns. The results presented in this thesis support the conclusion that band selection methods are the most suitable for interventional use of hyperspectral imaging in the operating room: they retain a maximal amount of information, give a more natural rendering of the resulting RGB image, and maximally improve the visualization of the surgical scene, tripling the contrast between the tissue of interest and the surrounding tissues compared with the image seen by the surgeon’s eye. The main drawback of these methods lies in their execution time, which was significantly improved by the proposed combined methods. Furthermore, the mid-infrared spectral range proved more discriminating for exploring hyperspectral data associated with the ureter, since the separability between tissues there is significantly higher than in the visible range.
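The band-selection-to-RGB idea described above can be sketched as follows. The Fisher-ratio contrast criterion, the toy cube and all names are illustrative assumptions, not the thesis's actual selection methods:

```python
import numpy as np

def select_bands_by_contrast(cube, mask, n_bands=3):
    """Rank spectral bands by a Fisher-like contrast between the tissue of
    interest (mask==True) and the surrounding tissue; keep the top n_bands.
    cube: (H, W, B) hyperspectral cube; mask: (H, W) boolean."""
    fg = cube[mask]                 # (n_fg, B) spectra of the target tissue
    bg = cube[~mask]                # (n_bg, B) spectra of the surroundings
    contrast = (fg.mean(0) - bg.mean(0))**2 / (fg.var(0) + bg.var(0) + 1e-12)
    return np.argsort(contrast)[::-1][:n_bands]

def compose_rgb(cube, bands):
    """Map three selected bands to the R, G, B channels, scaled to [0, 1]."""
    img = cube[:, :, bands].astype(float)
    img -= img.min()
    img /= img.max() + 1e-12
    return img

# toy cube: band 7 carries the tissue/background contrast
rng = np.random.default_rng(0)
cube = rng.normal(0.5, 0.05, (8, 8, 10))
mask = np.zeros((8, 8), bool)
mask[2:6, 2:6] = True               # stand-in for the tissue of interest
cube[mask, 7] += 0.4                # inject contrast into one band
bands = select_bands_by_contrast(cube, mask)
rgb = compose_rgb(cube, bands)
```

Displaying `rgb` on screen is the step that replaces (or augments) what the surgeon's eye sees; the thesis evaluates several such selection criteria and their combinations against execution time.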
|
85 |
Uma análise funcional da dinâmica de densidades de retornos financeiros / A functional analysis of the dynamics of financial return densities
Horta, Eduardo de Oliveira January 2011 (links)
Adequate specification of the probability density functions (pdf’s) of asset returns is a most relevant topic in the econometric modelling of financial data. This dissertation offers a distinct approach to this problem by applying the methodology developed in Bathia et al. (2010) to intraday Bovespa index data. The approach consists in focusing the analysis directly on the dynamic structure of the returns’ pdf’s, viewing them as a sequence of random variables taking values in a function space. The serial dependence of these curves allows one to obtain filtered estimates of the pdf’s, and even to forecast densities of periods beyond the sample. In the paper included in this dissertation, evidence is found that the dynamic structure of the Bovespa index return pdf’s reduces to a two-dimensional process, which is well represented by a VAR(1) model and whose dynamics affect the dispersion and asymmetry of the distributions from day to day. Moreover, one-step-ahead forecasts of the pdf’s were constructed on subsamples and evaluated according to appropriate metrics.
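The VAR(1) step of the analysis above can be sketched with a plain least-squares fit on a simulated two-dimensional score series; the coefficient matrix and all names are illustrative, not estimates from the Bovespa data:

```python
import numpy as np

def fit_var1(Y):
    """Least-squares fit of Y_t = A Y_{t-1} + e_t for a (T, d) series Y."""
    X, Z = Y[:-1], Y[1:]
    B, *_ = np.linalg.lstsq(X, Z, rcond=None)   # solves X @ B ≈ Z
    return B.T                                   # so that Z_t ≈ A @ X_t

def forecast_one_step(A, y_last):
    """One-step-ahead point forecast of the score vector."""
    return A @ y_last

# toy 2-dimensional score series generated by a known VAR(1)
rng = np.random.default_rng(0)
A_true = np.array([[0.6, 0.1], [-0.2, 0.5]])
Y = np.zeros((500, 2))
for t in range(1, 500):
    Y[t] = A_true @ Y[t - 1] + rng.normal(0, 0.1, 2)

A_hat = fit_var1(Y)
y_next = forecast_one_step(A_hat, Y[-1])
```

In the dissertation's setting, `Y` would hold the two functional principal component scores extracted from the sequence of estimated pdf's, and the forecast score vector is mapped back to a density forecast.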
|
88 |
Representação e classificação de texturas da íris baseado na análise discriminante de Fisher bi-dimensional / Representation and classification of iris textures based on two-dimensional Fisher discriminant analysis
Assunção, Eduardo Timóteo de 24 March 2011 (links)
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
The recent advances of information technology and growing security requirements have led to the fast development of intelligent person authentication techniques based on biometric recognition. In this work, iris images from the UBIRIS database are used as biometric measurements in the verification scenario. The literature shows a large variety of feature extraction methods applied to iris recognition, among them subspace methods. The aim of subspace methods is to find a basis of vectors that reduces the dimension of the space and, in some cases, also optimizes class separation. Some of the best-known subspace methods are Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Independent Component Analysis (ICA). In this work we employ two extensions of Fisher Discriminant Analysis (FDA), namely (2D)²FDA and DiaFDA+2DFDA, for feature extraction. In the classification phase, the nearest-neighbor classifier with the Euclidean distance was applied. The results show that the methods perform well, with emphasis on their compression power: they reduce a 200×92 matrix to a 5×5 matrix while achieving an area under the ROC curve (AUC) of 0.99.
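A two-sided projection of the kind that compresses a 200×92 image to a 5×5 feature matrix can be sketched as follows. This sketch uses total-scatter (PCA-like) criteria for the row and column bases rather than the Fisher criterion of (2D)²FDA, so it illustrates the mechanism, not the thesis's method, and all names are illustrative:

```python
import numpy as np

def two_sided_projection(images, r, c):
    """Illustrative two-sided reduction in the spirit of (2D)^2 methods:
    learn a column-space basis L and a row-space basis R from the image
    covariances, mapping each (H, W) image to an (r, c) matrix L^T X R."""
    X = np.stack(images).astype(float)
    mu = X.mean(axis=0)
    Xc = X - mu
    G_col = np.einsum('nij,nkj->ik', Xc, Xc)   # (H, H) scatter over rows
    G_row = np.einsum('nij,nik->jk', Xc, Xc)   # (W, W) scatter over columns
    L = np.linalg.eigh(G_col)[1][:, ::-1][:, :r]   # top-r column directions
    R = np.linalg.eigh(G_row)[1][:, ::-1][:, :c]   # top-c row directions
    return mu, L, R

rng = np.random.default_rng(0)
imgs = rng.normal(size=(20, 200, 92))          # stand-in for 200x92 iris maps
mu, L, R = two_sided_projection(imgs, 5, 5)
Z = L.T @ (imgs[0] - mu) @ R                   # 200x92 -> 5x5 feature matrix
```

Classification then proceeds exactly as in the abstract: the 5×5 feature matrices are compared with the Euclidean distance by a nearest-neighbor rule.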
|
89 |
Diagnostic du colmatage des générateurs de vapeur à l'aide de modèles physiques et statistiques / Steam generators clogging diagnosis through physical and statistical modelling
Girard, Sylvain 17 December 2012 (links)
Steam generators are massive heat exchangers feeding the turbines of pressurised water nuclear power plants. During operation, iron oxide deposits accumulate inside them and gradually obstruct holes intended for the passage of the fluid. This phenomenon, called clogging, causes safety issues, and a diagnosis method is needed to optimise the maintenance strategy that guards against it. The approach investigated in this thesis is the analysis of the dynamic behaviour of steam generators during power transients with a one-dimensional physical model. Two improvements to the existing model were implemented: taking into account flows orthogonal to the modelling axis, and introducing a slip between phases to account for the velocity difference between liquid water and steam. These elements added degrees of freedom that improved the fit of the simulation to plant data. A new calibration and validation methodology was then proposed to guarantee the robustness of the model. The initial inverse problem was ill-posed, because different spatial configurations of clogging can produce identical responses. The relative importance of clogging, depending on its localisation, was estimated by sensitivity analysis with the Sobol' method, after the dimension of the model's functional output had been reduced by principal component analysis. Finally, a dimension reduction technique called sliced inverse regression was used to determine optimal projection subspaces for the diagnosis. Based on this new formulation, a diagnosis method that is more robust and better understood than the existing one was proposed.
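The Sobol' sensitivity analysis mentioned above can be illustrated with the standard pick-freeze Monte Carlo estimator of first-order indices; the additive toy model and all names are our own assumptions, not the steam generator model:

```python
import numpy as np

def sobol_first_order(f, d, n=100_000, seed=0):
    """Pick-freeze Monte Carlo estimate of first-order Sobol' indices for a
    model f with d independent inputs, each uniform on [0, 1]."""
    rng = np.random.default_rng(seed)
    A = rng.random((n, d))
    B = rng.random((n, d))
    yA, yB = f(A), f(B)
    var = yA.var()
    S = np.empty(d)
    for i in range(d):
        ABi = B.copy()
        ABi[:, i] = A[:, i]        # freeze input i at its A-sample value
        S[i] = np.mean(yA * (f(ABi) - yB)) / var
    return S

# additive toy model: analytic first-order indices are a_i^2 / sum_j a_j^2
a = np.array([1.0, 2.0, 0.0])
f = lambda X: X @ a
S = sobol_first_order(f, 3)        # approx. [0.2, 0.8, 0.0]
```

In the thesis's setting, `f` would be a scalar summary of the steam generator model's response (e.g. a principal component score of the functional output) and each input would parameterise the clogging level at one location along the tube bundle.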
|
90 |
Efektivní implementace metod pro redukci dimenze v mnohorozměrné statistice / Efficient implementation of dimension reduction methods for high-dimensional statistics
Pekař, Vojtěch January 2015 (links)
The main goal of this thesis is to make the implementation of a classification method called linear discriminant analysis more efficient. It is a model of multivariate statistics which, given samples and their membership of given groups, attempts to determine the group of a new sample. We focus especially on the high-dimensional case, meaning that the number of variables is higher than the number of samples, so that the problem leads to a singular covariance matrix. If the number of variables is too high, the common methods can be practically impossible to use because of their computational cost. We therefore look at the topic from the perspective of numerical linear algebra and rearrange the obtained tasks into equivalent formulations of much lower dimension. We offer new ways of solving them, provide examples of particular algorithms and discuss their efficiency.
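The reduction idea described above — rearranging a singular high-dimensional problem into an equivalent low-dimensional one — can be sketched for LDA as follows. The thin-SVD reduction, the ridge term and all names are illustrative assumptions, not the thesis's specific algorithms:

```python
import numpy as np

def reduce_then_lda(X, y):
    """When p >> n, the scatter matrices act only on span(X - mean), whose
    dimension is at most n - 1. Project the data onto that span via a thin
    SVD, then solve the generalized eigenproblem in the small space."""
    mu = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    r = int((s > 1e-10).sum())
    Z = (X - mu) @ Vt[:r].T            # n x r coordinates, r <= n - 1
    # within- and between-class scatter in the reduced space
    Sw = np.zeros((r, r))
    Sb = np.zeros((r, r))
    for c in np.unique(y):
        Zc = Z[y == c]
        m = Zc.mean(axis=0)
        Sw += (Zc - m).T @ (Zc - m)
        Sb += len(Zc) * np.outer(m, m)
    # small ridge handles the singular Sw typical of the p >> n setting
    evals, V = np.linalg.eig(np.linalg.solve(Sw + 1e-8 * np.eye(r), Sb))
    order = np.argsort(evals.real)[::-1]
    return mu, Vt[:r].T @ V[:, order].real   # map back to the p-dim space

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (10, 500)),  # 20 samples, 500 variables
               rng.normal(2, 1, (10, 500))])
y = np.array([0] * 10 + [1] * 10)
mu, W = reduce_then_lda(X, y)
scores = (X - mu) @ W[:, 0]                  # leading discriminant direction
```

The point of the sketch is the cost: all eigenproblems are solved in an r×r space with r ≤ n − 1 (here 19×19) instead of the original 500×500, which is exactly the kind of equivalent low-dimensional reformulation the thesis pursues.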
|