141

Efficient high-dimensional filtering for image and video processing

Gastal, Eduardo Simões Lopes January 2015
Filtering is arguably the single most important operation in image and video processing. In particular, high-dimensional filters are a fundamental building block for several applications, having recently received considerable attention from the research community. Unfortunately, naive implementations of such an important class of filters are too slow for many practical uses, especially in light of the ever-increasing resolution of digitally captured images. This dissertation describes three novel approaches to efficient high-dimensional filtering: the domain transform, the adaptive manifolds, and a mathematical formulation for recursive filtering of non-uniformly sampled signals. The domain transform defines an isometry between curves on the 2D image manifold in 5D and the real line. It preserves the geodesic distance between points on these curves, adaptively warping the input signal so that high-dimensional geodesic filtering can be performed in linear time. Its computational cost is not affected by the choice of the filter parameters, and the resulting filters are the first to work on color images at arbitrary scales in real time, without resorting to subsampling or quantization; this enabled the first demonstration of real-time edge-preserving filtering of full-HD color video. The adaptive manifolds compute the filter's response at a reduced set of sampling points, and use these for interpolation at all input pixels, so that high-dimensional Euclidean filtering can be performed in linear time. We show that, for a proper choice of sampling points, the total cost of the filtering operation is linear both in the number of pixels and in the dimension of the space in which the filter operates; ours is the first high-dimensional filter with this complexity. We present formal derivations for the equations that define our filter, providing a sound theoretical justification. Finally, we introduce a mathematical formulation for linear-time recursive filtering of non-uniformly sampled signals. This formulation enables, for the first time, geodesic edge-aware evaluation of arbitrary recursive infinite-impulse-response filters (not only low-pass, but also high-pass and band-pass), which allows practically unlimited control over the shape of the filtering kernel; as examples, we demonstrate geodesic filters with Gaussian, Laplacian-of-Gaussian, Butterworth, and Cauer responses. By providing the ability to experiment with the design and composition of new digital filters, our method has the potential to enable a greater variety of image and video effects. The high-dimensional filters we propose provide the fastest performance (both on CPU and GPU) for a variety of real-world applications, making them a valuable tool for the image and video processing, computer graphics, computer vision, and computational photography communities.
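To make the recursive-filtering idea concrete, here is a minimal 1D grayscale sketch of geodesic filtering via a domain transform, in the spirit of the approach described above. The parameter names (sigma_s, sigma_r) and the single-channel, single-dimension setting are our simplifications, not the dissertation's actual implementation (which handles multi-channel 2D images with alternating horizontal and vertical passes).

```python
import numpy as np

def domain_transform_rf_1d(signal, ref, sigma_s=20.0, sigma_r=0.3, iterations=3):
    """Edge-aware 1D smoothing via a domain transform, recursive-filter style.
    `signal` is filtered; `ref` supplies the edges that stop smoothing."""
    # Derivative of the transform: ct'(x) = 1 + (sigma_s/sigma_r) * |ref'(x)|,
    # so distances grow across strong edges and smoothing stops there.
    dct = 1.0 + (sigma_s / sigma_r) * np.abs(np.diff(ref.astype(float)))
    out = signal.astype(float).copy()
    for i in range(iterations):
        # Shrink sigma per iteration so the total variance adds up to sigma_s^2
        sigma_i = sigma_s * np.sqrt(3.0) * 2.0**(iterations - i - 1) \
                  / np.sqrt(4.0**iterations - 1.0)
        a = np.exp(-np.sqrt(2.0) / sigma_i)   # feedback coefficient of the IIR filter
        w = a ** dct                          # per-neighbour weights in the new domain
        for n in range(1, len(out)):          # causal (left-to-right) pass
            out[n] += w[n - 1] * (out[n - 1] - out[n])
        for n in range(len(out) - 2, -1, -1): # anti-causal (right-to-left) pass
            out[n] += w[n] * (out[n + 1] - out[n])
    return out
```

The cost is one multiply-add per sample per pass, independent of the filter parameters, which is the property the dissertation exploits for real-time performance.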
142

Hyperspectral image classification with support vector machines

Andreola, Rafaela January 2009
This dissertation deals with the application of Support Vector Machines (SVM) to the classification of high-dimensional remote sensing image data. It is well known that, in many cases, classes that are spectrally very similar, and thus not separable using more conventional low-dimensional data, can nevertheless be separated with a high degree of accuracy in high-dimensional spaces. Classification of high-dimensional image data can, however, become a challenging problem for parametric classifiers such as the well-known Gaussian Maximum Likelihood: the large number of variables characterizing hyperspectral images results in an equally large number of parameters to be estimated from a generally limited number of training samples. This condition causes the Hughes phenomenon, a gradual degradation of accuracy as the data dimensionality increases beyond a certain value. Non-parametric classifiers have the advantage of being less sensitive to this dimensionality problem, and SVM has been receiving a great deal of attention from the international community as an efficient classifier. This dissertation analyzes the performance of SVM when applied to remote sensing hyperspectral image data. Initially, the theoretical concepts underlying SVM are reviewed and discussed. Next, a series of experiments using AVIRIS image data is performed with different configurations for the classifier. The data cover a test area established by Purdue University and present a number of classes (agricultural fields) that are spectrally very similar to each other. The classification accuracy produced by different kernels is investigated as a function of the data dimensionality and compared with that yielded by the Gaussian Maximum Likelihood classifier. As SVM applies to one pair of classes at a time, a multi-stage classifier structured as a binary tree was developed to deal with the multi-class problem. The tree is defined by selecting, at each node, the most separable pair of classes using the Bhattacharyya distance as a criterion; these two classes define the two descending nodes and the corresponding SVM decision function. This operation is repeated at every node across the tree until the terminal nodes, each containing a single class, are reached. The required software was developed in the MATLAB environment and is presented in this dissertation. The experimental results support the conclusion that SVM is a valid and effective alternative for the classification of remote sensing hyperspectral image data.
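The node-splitting rule of the binary-tree classifier can be sketched directly from the description above: under a Gaussian model, compute the Bhattacharyya distance for every pair of classes and pick the most separable pair. The thesis' software is in MATLAB; this Python rendering and its function names are our illustration only.

```python
import numpy as np
from itertools import combinations

def bhattacharyya_gaussian(X1, X2):
    """Bhattacharyya distance between two classes under a Gaussian model.
    Assumes enough training samples for nonsingular covariances;
    regularize S1/S2 otherwise."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S1, S2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)
    S = 0.5 * (S1 + S2)
    diff = m1 - m2
    term_mean = 0.125 * diff @ np.linalg.solve(S, diff)
    term_cov = 0.5 * (np.linalg.slogdet(S)[1]
                      - 0.5 * (np.linalg.slogdet(S1)[1] + np.linalg.slogdet(S2)[1]))
    return term_mean + term_cov

def most_separable_pair(data_by_class):
    """Pair of classes with the largest Bhattacharyya distance; in the tree
    classifier this pair defines the SVM decision function at the node."""
    return max(combinations(list(data_by_class), 2),
               key=lambda p: bhattacharyya_gaussian(data_by_class[p[0]],
                                                    data_by_class[p[1]]))
```

Applying this recursively, routing each remaining class to the descending node whose defining class it resembles more, yields the multi-stage tree until every terminal node holds one class.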
143

Probabilistic incremental learning for image recognition : modelling the density of high-dimensional data

Carvalho, Edigleison Francelino January 2014
Nowadays, several sensory systems provide data in streams, and these measured observations are frequently high-dimensional, i.e., the number of measured variables is large and the observations arrive in sequence. This is in particular the case of robot vision systems. Unsupervised and supervised learning with such data streams is challenging, because the algorithm should be capable of learning from each observation and then discarding it before considering the next one, whereas several methods require the whole dataset in order to estimate their parameters and are therefore not suitable for online learning. Furthermore, many approaches suffer from the so-called curse of dimensionality (BELLMAN, 1961) and cannot handle high-dimensional input data. To overcome these problems, this work proposes a new probabilistic and incremental neural network model, called Local Projection Incremental Gaussian Mixture Network (LP-IGMN), which is capable of life-long learning with high-dimensional data: it can learn continuously while preserving the stability of the current model's parameters, and it automatically adjusts its topology taking into account the subspace boundary found by each hidden neuron. The proposed method can find the intrinsic subspace where the data lie, called the principal subspace. Orthogonal to the principal subspace are the dimensions that are noisy or carry little information, i.e., have small variance; these are described by a single estimated parameter. LP-IGMN is therefore robust to different sources of data and can deal with a large number of noisy and/or irrelevant variables in the measured data. To evaluate LP-IGMN, we conducted several experiments using simulated and real datasets, and we demonstrated several applications of our method in image recognition tasks. The results show that LP-IGMN's performance is competitive with, and usually superior to, other state-of-the-art approaches, and that it can be successfully used in applications that require life-long learning in high-dimensional spaces.
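As a rough illustration of the local-projection idea (not the LP-IGMN update rules themselves, which are derived in the thesis), a single hidden unit can maintain an incremental Gaussian estimate and split its covariance into a low-dimensional principal subspace plus one shared noise variance for the orthogonal directions:

```python
import numpy as np

class LocalProjectionUnit:
    """One hidden unit: incremental Gaussian estimate whose covariance is
    split into a q-dimensional principal subspace plus a single shared noise
    variance for the orthogonal directions (a PPCA-style simplification)."""
    def __init__(self, dim, q):
        self.n, self.q = 0, q
        self.mean = np.zeros(dim)
        self.cov = np.zeros((dim, dim))

    def update(self, x):
        """Standard online mean/covariance recurrence; each sample can be
        discarded after this call, as stream learning requires."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.cov += (np.outer(delta, x - self.mean) - self.cov) / self.n

    def project(self):
        # Small ridge keeps the decomposition defined early in the stream
        evals, evecs = np.linalg.eigh(self.cov + 1e-6 * np.eye(len(self.mean)))
        order = np.argsort(evals)[::-1]
        W = evecs[:, order[:self.q]]            # principal subspace
        noise = evals[order[self.q:]].mean()    # one parameter for the rest
        return W, noise
```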
144

Efficient implementation of dimension reduction methods for high-dimensional statistics

Pekař, Vojtěch January 2015
The main goal of this thesis is to make the implementation of a classification method called linear discriminant analysis more efficient. It is a model of multivariate statistics which, given samples and their membership in given groups, attempts to determine the group of a new sample. We focus especially on the high-dimensional case, in which the number of variables is higher than the number of samples and the problem leads to a singular covariance matrix. If the number of variables is too high, the common methods can be practically impossible to use because of their high computational cost. We therefore look at the topic from the perspective of numerical linear algebra and rearrange the resulting tasks into equivalent formulations of much lower dimension. We offer new ways of solving them, provide examples of particular algorithms, and discuss their efficiency.
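The core reduction can be sketched as follows: when variables outnumber samples, all discriminant information lies in the span of the centred data, so a thin SVD yields an equivalent problem of dimension at most n-1. This generic projection is our illustration of the idea; the thesis develops its own, more careful reformulations.

```python
import numpy as np

def span_reduction(X):
    """Project a p-dimensional problem (p >> n) onto the span of the centred
    data before running any standard LDA: the pooled covariance is singular
    in the original space, but class-mean differences lie entirely in this
    span, so the reduced problem of dimension at most n-1 is equivalent."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)   # thin SVD, O(n^2 p)
    r = int(np.sum(s > 1e-10))                          # numerical rank
    Z = Xc @ Vt[:r].T                                   # n x r coordinates, r << p
    return Z, Vt[:r]   # run ordinary LDA on Z; map discriminants back via Vt
```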
145

Big Data Analytics for Fault Detection and its Application in Maintenance

Zhang, Liangwei January 2016
Big Data analytics has attracted intense interest recently for its attempt to extract information, knowledge and wisdom from Big Data. In industry, with the development of sensor technology and Information & Communication Technologies (ICT), reams of high-dimensional, streaming, and nonlinear data are being collected and curated to support decision-making. The detection of faults in these data is an important application in eMaintenance solutions, as it can facilitate maintenance decision-making. Early discovery of system faults may ensure the reliability and safety of industrial systems and reduce the risk of unplanned breakdowns. Complexities in the data, including high dimensionality, fast-flowing data streams, and high nonlinearity, impose stringent challenges on fault detection applications. From the data modelling perspective, high dimensionality may cause the notorious “curse of dimensionality” and lead to deterioration in the accuracy of fault detection algorithms. Fast-flowing data streams require algorithms to give real-time or near real-time responses upon the arrival of new samples. High nonlinearity requires fault detection approaches to have sufficiently expressive power and to avoid overfitting or underfitting problems. Most existing fault detection approaches work in relatively low-dimensional spaces. Theoretical studies on high-dimensional fault detection mainly focus on detecting anomalies on subspace projections. However, these models are either arbitrary in selecting subspaces or computationally intensive. To meet the requirements of fast-flowing data streams, several strategies have been proposed to adapt existing models to an online mode to make them applicable in stream data mining. But few studies have simultaneously tackled the challenges associated with high dimensionality and data streams. Existing nonlinear fault detection approaches cannot provide satisfactory performance in terms of smoothness, effectiveness, robustness and interpretability. New approaches are needed to address this issue. This research develops an Angle-based Subspace Anomaly Detection (ABSAD) approach to fault detection in high-dimensional data. The efficacy of the approach is demonstrated in analytical studies and numerical illustrations. Based on the sliding window strategy, the approach is extended to an online mode to detect faults in high-dimensional data streams. Experiments on synthetic datasets show the online extension can adapt to the time-varying behaviour of the monitored system and, hence, is applicable to dynamic fault detection. To deal with highly nonlinear data, the research proposes an Adaptive Kernel Density-based (Adaptive-KD) anomaly detection approach. Numerical illustrations show the approach’s superiority in terms of smoothness, effectiveness and robustness.
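A hedged sketch of the sliding-window strategy mentioned above; the scoring here is a plain kernel density estimate standing in for the thesis' ABSAD and Adaptive-KD scores, and all names and thresholds are our illustration.

```python
import numpy as np
from collections import deque

class SlidingWindowDetector:
    """Sliding-window fault detector: score each new sample against the last
    `w` samples with a Gaussian-kernel density estimate and flag low-density
    points. The bounded window gives near real-time responses and lets the
    model track time-varying behaviour of the monitored system."""
    def __init__(self, w=200, bandwidth=1.0, threshold=1e-4):
        self.window = deque(maxlen=w)
        self.h, self.threshold = bandwidth, threshold

    def score(self, x):
        X = np.asarray(self.window)
        d2 = ((X - x) ** 2).sum(axis=1) / (2.0 * self.h ** 2)
        return float(np.exp(-d2).mean())    # unnormalised KDE value at x

    def step(self, x):
        is_fault = len(self.window) > 10 and self.score(x) < self.threshold
        self.window.append(np.asarray(x, dtype=float))
        return is_fault
```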
146

Expansion methods for high-dimensional PDEs in finance

Wissmann, Rasmus January 2015
We develop expansion methods as a new computational approach towards high-dimensional partial differential equations (PDEs), particularly of such type as arising in the valuation of financial derivatives. The proposed methods are extended from [41] and use principal component analysis (PCA) of the underlying process in combination with a Taylor expansion of the value function into solutions to low-dimensional PDEs. They enable calculation of highly accurate approximate solutions with computational complexity polynomial in the number of dimensions for PDEs with a low number of dominant principal components. For the case of PDEs with constant coefficients, we show existence of expansion solutions and prove theoretical error bounds. We give a precise characterisation of when our methods can be applied and construct specific examples of a first and second order version. We provide numerical results showing that the empirically observed convergence speeds are in agreement with the theoretical predictions. For the case of PDEs with varying coefficients, we give a heuristic motivation using the Parametrix approach and empirically test the methods' accuracy for a range of variable parameter stock models. We demonstrate the applicability of our expansion methods to real-world securities pricing problems by considering path-dependent and early-exercise options in the LIBOR market model. Using the example of Bermudan swaptions and Ratchet floors, which are considered difficult benchmark problems, we give a careful analysis of the numerical accuracy and computational complexity. We are able to demonstrate that for problems with medium to high dimensionality, around 60-100, and moderate time horizons, the presented PDE methods deliver results comparable in accuracy to benchmark state-of-the-art Monte Carlo methods in similar or (significantly) faster run time.
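The premise that only a few principal components dominate is easy to check numerically. The snippet below uses an exponential correlation matrix as a stand-in for a calibrated model of the underlying process; the numbers are illustrative only, not taken from the thesis.

```python
import numpy as np

# Eigenvalues of the driving covariance determine how many low-dimensional
# PDEs the expansion needs. Exponential correlation across (say) 60 forward
# rates is a common stylised structure, chosen here for illustration.
n = 60
grid = np.arange(n)
corr = np.exp(-0.1 * np.abs(grid[:, None] - grid[None, :]))
evals = np.linalg.eigvalsh(corr)[::-1]            # descending eigenvalues
explained = np.cumsum(evals) / evals.sum()
print("components for 99% of variance:", int(np.searchsorted(explained, 0.99)) + 1)
```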
147

Feature selection stability on high-dimensional data: an application to gene expression data

Dernoncourt, David 15 October 2014
High-throughput technologies allow us to measure very large numbers of variables per individual: DNA sequence, gene expression, lipid profile, and so on. Knowledge discovery can be performed on such data using, for instance, classification methods. However, those data contain a very high number of variables, measured, in the best cases, on a few hundred patients. This makes feature selection a necessary first step to reduce the risk of overfitting, reduce computation time, and improve model interpretability. When the number of observations is low, feature selection tends to be unstable: it is common to observe that two selections obtained from two different datasets dealing with the same problem barely overlap. Yet obtaining a stable selection seems important if we want to be confident that the selected variables are really relevant, with knowledge discovery as the goal. In this work, we first tried to determine which factors have the most influence on feature selection stability. We then proposed a feature selection method, specific to microarray data, that uses functional annotations from Gene Ontology to assist usual feature selection methods by enriching the data with a priori knowledge. We then worked on two aspects of ensemble methods: the choice of the aggregation method, and hybrid ensemble methods. In the final chapter, we apply the methods studied in the thesis to a dataset from our lab on predicting weight regain after a diet, from microarray data, in obese patients.
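Selection stability of the kind studied here is commonly quantified with a chance-corrected consistency index over repeated selections. The sketch below uses the Kuncheva index as one standard choice; whether it matches the thesis' exact measure is not established here, and the example data are synthetic.

```python
import numpy as np
from itertools import combinations

def kuncheva_stability(selections, p):
    """Average pairwise Kuncheva consistency of equal-size feature subsets
    drawn from p variables; 0 is chance-level overlap, 1 is identical sets."""
    scores = []
    for A, B in combinations(selections, 2):
        k = len(A)
        r = len(set(A) & set(B))                          # selected by both
        scores.append((r - k * k / p) / (k - k * k / p))  # chance-corrected
    return float(np.mean(scores))

# Example: selections from 5 bootstrap resamples of a 10000-variable dataset.
# Random subsets score near 0, illustrating chance-level instability.
rng = np.random.default_rng(0)
sels = [rng.choice(10000, size=50, replace=False) for _ in range(5)]
print(kuncheva_stability(sels, p=10000))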
148

Modeling of High-Dimensional Industrial Data for Enhanced PHM using Time Series Based Integrated Fusion and Filtering Techniques

Cai, Haoshu 25 May 2022
No description available.
149

Forecasting the Business Cycle using Partial Least Squares

Lannsjö, Fredrik January 2014
Partial Least Squares (PLS) is both a regression method and a tool for variable selection that is especially appropriate for models based on numerous (possibly correlated) variables. While it is a well-established modeling tool in chemometrics, this thesis adapts PLS to financial data to predict the movements of the business cycle, represented by the OECD Composite Leading Indicators. High-dimensional data are used, and a model with automated variable selection through a genetic algorithm is developed that forecasts different economic regions with good results in out-of-sample tests.
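A minimal sketch of the PLS forecasting setup, with synthetic stand-ins for the financial predictors and the CLI target; the thesis' genetic-algorithm variable selection is omitted, and all sizes here are made up for illustration.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Synthetic stand-ins: 120 possibly correlated predictors, of which only a
# few drive the target (playing the role of the OECD CLI movements).
rng = np.random.default_rng(1)
X = rng.standard_normal((300, 120))
beta = np.zeros(120)
beta[:5] = 1.0
y = X @ beta + 0.5 * rng.standard_normal(300)

train, test = slice(0, 240), slice(240, 300)    # out-of-sample split
pls = PLSRegression(n_components=3).fit(X[train], y[train])
pred = pls.predict(X[test]).ravel()
print("out-of-sample correlation:", np.corrcoef(pred, y[test])[0, 1])
```

PLS compresses the many correlated predictors into a few latent components chosen for their covariance with the target, which is what makes it suitable for this high-dimensional setting.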
150

Bilinear Gaussian Radial Basis Function Networks for classification of repeated measurements

Sjödin Hällstrand, Andreas January 2020
The Growth Curve Model is a bilinear statistical model which can be used to analyse several groups of repeated measurements. Normally, the Growth Curve Model is defined in such a way that the permitted sampling frequency of the repeated measurements is limited by the number of observed individuals in the data set. In this thesis, we examine the possibility of utilizing measurements sampled at high frequency to increase classification accuracy on real-world data; that is, we look at the case where the regular Growth Curve Model is not defined because of the relationship between the sampling frequency and the number of observed individuals. For this high-frequency data, we develop a new method of basis selection for the regression analysis, which yields what we call a Bilinear Gaussian Radial Basis Function Network (BGRBFN), and we compare it to more conventional polynomial and trigonometric function bases. Finally, we examine whether Tikhonov regularization can be used to further increase classification accuracy in the high-frequency case. Our findings suggest that the BGRBFN performs better than the conventional methods in both classification accuracy and functional approximability, and that both high-frequency data and Tikhonov regularization can be used to increase classification accuracy.
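A sketch of the Gaussian RBF basis with Tikhonov (ridge) regularisation: the design-matrix construction is standard, while the simple group-mean estimator below is our simplification of the bilinear Growth Curve estimator, with illustrative data and parameter values.

```python
import numpy as np

def gaussian_rbf_design(t, centers, width):
    """Within-individual design matrix: one Gaussian bump per basis centre,
    replacing the usual polynomial columns of the Growth Curve Model."""
    return np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

def ridge_mean_curve(Y, B, lam=1e-2):
    """Tikhonov-regularised fit of one group's mean curve.
    Y: (time points x individuals); B: (time points x basis functions)."""
    A = B.T @ B + lam * np.eye(B.shape[1])
    return np.linalg.solve(A, B.T @ Y.mean(axis=1))

t = np.linspace(0.0, 1.0, 100)                       # densely sampled time grid
B = gaussian_rbf_design(t, centers=np.linspace(0, 1, 8), width=0.1)
rng = np.random.default_rng(2)
Y = np.sin(2 * np.pi * t)[:, None] + 0.1 * rng.standard_normal((100, 12))
coef = ridge_mean_curve(Y, B)                        # smooth mean-curve weights
```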
