91

A Model Fusion Based Framework For Imbalanced Classification Problem with Noisy Dataset

January 2014 (has links)
abstract: Data imbalance and data noise often coexist in real-world datasets. Data imbalance degrades a learning classifier by weakening its recognition power on the minority class, while data noise misleads the classifier by providing inaccurate information. Because of these differences, data imbalance and data noise have been treated separately in the data mining field. Yet such an approach ignores their mutual effects and may therefore create new problems. A desirable solution is to tackle the two issues jointly. Noting the complementary nature of generative and discriminative models, this research proposes a unified model-fusion-based framework to handle imbalanced classification with noisy datasets. The Phase I study focuses on the imbalanced classification problem. A generative classifier, the Gaussian Mixture Model (GMM), is studied, which can learn the distribution of the imbalanced data to improve discrimination power on the imbalanced classes. By fusing this knowledge into cost SVM (cSVM), a CSG method is proposed. Experimental results show the effectiveness of CSG in dealing with imbalanced classification problems. The Phase II study expands the scope to include noisy datasets in the imbalanced classification problem. A model-fusion-based framework, K Nearest Gaussian (KNG), is proposed. KNG employs a generative modeling method, GMM, to model the training data as Gaussian mixtures and to form adjustable confidence regions that are less sensitive to data imbalance and noise. Motivated by the K-nearest-neighbor algorithm, the neighboring Gaussians are used to classify test instances. Experimental results show that KNG greatly outperforms traditional classification methods on imbalanced classification problems with noisy datasets. The Phase III study addresses feature selection and parameter tuning for the KNG algorithm.
To further improve the performance of KNG, a Particle Swarm Optimization based method (PSO-KNG) is proposed. PSO-KNG encodes model parameters and data features in the same particle vector and can therefore search for the best feature and parameter combination jointly. The experimental results show that PSO can greatly improve the performance of KNG, with better accuracy and much lower computational cost. / Dissertation/Thesis / Doctoral Dissertation Industrial Engineering 2014
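The core K Nearest Gaussian idea — model each class as Gaussian components, then let a test point be classified by a vote among its nearest components — can be sketched as follows. This is our own minimal illustration, not the thesis's implementation: the crude chunk-based component fitting, the Mahalanobis nearness measure, and the toy data are all simplified stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced 2-D data: a large majority class and a small minority class.
X_maj = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
X_min = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(20, 2))

def fit_components(X, n_comp):
    """Crudely split X into n_comp chunks and fit one Gaussian per chunk
    (a stand-in for a properly EM-fitted mixture)."""
    comps = []
    for chunk in np.array_split(X, n_comp):
        mu = chunk.mean(axis=0)
        cov = np.cov(chunk.T) + 1e-6 * np.eye(2)   # regularize
        comps.append((mu, np.linalg.inv(cov)))
    return comps

# Each component is (mean, inverse covariance, class label).
components = [(mu, icov, 0) for mu, icov in fit_components(X_maj, 4)] \
           + [(mu, icov, 1) for mu, icov in fit_components(X_min, 2)]

def predict_kng(x, components, k=3):
    """Classify x by majority vote among its k nearest Gaussian components,
    with nearness measured by Mahalanobis distance to each component mean."""
    dists = []
    for mu, icov, label in components:
        d = x - mu
        dists.append((float(d @ icov @ d), label))
    dists.sort()
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

pred_minority = predict_kng(np.array([3.0, 3.0]), components)
pred_majority = predict_kng(np.array([0.0, 0.0]), components)
```

Because the minority class contributes its own components, a minority-region test point wins the vote even though the majority class has ten times as many training samples.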
92

Small Blob Detection in Medical Images

January 2015 (has links)
abstract: Recent advances in medical imaging technology have greatly enhanced imaging-based diagnosis, which requires computationally efficient and accurate algorithms to process the images (e.g., to measure the objects) for quantitative assessment. This dissertation focuses on one type of imaging object: small blobs. Examples of small blob objects are cells in histopathology images, small breast lesions in ultrasound images, and glomeruli in kidney MR images. The problem is particularly challenging because small blobs often have an inhomogeneous intensity distribution and an indistinct boundary against the background. This research develops a generalized four-phase system for small blob detection. The system includes (1) raw image transformation, (2) Hessian pre-segmentation, (3) feature extraction, and (4) unsupervised clustering for post-pruning. First, detecting blobs in 2D images is studied, and a Hessian-based Laplacian of Gaussian (HLoG) detector is proposed. With scale-space theory as its foundation, the image is smoothed via LoG; Hessian analysis is then applied to identify the single optimal scale, based on which a pre-segmentation is conducted. Novel regional features are extracted from the pre-segmented blob candidates and fed to Variational Bayesian Gaussian Mixture Models (VBGMM) for post-pruning. Sixteen cell histology images and two hundred cell fluorescence images are tested to demonstrate the performance of HLoG. Next, as an extension, Hessian-based Difference of Gaussians (HDoG) is proposed, which can identify small blobs in 3D images. Specifically, kidney glomeruli segmentation from 3D MRI (6 rats, 3 humans) is investigated. The experimental results show that HDoG has the potential to detect glomeruli automatically, enabling new measurements of renal microstructure and pathology in preclinical and clinical studies.
Recognizing that computation time is a key factor for clinical adoption, the last phase of this research investigates a data reduction technique for the VBGMM step of HDoG to handle large-scale datasets. A new coreset algorithm is developed for variational Bayesian mixture models. On the same MRI dataset, the four-phase system with coreset-VBGMM performs similarly to using the full dataset but runs about 20 times faster. / Dissertation/Thesis / Doctoral Dissertation Industrial Engineering 2015
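The LoG-based pre-segmentation step (phases 1-2 of the four-phase system) can be illustrated with a minimal, self-contained sketch on a synthetic image. The kernel construction, threshold, and flood fill below are our illustrative choices; the thesis's Hessian scale selection and VBGMM post-pruning phases are omitted.

```python
import numpy as np

# Synthetic 2-D image with two bright Gaussian blobs on a dark background.
h = w = 64
y, x = np.mgrid[0:h, 0:w]
img = (np.exp(-((x - 16) ** 2 + (y - 16) ** 2) / (2 * 3.0 ** 2))
       + np.exp(-((x - 44) ** 2 + (y - 40) ** 2) / (2 * 3.0 ** 2)))

def gaussian_smooth(im, sigma):
    """Separable Gaussian smoothing with a truncated kernel."""
    r = int(3 * sigma)
    t = np.arange(-r, r + 1)
    k = np.exp(-t ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, im)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, out)

def log_response(im, sigma):
    """Scale-normalized Laplacian of Gaussian via finite differences."""
    s = gaussian_smooth(im, sigma)
    lap = (np.roll(s, 1, 0) + np.roll(s, -1, 0)
           + np.roll(s, 1, 1) + np.roll(s, -1, 1) - 4 * s)
    return sigma ** 2 * lap

resp = log_response(img, sigma=3.0)

# Bright blobs give strongly negative LoG responses; pre-segment by threshold.
mask = resp < 0.5 * resp.min()

def count_regions(mask):
    """Count 4-connected candidate regions with a simple flood fill."""
    seen = np.zeros_like(mask, dtype=bool)
    n = 0
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if mask[i, j] and not seen[i, j]:
                n += 1
                stack = [(i, j)]
                while stack:
                    a, b = stack.pop()
                    if 0 <= a < mask.shape[0] and 0 <= b < mask.shape[1] \
                            and mask[a, b] and not seen[a, b]:
                        seen[a, b] = True
                        stack += [(a+1, b), (a-1, b), (a, b+1), (a, b-1)]
    return n

n_blobs = count_regions(mask)
```

Both blobs produce separate negative-response basins, so the flood fill recovers two candidate regions, which in the full system would then be pruned by the clustering phase.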
93

Probabilistic incremental learning for image recognition : modelling the density of high-dimensional data

Carvalho, Edigleison Francelino January 2014 (has links)
Nowadays many sensory systems provide data in streams, and these measured observations are frequently high-dimensional, i.e., the number of measured variables is large and the observations arrive in sequence. This is in particular the case for robot vision systems. Unsupervised and supervised learning with such data streams is challenging because the algorithm must be able to learn from each observation and then discard it before considering the next one, whereas many methods require the whole dataset in order to estimate their parameters and are therefore not suitable for online learning. Furthermore, many approaches suffer from the so-called curse of dimensionality (BELLMAN, 1961) and cannot handle high-dimensional input data. To overcome these problems, this work proposes a new probabilistic and incremental neural network model, called the Local Projection Incremental Gaussian Mixture Network (LP-IGMN), which is capable of life-long learning with high-dimensional data: it can learn continuously while preserving the stability of the current model's parameters, and it automatically adjusts its topology taking into account the subspace boundary found by each hidden neuron. The proposed method can find the intrinsic subspace where the data lie, called the principal subspace. Orthogonal to the principal subspace lie the dimensions that are noisy or carry little information, i.e., have small variance, and these are described by a single estimated parameter. LP-IGMN is therefore robust to different data sources and can deal with a large number of noisy and/or irrelevant variables in the measured data.
To evaluate LP-IGMN we conducted several experiments using simulated and real datasets, and we demonstrated several applications of our method in image recognition tasks. The results show that LP-IGMN's performance is competitive with, and usually superior to, other state-of-the-art approaches, and that it can be used successfully in applications that require life-long learning in high-dimensional spaces.
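The one-pass constraint described above — learn from each observation, then discard it — is what distinguishes incremental Gaussian models from batch EM. A single Gaussian unit with streaming mean/covariance updates (Welford-style) illustrates the ingredient; the class name and structure here are our sketch, not the LP-IGMN neuron itself, which additionally maintains a local subspace projection.

```python
import numpy as np

rng = np.random.default_rng(1)

class OnlineGaussian:
    """A single Gaussian whose mean and covariance are updated one sample
    at a time, so each observation can be discarded after it is seen."""
    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.M2 = np.zeros((dim, dim))   # running sum of outer-product deviations

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n              # Welford-style running mean
        self.M2 += np.outer(delta, x - self.mean)

    @property
    def cov(self):
        return self.M2 / max(self.n - 1, 1)      # unbiased sample covariance

g = OnlineGaussian(dim=2)
data = rng.normal(loc=[2.0, -1.0], scale=0.5, size=(5000, 2))
for x in data:          # stream: one observation at a time, then discarded
    g.update(x)
```

The streaming estimates coincide (up to floating-point error) with what batch estimation over the full dataset would give, while needing only O(d²) memory.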
94

HIGMN : an IGMN-based hierarchical architecture and its applications for robotic tasks

Pereira, Renato de Pontes January 2013 (has links)
The recent field of Deep Learning has introduced to Machine Learning new methods based on distributed, abstract representations of the training data throughout hierarchical structures. The hierarchical organization of layers allows these methods to store distributed information about sensory signals and to create concepts at different abstraction levels to represent the input data. This work investigates the impact of a hierarchical structure inspired by ideas from Deep Learning and based on the Incremental Gaussian Mixture Network (IGMN), a probabilistic neural network with online, incremental learning that is especially suitable for robotic tasks. As a result, a hierarchical architecture called the Hierarchical Incremental Gaussian Mixture Network (HIGMN) was developed, which combines two levels of IGMNs. The first-level layers of HIGMN learn concepts from data of different domains, which are then related in the second-level layer. The proposed model was compared with the IGMN on robotic tasks, in particular the task of learning and reproducing a wall-following behavior, based on a Learning from Demonstration (LfD) approach. The experiments showed how HIGMN can perform three different tasks in parallel (concept learning, behavior segmentation, and behavior learning and reproduction) and demonstrated its ability to learn a wall-following behavior and to perform it in unknown environments with new sensory information. HIGMN could reproduce the wall-following behavior after a single, simple, and short demonstration of the behavior. Moreover, it acquired knowledge of different types: information about the environment, the robot's kinematics, and the target behavior.
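The two-level wiring — per-domain first-level layers whose outputs are related by a second level — can be sketched minimally. The prototypes, variance, and domains below are hypothetical stand-ins for trained IGMN layers; only the data flow (domain responsibilities concatenated into a joint second-level input) reflects the architecture described above.

```python
import numpy as np

def soft_assign(x, means, var=0.5):
    """First-level layer: Gaussian responsibilities of x under fixed
    prototype means (a stand-in for a trained IGMN layer)."""
    d2 = ((means - x) ** 2).sum(axis=1)
    p = np.exp(-d2 / (2 * var))
    return p / p.sum()

# Two sensory domains, each with its own first-level prototypes.
means_a = np.array([[0.0, 0.0], [4.0, 4.0]])   # e.g. range-sensor features
means_b = np.array([[-2.0], [2.0]])            # e.g. a motor command

# The second level sees the concatenated first-level responsibilities,
# so it can learn relations *between* domains (e.g. percept -> action).
x_a, x_b = np.array([4.1, 3.9]), np.array([1.8])
joint = np.concatenate([soft_assign(x_a, means_a), soft_assign(x_b, means_b)])
```

Each half of `joint` is a normalized responsibility vector, so the second-level layer receives which concept is active in each domain rather than the raw sensory values.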
96

Classificação de fluxos de dados não estacionários com algoritmos incrementais baseados no modelo de misturas gaussianas / Non-stationary data streams classification with incremental algorithms based on Gaussian mixture models

Luan Soares Oliveira 18 August 2015 (has links)
Learning concepts from data streams differs significantly from traditional batch learning. In batch learning there is an implicit assumption that the concepts to be learned are static and do not evolve significantly over time. In data stream learning, by contrast, the concepts may evolve over time. This evolution is called concept drift, and it makes a fixed training set no longer applicable. The incremental learning paradigm is a promising approach for learning in a data stream setting. However, in the presence of concept drift, outdated concepts can cause misclassifications.
Several incremental Gaussian mixture model methods have been proposed in the literature, but these algorithms lack an explicit policy for discarding outdated concepts. In this work, a new incremental algorithm for data streams with concept drift, based on Gaussian mixture models, is proposed. The proposed method is compared with several algorithms widely used in the literature, and the results show that it is competitive with them in various scenarios, outperforming them in some cases.
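The role of an explicit discard policy can be seen in a toy one-dimensional sketch: component weights decay over time, are refreshed when a component explains a new sample, and components whose weight falls below a floor are removed. All constants and the matching rule are our illustrative choices, not the thesis's algorithm.

```python
import numpy as np

class DriftingMixture:
    """Toy 1-D incremental mixture with an explicit discard policy:
    weights decay exponentially, the winning component is refreshed,
    and components below a weight floor are removed as outdated concepts."""
    def __init__(self, decay=0.98, floor=0.05, radius=1.0):
        self.decay, self.floor, self.radius = decay, floor, radius
        self.means, self.weights = [], []

    def update(self, x):
        self.weights = [w * self.decay for w in self.weights]   # forget
        j = min(range(len(self.means)),
                key=lambda i: abs(x - self.means[i]), default=None)
        if j is not None and abs(x - self.means[j]) < self.radius:
            self.weights[j] += 1.0 - self.decay         # refresh the winner
            self.means[j] += 0.1 * (x - self.means[j])  # drift toward x
        else:
            self.means.append(float(x))                 # spawn a new concept
            self.weights.append(1.0)
        # Explicit policy: discard outdated concepts.
        keep = [i for i, w in enumerate(self.weights) if w >= self.floor]
        self.means = [self.means[i] for i in keep]
        self.weights = [self.weights[i] for i in keep]

m = DriftingMixture()
stream = [0.0] * 200 + [5.0] * 200   # abrupt concept drift at t = 200
for x in stream:
    m.update(x)
```

After the drift, the component for the old concept (mean 0) stops being refreshed; its weight decays below the floor after roughly 150 steps and it is removed, leaving only the new concept.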
97

Statistical Background Models with Shadow Detection for Video Based Tracking

Wood, John January 2007 (has links)
A common problem when using background models to segment moving objects in video sequences is that the shadows cast by objects usually differ significantly from the background and therefore get detected as foreground. This causes several problems when extracting and labeling objects, such as distorted object shapes and several objects merging into one. The purpose of this thesis is to explore various ways to handle this problem. Three methods for statistical background modeling are reviewed. All work on a per-pixel basis: the first is based on approximating the median, the second on Gaussian mixture models, and the third on channel representation. It is concluded that all three detect cast shadows as foreground. A study of existing methods for handling cast shadows was carried out in order to gain knowledge of the subject and gather ideas. A common approach is to transform the RGB color representation into one that separates color into intensity and chromatic components in order to decide whether newly sampled pixel values belong to the background. The color spaces HSV, IHSL, CIELAB, and YCbCr, together with a color model proposed in the literature (Horprasert et al.), are discussed and compared for the purpose of shadow detection. It is concluded that Horprasert's color model is the most suitable for this purpose. The thesis ends with a proposed method that combines background modeling using Gaussian mixture models with shadow detection using Horprasert's color model. It is concluded that, while not perfect, such a combination can be very helpful for segmenting objects and detecting their cast shadows.
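The intensity/chromaticity separation behind the Horprasert-style color model can be sketched per pixel: decompose the deviation from the background color into a brightness distortion (scaling along the background color vector) and a chromaticity distortion (the residual), then call a pixel a shadow when its chromaticity matches but it is darker. The thresholds below are illustrative, not the trained values from the thesis.

```python
import numpy as np

def classify_pixel(I, E, cd_thresh=0.1, alpha_lo=0.4, alpha_hi=0.9):
    """Classify pixel I against background color E by decomposing the
    deviation into brightness and chromaticity distortion (in the spirit
    of Horprasert et al.; thresholds here are illustrative)."""
    I, E = np.asarray(I, float), np.asarray(E, float)
    alpha = I @ E / (E @ E)                 # brightness distortion
    cd = np.linalg.norm(I - alpha * E)      # chromaticity distortion
    if cd > cd_thresh * np.linalg.norm(E):
        return "foreground"                 # color differs from background
    if alpha_lo <= alpha < alpha_hi:
        return "shadow"                     # same chromaticity, but darker
    return "background"

bg = [120.0, 110.0, 100.0]
labels = (classify_pixel([119, 111, 100], bg),   # essentially unchanged
          classify_pixel([60, 55, 50], bg),      # darker, same chromaticity
          classify_pixel([30, 140, 40], bg))     # different color
```

A pixel at exactly half the background brightness keeps the background's chromaticity (`cd = 0`, `alpha = 0.5`) and is labeled a shadow rather than foreground, which is precisely the case that plain background subtraction gets wrong.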
98

Modèle de mélange gaussien à effets superposés pour l’identification de sous-types de schizophrénie

Nefkha-Bahri, Samy 03 1900 (has links)
This work is part of the research effort to identify subtypes of schizophrenia from brain connectivity data obtained by functional magnetic resonance imaging. Clustering techniques, including the Expectation-Maximization (EM) algorithm for estimating the parameters of Gaussian mixture models, have been used on such data in previous research.
That approach, however, captures the effects of normal brain processes that are irrelevant to the identification of disease subtypes. In this work, the data of the control population (individuals without the disease) are modeled by a finite mixture of Gaussian densities, each density representing an assumed subtype of normal brain function. A new model is proposed for the data of the affected population: a mixture of Gaussian densities in which each density has a mean equal to the sum of a normal state and a disease state. It is therefore a Gaussian mixture model in which subtypes of normal brain function and subtypes of disease are superimposed. The normal and disease processes are assumed to be additive, and the goal is to isolate and estimate the disease effects. An EM-type algorithm specifically designed for this model is developed. Brain connectivity data from 242 control individuals and 242 patients diagnosed with schizophrenia are available, and results of applying the algorithm to these data are reported.
99

Mera sličnosti između modela Gausovih smeša zasnovana na transformaciji prostora parametara

Krstanović Lidija 25 September 2017 (has links)
This thesis studies the possibility that the parameters of the Gaussian components of a given Gaussian Mixture Model (GMM) lie approximately on a lower-dimensional surface embedded in the cone of positive definite matrices. For that case, we propose a novel, more efficient similarity measure between GMMs, obtained by an LPP-like projection of the components of a GMM from the high-dimensional original parameter space to a space of much lower dimensionality. Finding the distance between two GMMs in the original space is thus reduced to finding the distance between two sets of lower-dimensional Euclidean vectors, weighted by the corresponding component weights. The proposed measure is suitable for applications that use high-dimensional feature spaces and/or a large overall number of Gaussian components. We confirm our results on synthetic as well as real experimental data.
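The reduction described above — embed each component's parameters as a Euclidean vector, project to a low-dimensional space, then compare the weighted vector sets — can be sketched as follows. The naive mean-plus-covariance embedding, the random projection, and the nearest-counterpart matching are our crude stand-ins for the LPP-like construction in the thesis.

```python
import numpy as np

rng = np.random.default_rng(3)

def embed(mean, cov):
    """Embed one Gaussian component as a Euclidean vector (mean plus the
    upper triangle of the covariance) - a crude stand-in for the
    parameter-space embedding used in the thesis."""
    iu = np.triu_indices(cov.shape[0])
    return np.concatenate([mean, cov[iu]])

# One shared projection to a lower-dimensional space (embedding dim 5 -> 3).
P = rng.normal(size=(3, 5))

def gmm_distance(gmm_a, gmm_b):
    """Weighted distance between the projected component sets: each
    component of gmm_a is matched to its nearest counterpart in gmm_b."""
    va = [(w, P @ embed(m, c)) for w, m, c in gmm_a]
    vb = [(w, P @ embed(m, c)) for w, m, c in gmm_b]
    d = 0.0
    for w, u in va:
        d += w * min(np.linalg.norm(u - v) for _, v in vb)
    return d

I2 = np.eye(2)
gmm1 = [(0.5, np.array([0.0, 0.0]), I2), (0.5, np.array([5.0, 5.0]), I2)]
gmm2 = [(0.5, np.array([0.1, 0.0]), I2), (0.5, np.array([5.0, 4.9]), I2)]
gmm3 = [(1.0, np.array([20.0, -20.0]), I2)]

d_close, d_far = gmm_distance(gmm1, gmm2), gmm_distance(gmm1, gmm3)
```

The cost per comparison depends only on the projected dimensionality and the number of components, not on the original feature dimensionality, which is the point of the approach for high-dimensional applications.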
100

Entwicklung eines Monte-Carlo-Verfahrens zum selbständigen Lernen von Gauß-Mischverteilungen

Lauer, Martin 03 March 2005 (has links)
This thesis develops a novel learning method for Gaussian mixture models. It is based on Markov chain Monte Carlo techniques and is able to determine both the size of the mixture and its parameters in a single pass. The method is characterized by a good fit to the training data as well as good generalization performance. Starting from a description of the stochastic foundations and an analysis of the problems that arise when learning Gaussian mixture models, the thesis develops the new learning method step by step and examines its properties. An experimental comparison with known learning methods for Gaussian mixture models also confirms the suitability of the new method empirically.
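The MCMC ingredient underlying such a method can be illustrated on the simplest possible case: random-walk Metropolis sampling of a single Gaussian's mean with known variance. The thesis's method additionally samples the number of mixture components; that trans-dimensional part is omitted here, and the data, prior, and step size are our illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)

# Data from a Gaussian with unknown mean mu (variance known to be 1);
# weak prior mu ~ N(0, 10^2).
data = rng.normal(loc=2.0, scale=1.0, size=200)

def log_post(mu):
    """Unnormalized log posterior: Gaussian likelihood + Gaussian prior."""
    return -0.5 * np.sum((data - mu) ** 2) - 0.5 * mu ** 2 / 100.0

mu, samples = 0.0, []
for _ in range(5000):
    prop = mu + rng.normal(scale=0.3)          # random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(mu):
        mu = prop                              # Metropolis accept step
    samples.append(mu)

post_mean = float(np.mean(samples[1000:]))     # discard burn-in
```

With this weak prior the posterior mean essentially coincides with the sample mean of the data; in the full mixture setting the same accept/reject machinery is applied jointly to component means, covariances, weights, and the mixture size.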
