Global ETD Search

1011	Optimization of convolutional neural networks for image classification using genetic algorithms and bayesian optimization Rawat, Waseem 01 1900 (has links) Notwithstanding the recent successes of deep convolutional neural networks in classification tasks, they are sensitive to the selection of their hyperparameters, which impose an exponentially large search space on modern convolutional models. Traditional hyperparameter selection methods include manual, grid, or random search, but these require expert knowledge or are computationally burdensome. In contrast, Bayesian optimization and evolutionary-inspired techniques have emerged as viable alternatives for the hyperparameter problem. Thus, a hybrid approach that combines the advantages of these techniques is proposed. Specifically, the search space is partitioned into a discrete architectural subspace and a continuous and categorical hyperparameter subspace, which are traversed, respectively, by a stochastic genetic search and then by a genetic-Bayesian search. Simulations on a prominent image classification task reveal that the proposed method results in an overall classification accuracy improvement of 0.87% over unoptimized baselines, and a greater than 97% reduction in computational costs compared to a commonly employed brute force approach. / Electrical and Mining Engineering / M. Tech. (Electrical Engineering) Deep learning Artificial neural networks Convolutional neural networks Evolutionary algorithms Genetic algorithms Bayesian optimization Computer vision Image classification Model selection Hyperparameter optimization 006.32 Neural networks (Computer science) Genetic algorithms Image processing -- Digital techniques
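As a rough illustration of the first stage described in the abstract above (a stochastic genetic search over a discrete architectural subspace), the following minimal Python sketch may help. It is not the thesis's method: the search space, the `toy_fitness` function, and all parameter values are hypothetical stand-ins, and the genetic-Bayesian second stage is not shown. In a real setting the fitness would be the validation accuracy of a trained network.

```python
import random

# Hypothetical discrete architectural search space (illustrative values only).
SPACE = {
    "num_layers": [2, 3, 4, 5],
    "filters": [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
}


def toy_fitness(cfg):
    """Stand-in for validation accuracy; a real run would train a CNN here."""
    return (
        -abs(cfg["num_layers"] - 4)
        - abs(cfg["filters"] - 64) / 32
        - abs(cfg["kernel_size"] - 3) / 2
    )


def random_config(rng):
    """Sample one configuration uniformly from the discrete space."""
    return {name: rng.choice(values) for name, values in SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each gene is taken from one of the two parents."""
    return {name: rng.choice([a[name], b[name]]) for name in SPACE}


def mutate(cfg, rng, rate=0.2):
    """Resample each gene from the space with probability `rate`."""
    return {
        name: (rng.choice(SPACE[name]) if rng.random() < rate else value)
        for name, value in cfg.items()
    }


def genetic_search(fitness, generations=20, pop_size=12, elite=4, seed=0):
    """Elitist genetic search; returns the best configuration found."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]  # the fittest configurations survive
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    return max(population, key=fitness)


best = genetic_search(toy_fitness)
print(best, toy_fitness(best))
```

Because the fittest configurations are carried over unchanged, the best fitness in the population never decreases from one generation to the next; the fixed seed only makes the sketch reproducible.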
1012	Rozpoznávání pojmenovaných entit pomocí neuronových sítí / Neural Network Based Named Entity Recognition Straková, Jana January 2017 (has links) Title: Neural Network Based Named Entity Recognition Author: Jana Straková Institute: Institute of Formal and Applied Linguistics Supervisor of the doctoral thesis: prof. RNDr. Jan Hajič, Dr., Institute of Formal and Applied Linguistics Abstract: Czech named entity recognition (the task of automatically identifying and classifying proper names in text, such as names of people, locations and organizations) has become a well-established field since the publication of the Czech Named Entity Corpus (CNEC). This doctoral thesis presents the author's research on named entity recognition, mainly in the Czech language. It presents the work carried out during the publication of CNEC and its evaluation. It further covers the author's research results, which improved the Czech state of the art in named entity recognition in recent years, with a special focus on solutions based on artificial neural networks. Starting with a simple feed-forward neural network with a softmax output layer and a standard set of classification features for the task, the thesis presents the methodology and results that were later used in NameTag, an open-source software solution for named entity recognition. The thesis concludes with a recurrent neural network based recognizer with word embeddings and character-level word embeddings,...
1013	Um modelo para redes neuronais biologicamente inspirado baseado em minimização de divergência local. / A biologically inspired neural network model based on minimizing local divergence. SANTANA, Ewaldo Eder Carvalho. 14 August 2018 (has links) Previous issue date: 2009-11-06 / This work proposes an unsupervised neural network model, intended to be biologically plausible, for modeling the topographic organization of the primary visual cortex (V1). The behavior of the receptive fields of V1 is also studied. The network is modeled using the concepts of local divergence and of interaction between neighboring neurons, together with the nonlinear characteristics of neurons. A fixed-point algorithm was developed to train the network. Neuroscience. Computer Science. Electrical Engineering. Neural network modeling Neural networks Computational neuroscience Primary visual cortex Local divergence - networks Artificial neural networks Neural coding Visual system Topography of the visual system
1014	[en] A SYSTEM FOR STOCK MARKET FORECASTING AND SIMULATION / [pt] UM SISTEMA PARA PREDIÇÃO E SIMULAÇÃO DO MERCADO DE CAPITAIS PAULO DE TARSO GOMIDE CASTRO SILVA 02 February 2017 (has links) [en] The interest of both investors and researchers in forecasting stock market behavior has grown in recent years. Despite the large number of publications on this problem, accurately predicting future stock trends and developing trading strategies capable of turning good predictions into profits remain great challenges. This is partly due to the nonlinearity and noise inherent in stock market data, and partly because benchmarking systems for properly assessing forecasting quality are not readily available. Here, we perform time series forecasting to guide the investor in both buy-and-sell operations and Pairs Trading. Furthermore, we explore two forecasting periodicities. The first is an interday forecast, which considers only daily data and aims to predict values for the current day. The second is an intraday forecast, which aims to predict values for each trading hour of the current day and also uses the intraday data already known at prediction time. In both forecasting schemes, we test three regression tools as predictors: Partial Least Squares Regression, Support Vector Regression and Artificial Neural Networks. We also propose a trading system as a better way to assess forecasting quality. In the experiments, we examine assets of the most traded companies on BM&FBOVESPA, the official Brazilian stock exchange and the world's third largest.
The results for the three predictors are presented and compared to four benchmarks, as well as to the optimal solution. The picture differs markedly depending on whether forecasting quality is judged by the forecasting error metrics or by the trading system metrics. Considering only the mean absolute percentage error, the proposed predictors do not show a significant improvement. When the trading system metrics are considered, however, they show far superior results. In some cases the annual return on investment exceeds 300 percent. [en] MACHINE LEARNING [en] ARTIFICIAL NEURAL NETWORKS [en] STOCK MARKETS [en] TIME SERIES FORECASTING [en] SUPPORT VECTOR REGRESSION [en] TRADING SYSTEM
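For reference, the mean absolute percentage error mentioned in the abstract above is conventionally defined as follows. The symbols are assumptions introduced here rather than notation taken from the thesis: $y_t$ is the observed value at time $t$, $\hat{y}_t$ the corresponding forecast, and $n$ the number of forecasts.

```latex
\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
```

The measure is undefined when any $y_t$ is zero, and it summarizes only the size of the forecast error, not the profitability of trades based on the forecast, which is consistent with the abstract's observation that error metrics and trading metrics can rank predictors differently.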
1015	A connectionist approach for incremental function approximation and on-line tasks / Uma abordagem conexionista para a aproximação incremental de funções e tarefas de tempo real Heinen, Milton Roberto January 2011 (has links) This work proposes IGMN (Incremental Gaussian Mixture Network), a new connectionist approach for incremental function approximation and real-time tasks. It is inspired by recent theories about the brain, especially the Memory-Prediction Framework (MPF) and Constructivist Artificial Intelligence, which endow it with features that are not present in most ANN models such as MLP, RBF and GRNN. Moreover, IGMN is based on sound statistical principles (Gaussian mixture models) and asymptotically converges to the optimal regression surface as more training data arrive.
The main advantages of IGMN over other ANN models are: (i) IGMN learns incrementally using a single scan over the training data (each training pattern can be used immediately and then discarded); (ii) it can produce reasonable estimates from few training data; (iii) learning can proceed perpetually as new training data arrive (there are no separate phases for learning and recalling); (iv) IGMN handles the stability-plasticity dilemma and does not suffer from catastrophic interference; (v) the neural network topology is defined automatically and incrementally (new units are added whenever necessary); (vi) IGMN is not sensitive to initialization conditions (in fact, IGMN involves no random initialization or decision); (vii) the same neural network can be used to solve both forward and inverse problems (the information flow is bidirectional), even in regions where the target data are multi-valued; and (viii) IGMN can provide the confidence levels of its estimates. Another relevant contribution of this thesis is the use of IGMN in important machine learning and robotics tasks such as model identification, incremental concept formation, reinforcement learning, robotic mapping and time series prediction. In fact, the efficiency of IGMN and its representational power expand the set of tasks to which neural networks can be applied, thus opening new research directions in which important contributions can be made. Several experiments using the proposed model demonstrate that IGMN is also robust to overfitting, does not require fine-tuning of its configuration parameters and has good computational performance, which allows its use in real-time control applications. Therefore, IGMN is a very useful machine learning tool for incremental function approximation and on-line prediction.
Artificial intelligence Bayesian networks Machine learning Cluster Robotics Artificial neural networks Incremental learning Bayesian methods Gaussian mixture models Function approximation Regression Clustering Reinforcement learning Autonomous mobile robots
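The single-scan property claimed for IGMN (each training pattern is used once and then discarded) can be illustrated with a much simpler case: estimating the mean and variance of one Gaussian from a data stream. The Python sketch below uses Welford's online algorithm. It is only an illustration of single-pass learning under that simplification, not the IGMN update equations, and the class name `RunningGaussian` is a hypothetical choice made here.

```python
class RunningGaussian:
    """Single-pass estimate of the mean and variance of a 1-D data stream."""

    def __init__(self):
        self.n = 0         # number of samples seen so far
        self.mean = 0.0    # running mean
        self._m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one sample into the estimate; the sample is not stored."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Population variance of the samples seen so far."""
        return self._m2 / self.n if self.n > 0 else 0.0


g = RunningGaussian()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    g.update(x)
print(g.mean, g.variance)
```

Memory use is constant regardless of how many samples arrive, which is the property that makes this style of update suitable for on-line tasks.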
1016	O fenômeno social do movimento de pedestres em centros urbanos / The social phenomenon of pedestrian movement in urban centers Zampieri, Fabio Lúcio Lopes January 2012 (has links) Understanding the causes that generate pedestrian movement is a key issue in urban planning, because it makes it possible to check whether the decisions taken in designing and maintaining open spaces are in fact strengthening social dynamics. Nevertheless, determining how pedestrian movement works is difficult, owing to the complexity of the social structures that give rise to the phenomenon and the large number of variables underlying it. One way around this problem is to build models that directly relate sidewalk attributes, through their physical variables, to measured levels of use. Among the methods considered for building the model, Space Syntax and sidewalk level of service proved the most suitable for capturing the factors present in urban space. Space Syntax helps in understanding the phenomenon because it allows the spatial relations of the city to be analyzed, revealing pedestrian movement tendencies related directly to grid deformation patterns and spatial configuration. Its scope is limited, however: it predicts only part of the movement (movement potential rather than measured movement), and the correlations between measured pedestrian movement and urban morphology are robust only when the intelligibility measure (the correlation between integration and connectivity) is high. Likewise, the physical quality of a sidewalk directly affects how people move around, choose their routes and use public space, fulfilling or undermining its movement potential. Pedestrian flow is a complex phenomenon and, as such, cannot be described by linear relations or explained by cause-and-effect relations between variables, whether spatial or not.
To capture the interdependence between variables, we adopt the connectionist approach of artificial neural networks, a form of parallel processing that learns from examples, generalizing and abstracting information from the variables and their relationships. This research thus presents an original combination of tools and methods for approaching social dynamics starting from empirical data on pedestrian movement. Pedestrian traffic Movement Spatial distribution Artificial neural networks Cities: Santa Maria (RS) Cities: Florianópolis (SC) Cities: Criciúma (SC) Urban mobility Urban modeling Urban configuration Urban planning Pedestrian movement Space syntax
1017	Statistical modelling by neural networks Fletcher, Lizelle 30 June 2002 (has links) In this thesis the two disciplines of Statistics and Artificial Neural Networks are combined into an integrated study of a data set from a weather modification experiment. An extensive literature study on artificial neural network methodology has revealed the strongly interdisciplinary nature of the research and the applications in this field. As artificial neural networks become increasingly popular with data analysts, statisticians are becoming more involved in the field. A recursive algorithm is developed to optimize the number of hidden nodes in a feedforward artificial neural network, demonstrating how existing statistical techniques such as nonlinear regression and the likelihood-ratio test can be applied in innovative ways to develop and refine neural network methodology. This pruning algorithm is an original contribution to the field of artificial neural network methodology that simplifies the process of architecture selection, thereby reducing the number of training sessions needed to find a model that fits the data adequately. In addition, a statistical model to classify weather modification data is developed using both a feedforward multilayer perceptron artificial neural network and a discriminant analysis. The two models are compared, and the effectiveness of applying an artificial neural network model to a relatively small data set is assessed. The formulation of the problem, the approach followed to solve it and the novel modelling application all combine to make an original contribution to the interdisciplinary fields of Statistics and Artificial Neural Networks as well as to the discipline of meteorology. / Mathematical Sciences / D. Phil. (Statistics) Statistical modelling Artificial neural networks Nonlinear regression Multilayer perceptron Backpropagation Hidden nodes Pruning algorithm Classification Discriminant analysis Analysis of variance Weather modification 006.320727 Weather control -- Statistical methods
1018	Connectionist modelling in cognitive science: an exposition and appraisal Janeke, Hendrik Christiaan 28 February 2003 (has links) This thesis explores the use of artificial neural networks for modelling cognitive processes. It presents an exposition of the neural network paradigm, and evaluates its viability in relation to the classical, symbolic approach in cognitive science. Classical researchers have approached the description of cognition by concentrating mainly on an abstract, algorithmic level of description in which the information processing properties of cognitive processes are emphasised. The approach is founded on seminal ideas about computation, and about algorithmic description emanating, amongst others, from the work of Alan Turing in mathematical logic. In contrast to the classical conception of cognition, neural network approaches are based on a form of neurocomputation in which the parallel distributed processing mechanisms of the brain are highlighted. Although neural networks are generally accepted to be more neurally plausible than their classical counterparts, some classical researchers have argued that these networks are best viewed as implementation models, and that they are therefore not of much relevance to cognitive researchers because information processing models of cognition can be developed independently of considerations about implementation in physical systems. In the thesis I argue that the descriptions of cognitive phenomena deriving from neural network modelling cannot simply be reduced to classical, symbolic theories. The distributed representational mechanisms underlying some neural network models have interesting properties such as similarity-based representation, content-based retrieval, and coarse coding which do not have straightforward equivalents in classical systems. 
Moreover, by placing emphasis on how cognitive processes are carried out by brain-like mechanisms, neural network research has not only yielded a new metaphor for conceptualising cognition, but also a new methodology for studying cognitive phenomena. Neural network simulations can be lesioned to study the effect of such damage on the behaviour of the system, and these systems can be used to study the adaptive mechanisms underlying learning processes. For these reasons, neural network modelling is best viewed as a significant theoretical orientation in the cognitive sciences, instead of just an implementational endeavour. / Psychology / D. Litt. et Phil. (Psychology) Artificial neural networks Attractor memory models Classical cognitive approach Coarse coding Cognition and computation Cognitive science Connectionist modelling Distributed representations Functionalism Mental representation Physical symbol system Turing machine 612.82 Connectionism Cognitive science Neural networks (Neurobiology)
1019	Detecção e classificação de VTCDs em sistemas de distribuição de energia elétrica usando redes neurais artificiais. / Detection and classification of short duration voltage variations in power distribution systems using artificial neural networks. Richard Henrique Ribeiro Antunes 28 March 2012 (has links) Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro / The objective of this work is to better understand the disturbances in electricity supply that arise when short duration voltage variations (SDVVs; VTCDs in Portuguese) occur. The database required for fault diagnosis was obtained by simulating a radial feeder model in the PSCAD/EMTDC software. The work uses a Phase-Locked Loop (PLL) to detect SDVVs and to automatically estimate the frequency, phase angle and amplitude of the grid voltages and currents. Two artificial neural networks were developed: one to identify and another to locate the SDVVs occurring in the power distribution system. The proposed technique applies to three-phase feeders with unbalanced loads, which may have three-phase, two-phase and single-phase lateral branches. It assumes that voltage and current measurements are available at the initial node of the feeder and at a few sparse points along the distribution feeder. The neural network architectures performed satisfactorily and demonstrate the feasibility of ANNs for obtaining the generalizations that enable the system to classify short circuits.
Artificial neural networks Short duration voltage variations SDVV VTCD Phase-locked loop PLL Distribution electric power system PSCAD/EMTDC ENGINEERING
