Global ETD Search

1	Novel neural architectures & algorithms for efficient inference Kag, Anil 30 August 2023 In the last decade, the machine learning universe embraced deep neural networks (DNNs) wholeheartedly with the advent of neural architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), transformers, etc. These models have empowered many applications, such as ChatGPT, Imagen, etc., and have achieved state-of-the-art (SOTA) performance on many vision, speech, and language modeling tasks. However, SOTA performance comes with various issues, such as large model size, compute-intensive training, increased inference latency, higher working memory, etc. This thesis aims at improving the resource efficiency of neural architectures, i.e., significantly reducing the computational, storage, and energy consumption of a DNN without any significant loss in performance. Towards this goal, we explore novel neural architectures as well as training algorithms that allow low-capacity models to achieve near SOTA performance. We divide this thesis into two dimensions: Efficient Low Complexity Models, and Input Hardness Adaptive Models. Along the first dimension, i.e., Efficient Low Complexity Models, we improve DNN performance by addressing instabilities in the existing architectures and training methods. We propose novel neural architectures inspired by ordinary differential equations (ODEs) to reinforce input signals and attend to salient feature regions. In addition, we show that carefully designed training schemes improve the performance of existing neural networks. We divide this exploration into two parts: (a) Efficient Low Complexity RNNs. We improve RNN resource efficiency by addressing poor gradients, noise amplifications, and BPTT training issues. First, we improve RNNs by solving ODEs that eliminate vanishing and exploding gradients during training. 
To do so, we present Incremental Recurrent Neural Networks (iRNNs) that keep track of increments in the equilibrium surface. Next, we propose Time Adaptive RNNs that mitigate the noise propagation issue in RNNs by modulating the time constants in the ODE-based transition function. We empirically demonstrate the superiority of ODE-based neural architectures over existing RNNs. Finally, we propose the Forward Propagation Through Time (FPTT) algorithm for training RNNs. We show that FPTT yields significant gains compared to the more conventional Backward Propagation Through Time (BPTT) scheme. (b) Efficient Low Complexity CNNs. Next, we improve CNN architectures by reducing their resource usage. CNNs require greater depth to generate high-level features, resulting in computationally expensive models. We design a novel residual block, the Global layer, that constrains the input and output features by approximately solving partial differential equations (PDEs). It yields better receptive fields than traditional convolutional blocks and thus results in shallower networks. Further, we reduce the model footprint by enforcing a novel inductive bias that formulates the output of a residual block as a spatial interpolation between high-compute anchor pixels and low-compute cheaper pixels. This results in spatially interpolated convolutional blocks (SI-CNNs) that have better compute and performance trade-offs. Finally, we propose an algorithm that enforces various distributional constraints during training in order to achieve better generalization. We refer to this scheme as distributionally constrained learning (DCL). In the second dimension, i.e., Input Hardness Adaptive Models, we introduce the notion of the hardness of any input relative to any architecture. In the first dimension, a neural network allocates the same resources, such as compute, storage, and working memory, for all the inputs. 
It inherently assumes that all examples are equally hard for a model. In this dimension, we challenge this assumption, using input hardness to argue that some inputs are easier for a network to predict than others. Input hardness enables us to create selective classifiers wherein a low-capacity network handles simple inputs while abstaining from a prediction on the complex inputs. Next, we create hybrid models that route the hard inputs from the low-capacity abstaining network to a high-capacity expert model. We design various architectures that adhere to this hybrid inference style. Further, input hardness enables us to selectively distill the knowledge of a high-capacity model into a low-capacity model by discarding hard inputs during the distillation procedure. Finally, we conclude this thesis by sketching out future research directions that emerge as extensions of the ideas explored in this work. Electrical engineering; Distributionally constrained learning; Forward propagation through time; Hybrid edge cloud models; Selective classification
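The hybrid inference style described in the abstract above can be illustrated with a minimal sketch. It assumes that "hardness" is approximated by the small model's maximum class probability falling below a fixed threshold; this is a common stand-in, not necessarily the measure used in the thesis. The names `hybrid_predict`, `small_model`, `large_model`, and `threshold` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Probs = Sequence[float]  # class probabilities returned by a model


def argmax(probs: Probs) -> int:
    """Index of the largest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


def hybrid_predict(
    x: object,
    small_model: Callable[[object], Probs],
    large_model: Callable[[object], Probs],
    threshold: float = 0.8,  # assumed confidence cut-off, not from the thesis
) -> Tuple[int, str]:
    """Return (predicted class, which model answered).

    The low-capacity model answers when it is confident enough;
    otherwise it abstains and the input is routed to the expert.
    """
    probs = small_model(x)
    if max(probs) >= threshold:
        return argmax(probs), "small"
    # Abstain: treat the input as hard and defer to the high-capacity model.
    return argmax(large_model(x)), "large"
```

In a setting such as the hybrid edge cloud models listed in the keywords, `small_model` would run on the device and `large_model` would be a remote call, so the threshold trades accuracy against the fraction of inputs that incur the remote cost.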
2	Modelagem de circuitos neurais do sistema neuromotor e proprioceptor de insetos com o uso da transferência de informação entre conexões neurais / Neural circuits modeling of insects neuromotor system based on information transfer approach and neural connectivity Endo, Wagner 31 March 2014 This work presents the development of a bioinspired model based on the neural circuit of insects. The model is obtained through the first-order analysis given by the STA (Spike Triggered Average) and through the transfer of information between neural signals. The techniques applied are based on identifying the time delays of maximum information coherence. For this purpose, two concepts from information theory are used: DMI (Delayed Mutual Information) and TE (Transfer Entropy). Both approaches measure information transfer, each with its own particularities. DMI is computationally simpler than TE because it depends on the statistical analysis of second-order probability density functions, whereas TE depends on third-order functions; this should be taken into account when computational resources are limited. DMI identifies information delays very well. However, it fails to distinguish the direction of information flow in systems with indirect information transfer and overlapping information. In these cases, TE is the more appropriate tool for determining the direction of information flow, owing to the conditional dependence imposed by the common history of the analyzed signals. These issues arise often in neural circuits. In the metathoracic ganglion of insects, the local interneurons have different pathway patterns with overlapping information, because they receive signals from different sensory neurons for the movement of the animal's legs. 
The main objective of this work is to propose a model of the insect neural circuit in order to map how the neural signals behave when a set of random movements is imposed on the insect's leg. The neural responses are reflexes evoked by a tactile stimulus that generates movement at the femoro-tibial joint of the hind leg. In these circuits, the neural signals are processed by spiking and nonspiking local interneurons, which operate in parallel to process the information coming from the sensory neurons. These interneurons receive input signals from mechanoreceptors of the hind leg and of the insect's motor joint. The main feature of local interneurons is their ability to communicate with other neurons with or without nerve impulses (spiking and nonspiking). They thus form a neural circuit with input signals (sensory neurons) and outputs (motor neurons). The proposed algorithms analyze the whole chain, from the random generation of mechanical movements, through the stimuli in the sensory neurons that reach the metathoracic ganglion, to the responses in the motor neurons. The algorithms and their pseudocode are implemented for DMI and TE. The surrogate data technique is used to infer measures of statistical significance for the maximum information coherence between neural signals. The surrogate data results are also used to compensate for bias errors in the information transfer measures. An algorithm based on IAAFT (Iterative Amplitude Adjusted Fourier Transform) generates the surrogate data, with the same power spectrum and different distributions of the original signals. The DMI and TE results obtained on the surrogate data provide the baseline values for the case of minimal information transfer. 
In addition, simulated data are used to discuss the effects of sample size and of the strength of information association. The recorded neural signals are available in a database of several experiments on the locust metathoracic ganglion. However, each experiment has few simultaneously recorded signals; across experiments, the signals are therefore subject to variations in sample size, as well as to noise that interferes with the absolute measures of information transfer. To map the neural connections, a methodology based on normalization and bias-error compensation is used when computing information transfer. The normalizations use the total entropies of the system: for DMI, the geometric mean of the entropies of X and Y; for TE, the CMI (Conditional Mutual Information). After applying these approaches, based on STA and information transfer, the structural model of the neural circuit of the locust neuromotor system is presented. Results from STA and DMI are presented for the sensory neurons, from which some hypotheses are raised about the functioning of this part of the FeCO (Femoral Chordotonal Organ). For each type of neuron, multiple pathways in the neural circuit are identified through the time delays and the values of maximum information coherence. Two pathway patterns are found for the spiking interneurons, whereas three distinct patterns are identified for the nonspiking interneurons. These results are obtained computationally and are consistent with what is expected from the biological models described in Burrows (1996). STA; Bioinspired engineering; Information flow; Information transfer; Motor neuroscience; Neural layers; Neuronal connectivity; Statistical significance level; Transfer entropy
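The delay search with DMI described in the abstract above can be sketched as follows. This is an illustrative plug-in (histogram) estimator for already-discretized signals, normalized by the geometric mean of the entropies of X and Y as the abstract states. It is not the author's implementation, and it omits the surrogate-data bias compensation. The function names `entropy`, `delayed_mutual_information`, and `best_delay` are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence, Tuple


def entropy(symbols: Sequence[Hashable]) -> float:
    """Plug-in Shannon entropy in bits of a sequence of discrete symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def delayed_mutual_information(
    x: Sequence[Hashable], y: Sequence[Hashable], delay: int
) -> float:
    """Normalized I(x[t]; y[t + delay]) for equal-length discrete signals.

    The mutual information is divided by sqrt(H(X) * H(Y)), the geometric
    mean of the two entropies; 0.0 is returned if either entropy is zero.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if not 0 <= delay < len(x):
        raise ValueError("delay must satisfy 0 <= delay < len(x)")
    xs = list(x[: len(x) - delay])  # x at time t
    ys = list(y[delay:])            # y at time t + delay
    h_x, h_y = entropy(xs), entropy(ys)
    h_xy = entropy(list(zip(xs, ys)))  # joint entropy of the aligned pairs
    denom = math.sqrt(h_x * h_y)
    return (h_x + h_y - h_xy) / denom if denom > 0 else 0.0


def best_delay(
    x: Sequence[Hashable], y: Sequence[Hashable], max_delay: int
) -> Tuple[int, float]:
    """Delay in 0..max_delay with the largest normalized DMI, and its value."""
    scores = [delayed_mutual_information(x, y, d) for d in range(max_delay + 1)]
    d = max(range(len(scores)), key=scores.__getitem__)
    return d, scores[d]
```

Because this estimator is symmetric in its two arguments apart from the delay, it reports how strongly and at what lag two signals are coupled, but not which one drives the other; that is the limitation the abstract gives as the reason for using TE.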
