Global ETD Search

1	Enhanced Neural Network Training Using Selective Backpropagation and Forward Propagation
Bendelac, Shiri 22 June 2018
Neural networks are making headlines every day as the tool of the future, powering artificial intelligence programs and supporting technologies never seen before. However, training neural networks can take days or even weeks for bigger networks, and requires supercomputers and GPUs in academia and industry in order to achieve state-of-the-art results. This thesis discusses employing selective measures to determine when to backpropagate and forward propagate in order to reduce training time while maintaining classification performance. It tests these new algorithms on the MNIST and CASIA datasets and achieves successful results with both algorithms on both datasets. The selective backpropagation algorithm shows a reduction of up to 93.3% in backpropagations completed, and the selective forward propagation algorithm shows a reduction of up to 72.90% in forward propagations and backpropagations completed, compared to baseline runs that always forward propagate and backpropagate. This work also discusses employing the selective backpropagation algorithm on a modified dataset in which some classes are disproportionately under-represented compared to others. / Master of Science / Neural networks are some of the most commonly used and best performing tools in machine learning. However, training them to perform well is a tedious task that can take days or even weeks, since bigger networks perform better but take exponentially longer to train. What can be done to reduce training time? Imagine a student studying for a test. The student likely solves practice problems that cover the different topics that may appear on the test. The student then evaluates which topics he/she knew well, and forgoes extensive practice and review on those in favor of focusing on topics he/she missed or was less confident about. This thesis discusses following a similar approach in training neural networks in order to reduce the training time needed to achieve desired performance levels.
Machine learning; neural networks; convolutional neural networks; backpropagation; forward propagation; training
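The idea of skipping backpropagation for examples the network already handles well can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the abstract does not say which selection rule the thesis uses, so the confidence threshold, the one-dimensional logistic model, the toy data, and the name `train_selective` are all hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_selective(data, epochs=20, lr=0.5, confidence=0.9, seed=0):
    """Train a 1-D logistic classifier, skipping updates on confident hits.

    ASSUMPTION: the skip rule (probability of the true class >= `confidence`)
    is chosen for illustration; it is not taken from the thesis.
    Returns (weight, bias, updates_done, updates_skipped).
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    done = skipped = 0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)
        for x, y in samples:
            p = sigmoid(w * x + b)            # forward pass always runs in this sketch
            p_true = p if y == 1 else 1.0 - p
            if p_true >= confidence:          # already learned well: no backward pass
                skipped += 1
                continue
            grad = p - y                      # d(cross-entropy)/d(logit)
            w -= lr * grad * x
            b -= lr * grad
            done += 1
    return w, b, done, skipped

# Toy, linearly separable data: negative x is class 0, positive x is class 1.
toy = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (1.0, 1), (1.5, 1), (2.0, 1)]
w, b, done, skipped = train_selective(toy)
print(f"updates done: {done}, skipped: {skipped}")
```

The `skipped` counter plays the role of the "backpropagations avoided" figure reported in the abstract, although the numbers from this toy problem say nothing about the thesis results.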
2	Analysing the behaviour of neural networks
Breutel, Stephan Werner January 2004
A new method is developed to determine a set of informative and refined interface assertions satisfied by functions that are represented by feed-forward neural networks. Neural networks have often been criticized for their low degree of comprehensibility. It is difficult to have confidence in software components if they have no clear and valid interface description. Precise and understandable interface assertions for a neural network based software component are required for safety-critical applications and for integration into larger software systems. The interface assertions we are considering are of the form "if the input x of the neural network is in a region α of the input space, then the output f(x) of the neural network will be in the region β of the output space" and vice versa. We are interested in computing refined interface assertions, which can be viewed as the computation of the strongest pre- and postconditions a feed-forward neural network fulfills. Unions of polyhedra (polyhedra are the generalization of convex polygons in higher dimensional spaces) are well suited for describing arbitrary regions of higher dimensional vector spaces. Additionally, polyhedra are closed under affine transformations. Given a feed-forward neural network, our method produces an annotated neural network, where each layer is annotated with a set of valid linear inequality predicates. The main challenges in computing these assertions are solving a non-linear optimization problem and projecting a polyhedron onto a lower-dimensional subspace.
artificial neural network; annotated artificial neural network; rule-extraction; validation of neural network; polyhedra; forward-propagation; backward-propagation; refinement process; non-linear optimization; polyhedral computation; polyhedral projection techniques
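A much simplified instance of an assertion of the form "input in region α implies output in region β" can be computed for a single affine layer when α is an axis-aligned box. This is a sketch under assumptions, not the method of the thesis, which works with general unions of polyhedra and non-linear activations; the function name `affine_box_image` and the example weights are hypothetical.

```python
def affine_box_image(weights, bias, lower, upper):
    """Bound W x + b over the box lower <= x <= upper, one output row at a time.

    ASSUMPTION: the input region is an axis-aligned box and the layer has no
    activation function. The exact image is itself a polyhedron; the returned
    box is its tightest axis-aligned enclosure.
    """
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A non-negative weight is minimised at l, a negative one at u.
            lo += w * l if w >= 0 else w * u
            hi += w * u if w >= 0 else w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi

# Hypothetical 2-input, 2-output layer evaluated on the unit square.
W = [[1.0, -2.0], [0.5, 0.5]]
b = [0.0, 1.0]
lo, hi = affine_box_image(W, b, lower=[0.0, 0.0], upper=[1.0, 1.0])
print(lo, hi)  # [-2.0, 1.0] [1.0, 2.0]
```

Read as an interface assertion: if x lies in the unit square, then the first output lies in [-2, 1] and the second in [1, 2].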
3	Novel neural architectures & algorithms for efficient inference
Kag, Anil 30 August 2023
In the last decade, the machine learning universe embraced deep neural networks (DNNs) wholeheartedly with the advent of neural architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), transformers, etc. These models have empowered many applications, such as ChatGPT, Imagen, etc., and have achieved state-of-the-art (SOTA) performance on many vision, speech, and language modeling tasks. However, SOTA performance comes with various issues, such as large model size, compute-intensive training, increased inference latency, higher working memory, etc. This thesis aims at improving the resource efficiency of neural architectures, i.e., significantly reducing the computational, storage, and energy consumption of a DNN without any significant loss in performance. Towards this goal, we explore novel neural architectures as well as training algorithms that allow low-capacity models to achieve near SOTA performance. We divide this thesis into two dimensions: Efficient Low Complexity Models, and Input Hardness Adaptive Models.
Along the first dimension, i.e., Efficient Low Complexity Models, we improve DNN performance by addressing instabilities in the existing architectures and training methods. We propose novel neural architectures inspired by ordinary differential equations (ODEs) to reinforce input signals and attend to salient feature regions. In addition, we show that carefully designed training schemes improve the performance of existing neural networks. We divide this exploration into two parts:
(a) Efficient Low Complexity RNNs. We improve RNN resource efficiency by addressing poor gradients, noise amplifications, and BPTT training issues. First, we improve RNNs by solving ODEs that eliminate vanishing and exploding gradients during the training. To do so, we present Incremental Recurrent Neural Networks (iRNNs) that keep track of increments in the equilibrium surface. Next, we propose Time Adaptive RNNs that mitigate the noise propagation issue in RNNs by modulating the time constants in the ODE-based transition function. We empirically demonstrate the superiority of ODE-based neural architectures over existing RNNs. Finally, we propose the Forward Propagation Through Time (FPTT) algorithm for training RNNs. We show that FPTT yields significant gains compared to the more conventional Backward Propagation Through Time (BPTT) scheme.
(b) Efficient Low Complexity CNNs. Next, we improve CNN architectures by reducing their resource usage. They require greater depth to generate high-level features, resulting in computationally expensive models. We design a novel residual block, the Global layer, that constrains the input and output features by approximately solving partial differential equations (PDEs). It yields better receptive fields than traditional convolutional blocks and thus results in shallower networks. Further, we reduce the model footprint by enforcing a novel inductive bias that formulates the output of a residual block as a spatial interpolation between high-compute anchor pixels and low-compute cheaper pixels. This results in spatially interpolated convolutional blocks (SI-CNNs) that have better compute and performance trade-offs. Finally, we propose an algorithm that enforces various distributional constraints during training in order to achieve better generalization. We refer to this scheme as distributionally constrained learning (DCL).
In the second dimension, i.e., Input Hardness Adaptive Models, we introduce the notion of the hardness of any input relative to any architecture. In the first dimension, a neural network allocates the same resources, such as compute, storage, and working memory, for all the inputs. It inherently assumes that all examples are equally hard for a model. In this dimension, we challenge this assumption using input hardness as our reasoning that some inputs are relatively easy for a network to predict compared to others. Input hardness enables us to create selective classifiers wherein a low-capacity network handles simple inputs while abstaining from a prediction on the complex inputs. Next, we create hybrid models that route the hard inputs from the low-capacity abstaining network to a high-capacity expert model. We design various architectures that adhere to this hybrid inference style. Further, input hardness enables us to selectively distill the knowledge of a high-capacity model into a low-capacity model by cleverly discarding hard inputs during the distillation procedure. Finally, we conclude this thesis by sketching out various interesting future research directions that emerge as an extension of different ideas explored in this work.
Electrical engineering; Distributionally constrained learning; Forward propagation through time; Hybrid edge cloud models; Selective classification
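The hybrid inference style described in this abstract, where a low-capacity model answers easy inputs and abstains on hard ones that are then routed to an expert, can be sketched as follows. Everything specific here is an assumption made for illustration: the abstract does not state how abstention is decided, so the top-probability threshold, the function `hybrid_predict`, and the two stand-in models are hypothetical.

```python
def argmax(values):
    """Index of the largest entry in a non-empty list."""
    return max(range(len(values)), key=values.__getitem__)

def hybrid_predict(x, small_model, large_model, threshold=0.8):
    """Answer with the small model when it is confident, otherwise defer.

    Both models map an input to a list of class probabilities.
    ASSUMPTION: abstaining when the top probability is below `threshold`
    is an illustrative rule, not the one used in the thesis.
    Returns (label, name_of_model_that_answered).
    """
    probs = small_model(x)
    label = argmax(probs)
    if probs[label] >= threshold:
        return label, "small"
    return argmax(large_model(x)), "large"

# Hypothetical stand-ins for a trained low-capacity and high-capacity network.
def small_model(x):
    return [0.95, 0.05] if x < 0 else [0.55, 0.45]

def large_model(x):
    return [0.10, 0.90]

easy = hybrid_predict(-1.0, small_model, large_model)  # (0, "small")
hard = hybrid_predict(2.0, small_model, large_model)   # (1, "large")
print(easy, hard)
```

In an edge/cloud deployment the expensive call would only be made on the deferred inputs, which is where the resource saving in this style of design would come from.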
4	Etude de la signature EM bistatique d'une surface maritime hétérogène avec prise en compte des phénomènes hydrodynamiques / Study of EM bistatic signature of a heterogeneous sea surface with consideration of hydrodynamic phenomena
Ben Khadra, Slahedine 07 December 2012
The work in this thesis falls within the general framework of maritime observation and surveillance. To improve the automatic recognition and identification of targets embedded in a disturbed environment, we opted to fuse different knowledge and information about a scene observed remotely by microwave sensors. Indeed, several physical phenomena coexist and disturb the propagation of electromagnetic waves above a surface, in particular above a heterogeneous sea surface (refraction due to index gradients, the roughness of the sea surface, nonlinear hydrodynamic effects such as breaking waves, the presence of objects, pollutants, ship wakes, coastal areas, ...). In this context, the work presented in this thesis focuses on the electromagnetic signature (scattering coefficients) of a heterogeneous sea surface, taking into account hydrodynamic phenomena (linear: capillary and gravity waves; nonlinear: breaking waves). The electromagnetic signature is estimated in bistatic configuration (monostatic and forward propagation) and in X-band. A complete study of this problem is difficult. Wave breaking is an energy-dissipating process that corresponds to the last stage in the life of a wave and therefore most often occurs close to shore. This nonlinear phenomenon produces a sea peak, a rapid increase in the scattering coefficients that can exceed 10 dB within a 100 ms period. This peak can produce spurious echoes (clutter) that may be mistaken for virtual targets and thus disrupt the radar detection system (false alarms). Therefore, to improve the detection process and reduce the false alarm rate, it is important to distinguish between targets and sea peaks generated by breaking waves. This is one of the motivations for studying the electromagnetic signature of breaking waves in different observation configurations, so that sea peaks can be readily detected and identified.
To address this problem, we proposed a methodology based on a hybrid electromagnetic model that combines, on the one hand, asymptotic methods (SPMI is used in this work) to simulate the radar response of linear waves (capillary and gravity waves described by the Elfouhaily sea spectrum) and, on the other hand, exact methods (the method of moments, MoM; the FB "Forward-Backward" method is used in this work) to calculate the electromagnetic response of nonlinear waves (the profiles considered are produced by the LONGTANK code). To complement the theoretical study and simulations, we carried out an evaluation and validation phase by measuring the radar signature of breaking wave profiles in the ENSTA Bretagne anechoic chamber.
Sea surface; Hydrodynamic effects; Breaking waves; Asymptotic methods; Exact methods; Electromagnetic scattering coefficients; X-band; Measurements in an anechoic chamber; 537
5	Využití umělé inteligence k monitorování stavu obráběcího stroje / Using artificial intelligence to monitor the state of the machine
Kubisz, Jan January 2020
This diploma thesis focuses on designing the internal structure of a neural network, with the goal of creating an artificial neural network capable of monitoring the state of a machine and predicting its remaining useful life. The main goal is to create algorithms and a library for designing and training artificial neural networks, and to gain a deeper understanding of the issues involved than would be obtained by using existing libraries. The selected method was a forward-propagation network with a multi-layer perceptron architecture and backpropagation learning. The resulting network was able to determine the state of a part from vibration measurements and, on that basis, predict its remaining useful life.
