Global ETD Search

1	A new scheme for training ReLU-based multi-layer feedforward neural networks / Ett nytt system för att träna ReLU-baserade och framkopplade neurala nätverk med flera lager Wang, Hao January 2017 A new scheme for training Rectified Linear Unit (ReLU) based feedforward neural networks is examined in this thesis. The project starts with the row-by-row updating strategy designed for Single-hidden Layer Feedforward neural Networks (SLFNs). This strategy exploits the properties of ReLUs and optimizes each row of the input weight matrix individually, under the common optimization scheme. Then the Direct Updating Strategy (DUS), which has two versions, the Vector-Based Method (VBM) and the Matrix-Based Method (MBM), is proposed to optimize the input weight matrix as a whole. Finally, DUS is extended to Multi-hidden Layer Feedforward neural Networks (MLFNs). Since the extension faces an initialization dilemma for general ReLU-based MLFNs, an MLFN with a special structure is presented. Verification experiments are conducted on six benchmark multi-class classification datasets. The results confirm that the MBM algorithm for SLFNs improves the performance of neural networks compared to its competitor, the regularized extreme learning machine. For most of the datasets involved, MLFNs with the proposed special structure perform better when extra hidden layers are added. ReLU feedforward neural network ELM Computer Sciences Datavetenskap (datalogi)
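For orientation, the baseline that the abstract compares against, the regularized extreme learning machine, can be sketched in a few lines. This is a minimal illustration of the textbook formulation (random fixed input weights, ridge-regression output weights), not the thesis's MBM or VBM algorithms, which the abstract does not specify. The function names, the hidden-layer size and the regularization value are assumptions chosen for the example.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def train_relm(X, T, n_hidden=50, reg=1e-2, seed=0):
    """Regularized ELM for an SLFN (illustrative sketch).

    X: (n_samples, n_features) inputs, T: (n_samples, n_classes) one-hot targets.
    The input weights W and biases b are drawn at random and never trained;
    only the output weights beta are fitted, by ridge regression.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = relu(X @ W + b)  # hidden-layer output matrix
    # Solve (H^T H + reg * I) beta = H^T T
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict(X, W, b, beta):
    """Class scores; take argmax along axis 1 for predicted labels."""
    return relu(X @ W + b) @ beta
```

A scheme such as the one described in the abstract would additionally update the input weight matrix `W`, which this baseline leaves at its random initial value.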
2	Understanding a Block of Layers in Deep Neural Networks: Optimization, Probabilistic and Tropical Geometric Perspectives Bibi, Adel 04 1900 This dissertation aims at theoretically studying a block of layers that is common in almost all deep learning models. The block of layers of interest is the composition of an affine layer followed by a nonlinear activation that is followed by another affine layer. We study this block from three perspectives. (i) An Optimization Perspective. Is it possible that the output of the forward pass through this block is an optimal solution to a certain convex optimization problem? We show an equivalency between the forward pass through this block and a single iteration of deterministic and stochastic algorithms solving a tensor formulated convex optimization problem. As a consequence, we derive for the first time a formula for computing the singular values of convolutional layers surpassing the need for the prohibitive construction of the underlying linear operator. Thereafter, we show that several deep networks can have this block replaced with the corresponding optimization algorithm predicted by our theory resulting in networks with improved generalization performance. (ii) A Probabilistic Perspective. Is it possible to analytically analyze the output of a deep network upon subjecting the input to Gaussian noise? In that regard, we derive analytical formulas for the first and second moments of this block under Gaussian input noise. We demonstrate that the derived expressions can be used to efficiently analyze the output of an arbitrary deep network in addition to constructing Gaussian adversarial attacks surpassing any need for prohibitive data augmentation procedures. (iii) A Tropical Geometry Perspective. Is it possible to characterize the decision boundaries of this block as a geometric structure representing a solution set to a certain class of polynomials (tropical polynomials)? 
If so, is it possible to utilize this geometric representation of the decision boundaries for novel reformulations of classical computer vision and machine learning tasks on arbitrary deep networks? We show that the decision boundaries of this block are a subset of a tropical hypersurface, which is intimately related to a polytope that is the convex hull of two zonotopes. We utilize this geometric characterization to shed light on new perspectives on network pruning. Block of Layers FFTLasso Affine-ReLU-Affine Tropical Geometry Network Moments Sparsity and CSC
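To make the affine-ReLU-affine block and the probabilistic perspective concrete, the sketch below evaluates the block and its output mean under Gaussian input noise. It uses only the well-known closed form E[max(z, 0)] = mu * Phi(mu / sigma) + sigma * phi(mu / sigma) for z ~ N(mu, sigma^2) together with linearity of expectation. It is not taken from the dissertation, whose own expressions (including the second moment) are not reproduced in the abstract. The names `A`, `c1`, `B`, `c2`, `m` and `Sigma` are assumptions for the example.

```python
import math
import numpy as np

def block(x, A, c1, B, c2):
    """Affine -> ReLU -> affine forward pass for a single input vector x."""
    return B @ np.maximum(A @ x + c1, 0.0) + c2

def relu_gaussian_mean(mu, sigma):
    """Elementwise E[max(z, 0)] for z ~ N(mu, sigma^2), with sigma > 0."""
    r = mu / sigma
    pdf = np.exp(-0.5 * r ** 2) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(r / math.sqrt(2.0)))
    return mu * cdf + sigma * pdf

def block_mean(m, Sigma, A, c1, B, c2):
    """Exact mean of block(x) when x ~ N(m, Sigma).

    Each pre-activation A x + c1 is marginally Gaussian, so the ReLU mean
    is available per unit; the second affine layer is handled by linearity.
    """
    mu = A @ m + c1
    sigma = np.sqrt(np.diag(A @ Sigma @ A.T))
    return B @ relu_gaussian_mean(mu, sigma) + c2
```

Comparing `block_mean` with an average of `block` over many sampled inputs is a simple sanity check of the closed form.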
3	Quantum ReLU activation for Convolutional Neural Networks to improve diagnosis of Parkinson’s disease and COVID-19 Parisi, Luca, Neagu, Daniel, Ma, R., Campean, Felician 17 September 2021 Yes / This study introduces a quantum-inspired computational paradigm to address the unresolved problem of Convolutional Neural Networks (CNNs) using the Rectified Linear Unit (ReLU) activation function (AF), i.e., the ‘dying ReLU’. This problem impacts the accuracy and the reliability in image classification tasks for critical applications, such as in healthcare. The proposed approach builds on the classical ReLU and Leaky ReLU, applying the quantum principles of entanglement and superposition at a computational level to derive two novel AFs, respectively the ‘Quantum ReLU’ (QReLU) and the ‘modified-QReLU’ (m-QReLU). The proposed AFs were validated when coupled with a CNN using seven image datasets on classification tasks involving the detection of COVID-19 and Parkinson’s Disease (PD). The out-of-sample/test classification accuracy and reliability (precision, recall and F1-score) of the CNN were compared against those of the same classifier when using nine classical AFs, including ReLU-based variations. Findings indicate higher accuracy and reliability for the CNN when using either QReLU or m-QReLU on five of the seven datasets evaluated. Whilst retaining the best classification accuracy and reliability for handwritten digit recognition on the MNIST dataset (ACC = 99%, F1-score = 99%), avoiding the ‘dying ReLU’ problem via the proposed quantum AFs improved recognition of PD-related patterns from spiral drawings, especially with the QReLU, which achieved the highest classification accuracy and reliability (ACC = 92%, F1-score = 93%). Therefore, with this increased accuracy and reliability, QReLU and m-QReLU can aid critical image classification tasks, such as diagnoses of COVID-19 and PD. 
/ The authors declare that this was the result of a HEIF 2020 University of Bradford COVID-19 response-funded project ‘Quantum ReLU-based COVID-19 Detector: A Quantum Activation Function for Deep Learning to Improve Diagnostics and Prognostics of COVID-19 from Non-ionising Medical Imaging’. However, the funding source was not involved in conducting the study and/or preparing the article. Activation functions ReLU Convolutional Neural Network Decision support COVID-19 Parkinson’s disease
4	Attractors of autoencoders : Memorization in neural networks Strandqvist, Jonas January 2020 It is an important question in machine learning to understand how neural networks learn. This thesis sheds further light on this by studying autoencoder neural networks which can memorize data by storing it as attractors. What this means is that an autoencoder can learn a training set and later produce parts or all of this training set even when using other inputs not belonging to this set. We seek to illuminate how ReLU networks handle memorization when trained with different setups: with and without bias, for different widths and depths, and using two different types of training images, from the CIFAR10 dataset and randomly generated. For this, we created controlled experiments in which we train autoencoders and compute the eigenvalues of their Jacobian matrices to discern the number of data points stored as attractors. We also manually verify and analyze these results for patterns and behavior. With this thesis we broaden the understanding of ReLU autoencoders: We find that the structure of the data has an impact on the number of attractors. For instance, we produced autoencoders where every training image became an attractor when we trained with random pictures but not with CIFAR10. Changes to depth and width on these two types of data also show different behaviour. Moreover, we observe that loss has less of an impact than expected on attractors of trained autoencoders. machine learning overfitting memorization neural network autoencoder attractor Jacobian eigenvalue CIFAR10 random data ReLU bias Computer Sciences Datavetenskap (datalogi)
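The Jacobian-eigenvalue test mentioned in the abstract can be illustrated on a one-hidden-layer ReLU autoencoder. The sketch assumes the standard criterion from discrete-time dynamical systems: a point x is an attractor of the map f if f(x) = x and every eigenvalue of the Jacobian of f at x has magnitude below 1. Whether the thesis applies exactly this criterion, and to which architectures, is not stated in the abstract. The network shape and all names here are hypothetical.

```python
import numpy as np

def autoencoder(x, W1, b1, W2, b2):
    """One-hidden-layer ReLU autoencoder f(x) = W2 relu(W1 x + b1) + b2."""
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def jacobian(x, W1, b1, W2, b2):
    """Jacobian of f at x: W2 diag(1[W1 x + b1 > 0]) W1.

    ReLU networks are piecewise linear, so the Jacobian is exact
    inside each linear region.
    """
    active = (W1 @ x + b1 > 0).astype(float)
    return W2 @ (active[:, None] * W1)

def is_attractor(x, W1, b1, W2, b2, tol=1e-8):
    """True if x is a fixed point of f whose Jacobian has spectral radius < 1."""
    fixed = np.allclose(autoencoder(x, W1, b1, W2, b2), x, atol=tol)
    radius = np.max(np.abs(np.linalg.eigvals(jacobian(x, W1, b1, W2, b2))))
    return bool(fixed and radius < 1.0)

def iterate(x, W1, b1, W2, b2, steps=100):
    """Apply f repeatedly; inputs in an attractor's basin converge to it."""
    for _ in range(steps):
        x = autoencoder(x, W1, b1, W2, b2)
    return x
```

Counting the training points for which `is_attractor` holds gives the kind of tally the abstract refers to; in practice a looser `tol` may be needed because trained networks only reproduce training points approximately.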
5	Advanced analytics for process analysis of turbine plant and components Maharajh, Yashveer 26 November 2021 This research investigates the use of an alternate means of modelling the performance of a train of feed water heaters in a steam cycle power plant, using machine learning. The goal of this study was to use a simple artificial neural network (ANN) to predict the behaviour of the plant system, specifically the inlet bled steam (BS) mass flow rate and the outlet water temperature of each feed water heater. The output of the model was validated through the use of a thermofluid engineering model built for the same plant. Another goal was to assess the ability of both the thermofluid model and the ANN model to predict plant behaviour under out-of-normal operating circumstances. The thermofluid engineering model was built on FLOWNEX® SE using existing custom components for the various heat exchangers. The model was then tuned to current plant conditions by catering for plant degradation and maintenance effects. The artificial neural network was of a multi-layer perceptron (MLP) type, using the rectified linear unit (ReLU) activation function, mean squared error (MSE) loss function and adaptive moments (Adam) optimiser. It was constructed using the Python programming language. The ANN model was trained using the same data as the FLOWNEX® SE model. Multiple architectures were tested, resulting in the optimum model having two layers, 200 nodes or neurons in each layer with a batch size of 500, running over 100 epochs. This configuration attained a training accuracy of 0.9975 and validation accuracy of 0.9975. When used on a test set and to predict plant performance, it achieved an MSE of 0.23 and 0.45 respectively. Under normal operating conditions (six cases tested) the ANN model performed better than the FLOWNEX® SE model when compared to actual plant behaviour. 
Under out-of-normal conditions (four cases tested), the FLOWNEX® SE model performed better than the ANN. The poor performance of the ANN model in the out-of-normal scenarios indicates that it was unable to capture the “physics” of a heat exchanger or the feed heating process. Further tuning by way of alternate activation functions and regularisation techniques had little effect on the ANN model performance. The ANN model was able to accurately predict an out-of-normal case only when it was trained to do so. This was achieved by augmenting the original training data with the inputs and results from the FLOWNEX® SE model for the same case. The conclusion drawn from this study is that this type of simple ANN model is able to predict plant performance so long as it is trained for it. The validity of the prediction is highly dependent on the integrity of the training data. Operating outside the range for which the model was trained will result in inaccurate predictions. It is recommended that out-of-normal scenarios commonly experienced by the plant be synthesised by engineering modelling tools like FLOWNEX® SE to augment the historic plant data. This provides a wider spectrum of training data, enabling more generalised and accurate predictions from the ANN model. Thermofluid process modelling FLOWNEX® SE feed water heater machine learning deep learning artificial neural networks multi-layer perceptron ReLU Adam optimisation regularization data augmen
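The model configuration reported in the abstract (two hidden layers of 200 ReLU units, MSE loss, Adam, batch size 500, 100 epochs) can be sketched in plain NumPy as follows. The abstract says only that Python was used, so the library, the learning rate, the weight initialisation and the linear output layer are all assumptions made for this illustration. No plant data is involved.

```python
import numpy as np

def init_params(sizes, rng):
    """Flat list [W0, b0, W1, b1, ...]; He initialisation (an assumption)."""
    params = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        params += [rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n)]
    return params

def forward(params, X):
    """Activations of every layer, input included; the output layer is linear."""
    acts, n_layers = [X], len(params) // 2
    for i in range(n_layers):
        z = acts[-1] @ params[2 * i] + params[2 * i + 1]
        acts.append(z if i == n_layers - 1 else np.maximum(z, 0.0))
    return acts

def mse(params, X, Y):
    return float(np.mean((forward(params, X)[-1] - Y) ** 2))

def gradients(params, X, Y):
    """Backpropagation of the MSE loss through the ReLU layers."""
    acts, n_layers = forward(params, X), len(params) // 2
    grads = [None] * len(params)
    delta = 2.0 * (acts[-1] - Y) / Y.size          # d(MSE)/d(output)
    for i in reversed(range(n_layers)):
        grads[2 * i] = acts[i].T @ delta
        grads[2 * i + 1] = delta.sum(axis=0)
        if i > 0:
            delta = (delta @ params[2 * i].T) * (acts[i] > 0)  # ReLU derivative
    return grads

def train(X, Y, hidden=(200, 200), epochs=100, batch_size=500, lr=1e-3, seed=0):
    """Mini-batch training with Adam using its usual default constants."""
    rng = np.random.default_rng(seed)
    params = init_params([X.shape[1], *hidden, Y.shape[1]], rng)
    m = [np.zeros_like(p) for p in params]
    v = [np.zeros_like(p) for p in params]
    beta1, beta2, eps, t = 0.9, 0.999, 1e-8, 0
    for _ in range(epochs):
        order = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            t += 1
            for k, g in enumerate(gradients(params, X[idx], Y[idx])):
                m[k] = beta1 * m[k] + (1 - beta1) * g
                v[k] = beta2 * v[k] + (1 - beta2) * g ** 2
                m_hat = m[k] / (1 - beta1 ** t)    # bias-corrected moments
                v_hat = v[k] / (1 - beta2 ** t)
                params[k] = params[k] - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params
```

A network like this only interpolates within the range of its training inputs, which is consistent with the abstract's observation that out-of-normal cases had to be added to the training data before they could be predicted.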
