Global ETD Search

21	Bayesian-based Multi-Objective Hyperparameter Optimization for Accurate, Fast, and Efficient Neuromorphic System Designs Maryam Parsa (9412388) 16 December 2020 (has links) Neuromorphic systems promise a novel alternative to the standard von Neumann architectures, which are computationally expensive for analyzing big data and inefficient for learning and inference. This novel generation of computing aims at "mimicking" the human brain by deploying neural networks on event-driven hardware architectures. A key bottleneck in designing such brain-inspired architectures is the complexity of co-optimizing the algorithm's speed and accuracy along with the hardware's performance and energy efficiency. This complexity stems from the numerous intrinsic hyperparameters in both software and hardware that must be tuned to reach an optimal design. In this work, we present a versatile hierarchical pseudo agent-based multi-objective hyperparameter optimization approach for automatically tuning the hyperparameters of several training algorithms (such as traditional artificial neural networks (ANNs), and evolutionary-based, binary, back-propagation-based, and conversion-based techniques in spiking neural networks (SNNs)) on digital and mixed-signal neural accelerators. By utilizing the proposed hyperparameter optimization approach we achieve improved performance over the previous state of the art on those training algorithms and close some of the performance gaps that exist between SNNs and standard deep learning architectures. We demonstrate >2% improvement in accuracy and more than 5X reduction in the training/inference time for a back-propagation-based SNN algorithm on the dynamic vision sensor (DVS) gesture dataset.
In the case of ANN-SNN conversion-based techniques, we demonstrate a 30% reduction in time-steps while surpassing the accuracy of state-of-the-art networks on an image classification dataset (CIFAR10) with a simpler and shallower architecture. Further, our analysis shows that in some cases even a seemingly minor change in hyperparameters may change the accuracy of these networks by 5-6X. From the application perspective, we show that the optimum set of hyperparameters can drastically improve performance (from 52% to 71% for a pole-balance control application). In addition, we demonstrate the resilience of different input/output encodings, training algorithms, and underlying accelerator modules in a neuromorphic system to changes in the hyperparameters. Neuromorphic Computing Energy Efficient Machine Learning Hyperparameter Optimization Bayesian Optimization Multi-Objective Optimization
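The multi-objective idea in entry 21 (trading accuracy against speed) can be illustrated with a minimal Pareto-front filter. This is only a sketch under stated assumptions, not the thesis's method: the `(accuracy, inference_time_ms)` pairs are made-up values, and `pareto_front` is a hypothetical helper name.

```python
from typing import List, Tuple

# Hypothetical candidates: (accuracy, inference_time_ms) for each
# hyperparameter configuration. The numbers are invented for illustration.
Candidate = Tuple[float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates.

    A candidate dominates another if it is at least as accurate and at
    least as fast, and strictly better in at least one of the two.
    """
    front = []
    for acc_a, time_a in candidates:
        dominated = any(
            acc_b >= acc_a
            and time_b <= time_a
            and (acc_b > acc_a or time_b < time_a)
            for acc_b, time_b in candidates
        )
        if not dominated:
            front.append((acc_a, time_a))
    return front


configs = [(0.90, 12.0), (0.92, 20.0), (0.88, 15.0), (0.92, 18.0)]
print(pareto_front(configs))
```

A multi-objective optimizer would typically report such a non-dominated set rather than a single best configuration, leaving the final accuracy/speed trade-off to the designer.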
22	Bayesian Topology Optimization for Efficient Design of Origami Folding Structures Shende, Sourabh 15 June 2020 (has links) No description available. Mechanical Engineering Bayesian Optimization Gaussian Process Non-linear Finite Element Methods Topology Optimization Hyperparameter Tuning Automatic Relevance Determination
23	Investigating techniques for improving accuracy and limiting overfitting for YOLO and real-time object detection on iOS Güven, Jakup January 2019 (has links) This paper describes the creation of a real-time object detection system for iOS using YOLO, a state-of-the-art one-stage object detector and convolutional neural network that far surpasses other real-time object detectors in speed and accuracy. A door detector based on YOLO is trained and implemented as part of a system development process. The machine learning process is outlined, and practices for combating overfitting and increasing accuracy and speed are discussed and applied.
A series of experiments is conducted, the results of which suggest that data augmentation, including negative data in a dataset, hyperparameter optimisation and transfer learning are viable techniques for improving the performance of an object detection model. The author increases mAP, a measure of accuracy for object detectors, from 63.76% to 86.73% based on the results of the experiments. The tendency to overfit is also explored, and the results suggest that training beyond 300 epochs is likely to produce an overfitted model. YOLO object detection overfitting dataset composition hyperparameter optimisation transfer learning iOS real-time improving accuracy Engineering and Technology Teknik och teknologier
24	An Artificial Neural Network for Bankruptcy Prediction Magdefrau, Walter D 01 June 2021 (has links) (PDF) Assessing the financial health of organizations remains a topic of great interest to economists, financial institutions, and invested stakeholders. For more than a century, research into financial distress has focused primarily on traditional applications of statistical analysis; however, modern advances in computational efficiency have created a significant opportunity for more sophisticated approaches. This thesis investigates the application of artificial intelligence to company bankruptcy prediction. The proposed neural network model is evaluated using the Polish Companies Bankruptcy dataset and yields a 5-year prediction accuracy of 96.5% and an AUC (area under the receiver operating characteristic curve) of 92.4%. Artificial Neural Network Artificial Intelligence Parameter Optimization Bankruptcy Prediction Financial Insolvency Hyperparameter Tuning
25	Duality, Derivative-Based Training Methods and Hyperparameter Optimization for Support Vector Machines Strasdat, Nico 18 October 2023 (has links) In this thesis we consider the application of Fenchel's duality theory and gradient-based methods for the training and hyperparameter optimization of Support Vector Machines. We show that the dualization of convex training problems is possible theoretically in a rather general formulation. For training problems following a special structure (for instance, standard training problems) we find that the resulting optimality conditions can be interpreted concretely. This approach immediately leads to the well-known notion of support vectors and a formulation of the Representer Theorem. The proposed theory is applied to several examples such that dual formulations of training problems and associated optimality conditions can be derived straightforwardly. Furthermore, we consider different formulations of the primal training problem which are equivalent under certain conditions. We also argue that the relation of the corresponding solutions to the solution of the dual training problem is not always intuitive. Based on the previous findings, we consider the application of customized optimization methods to the primal and dual training problems. A particular realization of Newton's method is derived which could be used to solve the primal training problem accurately. Moreover, we introduce a general convergence framework covering different types of decomposition methods for the solution of the dual training problem. In doing so, we are able to generalize well-known convergence results for the SMO method. Additionally, a discussion of the complexity of the SMO method and a motivation for a shrinking strategy reducing the computational effort is provided. In a last theoretical part, we consider the problem of hyperparameter optimization. 
We argue that this problem can be handled efficiently by means of gradient-based methods if the training problems are formulated appropriately. Finally, we evaluate the theoretical results concerning the training and hyperparameter optimization approaches practically by means of several example training problems. info:eu-repo/classification/ddc/510 ddc:510
26	Object Tracking Achieved by Implementing Predictive Methods with Static Object Detectors Trained on the Single Shot Detector Inception V2 Network / Objektdetektering Uppnådd genom Implementering av Prediktiva Metoder med Statiska Objektdetektorer Tränade på Entagningsdetektor Inception V2 Nätverket Barkman, Richard Dan William January 2019 (has links) In this work, the possibility of realising object tracking by implementing predictive methods with static object detectors is explored. The static object detectors are obtained as models trained with a machine learning algorithm, namely a deep neural network. Specifically, the single shot detector inception v2 network, a modified version of the single shot detector network, is used to train these models. Predictive methods are incorporated in order to improve the models' precision, i.e. their performance with respect to accuracy. Lagrangian mechanics is employed to derive equations of motion for three different scenarios in which the object is to be tracked. These equations of motion are implemented as predictive methods by discretising them and combining them with four different iterative formulae. In ch. 1, the fundamentals of supervised machine learning, neural networks and convolutional neural networks are established, along with the workings of the single shot detector algorithm, approaches to hyperparameter optimisation and other relevant theory. This includes derivations of the relevant equations of motion and the iterative formulae with which they were implemented. In ch. 2, the experimental set-up used during data collection is described, together with the manner in which the acquired data was used to produce training, validation and test datasets. This is followed by a description of how random search was used to train 64 models on 300×300 datasets and 32 models on 512×512 datasets.
Subsequently, these models are evaluated on their performance with respect to camera-to-object distance and object velocity. In ch. 3, the trained models are verified to possess multi-scale detection capabilities, as is characteristic of models trained on the single shot detector network. While this holds irrespective of the resolution of the dataset a model was trained on, performance with respect to varying object velocity is found to be significantly more consistent for the lower-resolution models, as they operate at a higher detection rate. Ch. 3 continues with an evaluation of the implemented predictive methods. This is done by comparing the resulting deviations when the methods are used to predict the missing data points of a collected detection pattern at varying sampling percentages. The best predictive methods are found to be those that use the fewest previous data points. This is because the data on which the evaluations were made contained a considerable amount of noise, which the implemented iterative formulae do not take into account. Moreover, the lower-resolution models were found to benefit more than those trained on the higher-resolution datasets because of the higher detection frequency they can employ. In ch. 4, it is argued that the concept of combining predictive methods with static object detectors in order to obtain an object tracker is promising. Moreover, the models obtained with the single shot detector network are concluded to be good candidates for such applications, owing to their high detection rates and multi-scale detection capability. However, the predictive methods studied in this thesis should either be extended to account for noise or be replaced with a method that can.
A key finding is that the single shot detector inception v2 models trained on a low-resolution dataset outperformed those trained on a high-resolution dataset in certain regards, owing to the higher detection rate possible on lower-resolution frames: namely, in performance with respect to object velocity and in how well the predictive methods performed. Supervised Machine Learning Hyperparameter Optimisation Convolutional Neural Networks Lagrangian Mechanics Predictive Methods
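Entry 26 mentions random search as the way its models' hyperparameters were chosen. A minimal sketch of that general technique follows, assuming a hypothetical `toy_score` function in place of real model training and evaluation; the hyperparameter names and ranges are illustrative assumptions and are not taken from the thesis.

```python
import math
import random


def toy_score(learning_rate: float, dropout: float) -> float:
    # Hypothetical stand-in for "train a detector and measure its score".
    # By construction the best value (0.0) is at lr = 1e-3, dropout = 0.3.
    return -((math.log10(learning_rate) + 3) ** 2) - (dropout - 0.3) ** 2


def random_search(n_trials: int, seed: int = 0):
    """Sample n_trials random configurations and keep the best one."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {
            # Learning rate sampled log-uniformly between 1e-5 and 1e-1.
            "learning_rate": 10 ** rng.uniform(-5, -1),
            # Dropout sampled uniformly between 0.0 and 0.6.
            "dropout": rng.uniform(0.0, 0.6),
        }
        score = toy_score(**config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search(n_trials=64)
print(best_config, best_score)
```

Each trial is independent of the others, which is why random search is easy to run in parallel and a common baseline for other tuning methods.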
27	Towards adaptive learning and inference : applications to hyperparameter tuning and astroparticle physics / Contributions à l'apprentissage et l'inférence adaptatifs : applications à l'ajustement d'hyperparamètres et à la physique des astroparticules Bardenet, Rémi 19 November 2012 (has links) Inference and optimization algorithms usually have hyperparameters that must be tuned in order to achieve efficiency. We consider here different approaches to efficiently automating the hyperparameter tuning step by learning online the structure of the addressed problem. The first half of this thesis is devoted to hyperparameter tuning in machine learning. After presenting and improving the generic sequential model-based optimization (SMBO) framework, we show that SMBO successfully applies to the task of tuning the numerous hyperparameters of deep belief networks. We then propose an algorithm that performs tuning across datasets, mimicking the memory that humans have of past experiments with the same algorithm on different datasets. The second half of this thesis deals with adaptive Markov chain Monte Carlo (MCMC) algorithms, sampling-based algorithms that explore complex probability distributions while self-tuning their internal parameters on the fly. We start by describing the Pierre Auger observatory, a large-scale particle physics experiment dedicated to the observation of atmospheric showers triggered by cosmic rays. The models involved in the analysis of Auger data motivated our study of adaptive MCMC. We derive the first part of the Auger generative model and introduce a procedure to perform inference on the parameters of individual showers that requires only this first part. Our model inherently suffers from label switching, a common difficulty in MCMC inference, which makes marginal inference useless because of redundant modes of the target distribution. After reviewing existing solutions to label switching, we propose AMOR, the first adaptive MCMC algorithm with online relabeling.
We give both an empirical and a theoretical study of AMOR, including consistency results, unveiling interesting links between relabeling algorithms and vector quantization. Ajustement des hyperparamètres Apprentissage artificiel MCMC adaptatif Label switching Physique expérimentale Hyperparameter tuning Machine learning Sequential model-based optimization Adaptive MCMC Label switching Experimental physics
28	Obstacle Avoidance for an Autonomous Robot Car using Deep Learning / En autonom robotbil undviker hinder med hjälp av djupinlärning Norén, Karl January 2019 (has links) The focus of this study was deep learning. A small, autonomous robot car was used for obstacle avoidance experiments. The robot car used a camera to take images of its surroundings, and a convolutional neural network used the images for obstacle detection. The Xception model was trained on the available dataset of 31 022 images. We compared two different implementations for making the robot car avoid obstacles. Mapping image classes directly to steering commands was used as a reference implementation. The main implementation of this study separated obstacle detection and steering logic into different modules. The former reached an obstacle avoidance ratio of 80 %, the latter 88 %. Different hyperparameters were examined during training. We found that the number of frozen layers and the number of epochs were important to optimize. Weights were loaded from ImageNet before training, and the number of frozen layers determined how many layers remained trainable after that. Training all layers (no frozen layers) proved to work best. The number of epochs determined how long a model was trained; we found that it was important to train for between 10 and 25 epochs. The best model used no frozen layers and was trained for 21 epochs. It reached a test accuracy of 85.2 %. deep learning convolutional neural network autonomous robot obstacle avoidance Xception hyperparameter optimization maskininlärning djupinlärning neurala nätverk självkörande robot
29	Hyperparameter optimisation using Q-learning based algorithms / Hyperparameteroptimering med hjälp av Q-learning-baserade algoritmer Karlsson, Daniel January 2020 (has links) Machine learning algorithms have many applications, both academic and industrial. Examples of applications are the classification of diffraction patterns in materials science and the classification of properties of chemical compounds within the pharmaceutical industry. For these algorithms to be successful they need to be optimised. Part of this is achieved by training the algorithm, but there are components of the algorithms that cannot be trained. These hyperparameters have to be tuned separately. The focus of this work was the optimisation of hyperparameters in classification algorithms based on convolutional neural networks. The purpose of this thesis was to investigate the possibility of using reinforcement learning algorithms, primarily Q-learning, as the optimising algorithm. Three different algorithms were investigated: Q-learning, double Q-learning and a Q-learning inspired algorithm, which was designed during this work. The algorithms were evaluated on different problems and compared to a random search algorithm, which is one of the most common optimisation tools for this type of problem. All three algorithms were capable of some learning; however, the Q-learning inspired algorithm was the only one to outperform the random search algorithm on the test problems. Further, an iterative scheme of the Q-learning inspired algorithm was implemented, in which the algorithm was allowed to refine the search space available to it between iterations. This further improved the algorithm's performance, and the results indicate that performance similar to or better than that of random search may be achieved in a shorter period of time, sometimes reducing the computational time by up to 40%.
Hyperparameter optimisation Reinforcement learning Convolutional neural networks Hyperparameteroptimering Förstärkningsinlärning Faltande neurala nätverk Engineering and Technology Teknik och teknologier Computer and Information Sciences Data- och informationsvetenskap
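For readers unfamiliar with the Q-learning named in entry 29, the standard tabular update rule can be sketched as below. How the thesis maps hyperparameter tuning onto states, actions and rewards is not stated in the abstract, so the mapping here (states as learning-rate settings, actions as "decrease"/"increase", reward as a validation score) is purely an illustrative assumption, as are the names `q_update` and `q_table`.

```python
from typing import Dict

QTable = Dict[str, Dict[str, float]]


def q_update(q: QTable, state: str, action: str, reward: float,
             next_state: str, alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one standard tabular Q-learning update and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


# Hypothetical setup: each state is a learning-rate setting, each action
# moves to a neighbouring setting, and the reward is a validation score.
q_table: QTable = {
    "lr=0.1": {"decrease": 0.0, "increase": 0.0},
    "lr=0.01": {"decrease": 1.0, "increase": 0.0},
}
new_value = q_update(q_table, "lr=0.1", "decrease", reward=1.0,
                     next_state="lr=0.01", alpha=0.5, gamma=0.5)
print(new_value)
```

Double Q-learning, also mentioned in the abstract, keeps two such tables and uses one to choose the best next action and the other to evaluate it.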
30	Hyperparameters relationship to the test accuracy of a convolutional neural network Lundh, Felix, Barta, Oscar January 2021 (has links) Machine learning for image classification is a hot topic that is increasing in popularity. The aim of this study is therefore to provide a better understanding of convolutional neural network hyperparameters by comparing the test accuracy of convolutional neural network models with different hyperparameter configurations. The focus of this study is to see whether the hyperparameter values used influence the learning process. To conduct the experiments, convolutional neural network models were developed in Python using the Keras library. The dataset used for this study is CIFAR-10, which includes 60000 colour images in 10 categories ranging from man-made objects to different animal species. Grid search is used for instantiating models with varying learning rate and momentum, and varying width and depth. Learning rate is only tested in combination with momentum, and width is only tested in combination with depth. Activation functions, convolutional layers and batch size are tested individually. Grid search is compared against Bayesian optimization to see which technique finds better learning rate and momentum values. The results illustrate that the impact of different hyperparameters on the overall test accuracy varies. Learning rate and momentum affect the test accuracy greatly, and suboptimal values for them can decrease the test accuracy severely. Activation function, width and depth, convolutional layers and batch size have a lesser impact on test accuracy. Regarding Bayesian optimization compared to grid search, the results show that Bayesian optimization will not necessarily find better hyperparameter values.
Machine learning image classification hyperparameter convolutional neural network grid search Bayesian optimization cifar-10 Information Systems, Social aspects
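The grid search over learning rate and momentum described in entry 30 can be sketched as follows. This is a minimal illustration under assumptions: `toy_validation_loss` is a hypothetical stand-in for training a Keras model and measuring its loss, and the grid values are invented rather than taken from the study.

```python
import itertools


def toy_validation_loss(learning_rate: float, momentum: float) -> float:
    # Hypothetical stand-in for "train a CNN and measure validation loss".
    # By construction the minimum is at learning_rate=0.01, momentum=0.9.
    return (learning_rate - 0.01) ** 2 * 1e4 + (momentum - 0.9) ** 2


learning_rates = [0.001, 0.01, 0.1]
momenta = [0.0, 0.5, 0.9]

# Grid search: evaluate every (learning_rate, momentum) combination
# and keep the pair with the lowest loss.
best = min(
    itertools.product(learning_rates, momenta),
    key=lambda pair: toy_validation_loss(*pair),
)
print(best)
```

The number of evaluations grows as the product of the grid sizes (here 3 × 3 = 9), which is one reason studies like this one test only a few hyperparameters jointly.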

Search results