Global ETD Search

241	Popis fotografií pomocí rekurentních neuronových sítí / Image Captioning with Recurrent Neural Networks Kvita, Jakub January 2016 This thesis addresses the automatic generation of image captions using several types of neural networks. The work is based on papers from the MS COCO Captioning Challenge 2015 and on character-level language models popularized by A. Karpathy. The proposed model combines a convolutional and a recurrent neural network in an encoder-decoder architecture. The vector representing the encoded image is passed to the language model as the memory values of the LSTM layers in the network. The thesis examines how well a model with such a simple architecture can caption images and how it compares with other current models. One of its conclusions is that the proposed architecture is not sufficient for any image captioning.
242	Rozpoznávání obrazů konvolučními neuronovými sítěmi - základní koncepty / Image Recognition by Convolutional Neural Networks - Basic Concepts Zapletal, Ondřej January 2017 This thesis studies the basic concepts of Convolutional Neural Networks. The influence of structural elements on the network's ability to train is investigated. The result of the thesis is a comparison of the designed Convolutional Neural Network model with results from the ILSVRC competition.
243	Increasing CNN Representational Power Using Absolute Cosine Value Regularization William Steven Singleton (8740647) 21 April 2020 The Convolutional Neural Network (CNN) is a mathematical model designed to distill input information into a more useful representation. This distillation process removes information over time through a series of dimensionality reductions, which ultimately grant the model the ability to resist noise and generalize effectively. However, CNNs often contain elements that contribute little to useful representations. This thesis aims to remedy this problem by introducing Absolute Cosine Value Regularization (ACVR), a regularization technique hypothesized to increase the representational power of CNNs by using a Gradient Descent Orthogonalization algorithm to force the vectors that constitute the filters at any given convolutional layer to occupy unique positions in R^n. This method should, in theory, lead to a more effective balance between information loss and representational power, ultimately increasing network performance. The thesis proposes and examines the mathematics and intuition behind ACVR, and goes on to propose Dynamic-ACVR (D-ACVR). It also examines the effects of ACVR on the filters of a low-dimensional CNN, as well as the effects of ACVR and D-ACVR on traditional convolutional filters in VGG-19. Finally, it proposes and examines regularization of the pointwise filters in MobileNetV1. Absolute Cosine Value Regularization CIFAR-10 Convolutional Neural Networks Imaging Gradient Descent Orthogonalization MobileNetV1 VGG-19 D-ACVR
244	Hyperparameter optimisation using Q-learning based algorithms / Hyperparameteroptimering med hjälp av Q-learning-baserade algoritmer Karlsson, Daniel January 2020 Machine learning algorithms have many applications, both for academic and industrial purposes. Examples of applications are classification of diffraction patterns in materials science and classification of properties in chemical compounds within the pharmaceutical industry. For these algorithms to be successful they need to be optimised. Part of this is achieved by training the algorithm, but there are components of the algorithms that cannot be trained. These hyperparameters have to be tuned separately. The focus of this work was optimisation of hyperparameters in classification algorithms based on convolutional neural networks. The purpose of this thesis was to investigate the possibility of using reinforcement learning algorithms, primarily Q-learning, as the optimising algorithm. Three different algorithms were investigated: Q-learning, double Q-learning, and a Q-learning-inspired algorithm that was designed during this work. The algorithms were evaluated on different problems and compared to a random search algorithm, which is one of the most common optimisation tools for this type of problem. All three algorithms were capable of some learning; however, the Q-learning-inspired algorithm was the only one to outperform the random search algorithm on the test problems. Further, an iterative scheme of the Q-learning-inspired algorithm was implemented, where the algorithm was allowed to refine the search space available to it. This showed further improvements in the algorithm's performance, and the results indicate that similar performance to the random search may be achieved in a shorter period of time, sometimes reducing the computational time by up to 40%. / Machine learning algorithms have many areas of application, both academic and industrial. Examples include the classification of diffraction patterns in materials science and the classification of properties of chemical compounds in the pharmaceutical industry. For these algorithms to perform well, they need to be optimised. Part of the optimisation takes place when the algorithms are trained, but there are components that cannot be trained. These hyperparameters must be tuned separately. The focus of this work was the optimisation of hyperparameters for classification algorithms based on convolutional neural networks. The purpose of the thesis was to investigate the possibility of using reinforcement learning algorithms, primarily Q-learning, as the optimising algorithm. Three algorithms were investigated: Q-learning, double Q-learning, and an algorithm inspired by Q-learning that was developed during the course of the work. The algorithms were evaluated on different test problems and compared with results obtained by a random search of the hyperparameter space, which is one of the more common methods for optimising this type of algorithm. All three algorithms showed some form of learning, but only the Q-learning-inspired algorithm performed better than the random search. An iterative implementation of the Q-learning-inspired algorithm was also developed. The iterative method allowed the available hyperparameter space to be refined between iterations. This led to further improvements in the results, which indicated that the computation time could in some cases be reduced by up to 40% compared with the random search, with equal or better results. Hyperparameter optimisation Reinforcement learning Convolutional neural networks Hyperparameteroptimering Förstärkningsinlärning Faltande neurala nätverk Engineering and Technology Teknik och teknologier Computer and Information Sciences Data- och informationsvetenskap
245	Cascade Mask R-CNN and Keypoint Detection used in Floorplan Parsing Eklund, Anton January 2020 Parsing floorplans has long been a problem in automatic document analysis and has, until recent years, been approached with algorithmic methods. With the rise of convolutional neural networks (CNN), this problem too has seen an upswing in performance. In this thesis the task is to recover, as accurately as possible, spatial and geometric information from floorplans. The project is built around instance segmentation models like Cascade Mask R-CNN to extract the bulk of information from a floorplan image. To complement the segmentation, a new way of using a keypoint CNN is presented to find precise locations of corners. These are then combined in a post-processing step to give the resulting segmentation. The resulting segmentation scores exceed the current baseline of the CubiCasa5k floorplan dataset with a mean IoU of 72.7% compared to 57.5%. Further, the mean IoU for individual classes is also improved for almost every class. It is also shown that Cascade Mask R-CNN is better suited than Mask R-CNN for this task. mask r-cnn cascade r-cnn cascade mask r-cnn floorplan floorplans computer vision CNN convolutional neural networks Engineering and Technology Teknik och teknologier
246	Automatic segmentation of articular cartilage in arthroscopic images using deep neural networks and multifractal analysis Ångman, Mikael, Viken, Hampus January 2020 Osteoarthritis is a large problem affecting many patients globally, and diagnosis of osteoarthritis is often done using evidence from arthroscopic surgeries. Making a correct diagnosis is hard, and takes years of experience and training on thousands of images. Therefore, developing an automatic solution to perform the diagnosis would be extremely helpful to the medical field. Since machine learning has been proven to be useful and effective at classifying and segmenting medical images, this thesis aimed to solve the problem using machine learning methods. Multifractal analysis has also been used extensively for medical image segmentation. This study proposes two methods of automatic segmentation using neural networks and multifractal analysis. The thesis was performed using real arthroscopic images from surgeries. The MultiResUnet architecture is shown to be well suited for pixel-perfect segmentation. Classification of multifractal features using neural networks is also shown to perform well when compared to related studies. Convolutional neural networks multifractal analysis arthroscopy semantic segmentation wavelet leaders wavelet p-leaders Medical Image Processing Medicinsk bildbehandling Computer Engineering Datorteknik
247	ROOM CATEGORIZATION USING SIMULTANEOUS LOCALIZATION AND MAPPING AND CONVOLUTIONAL NEURAL NETWORK Iman Yazdansepas (9001001) 23 June 2020 The robotics industry is growing faster than in any previous era, driven by the demand for and rise of in-home robots and assisted robots. Such a robot should be able to navigate between different rooms in the house autonomously. For autonomous navigation, the robot needs to build a map of the surrounding unknown environment and localize itself within the map. For home robots, distinguishing between different rooms improves the functionality of the robot. In this research, Simultaneous Localization and Mapping (SLAM) utilizing a LiDAR sensor is used to construct the environment map. LiDAR is more accurate than vision and is not sensitive to light intensity. Gmapping is the SLAM method used to create the map of the environment. It is one of the robust and user-friendly packages in the Robot Operating System (ROS), creating a more accurate map while requiring less computational power. The constructed map is then used for room categorization with a Convolutional Neural Network (CNN), since CNNs are a powerful technique for classifying rooms from the generated 2D map images. To demonstrate the applicability of the approach, simulations and experiments are designed and performed in campus and apartment environments. The results indicate that Gmapping provides an accurate map. For each room used in the experimental design, the CNN is trained on a data set of different apartment maps to classify the room that was mapped using Gmapping. The room categorization results are compared with other approaches in the literature using the same data set to indicate the performance. The classification results show the applicability of using CNN for room categorization for applications such as assisted robots. CNN room categorization Gazebo simulation gmapping
248	Human Age Prediction Based on Real and Simulated RR Intervals using Temporal Convolutional Neural Networks and Gaussian Processes Pfundstein, Maximilian January 2020 Electrocardiography (ECG) is a non-invasive method used in medicine to track the electrical pulses sent by the heart. The time between two subsequent electrical impulses, and hence heartbeats, of a subject is referred to as an RR interval. Previous studies show that RR intervals can be used for identifying sleep patterns and cardiovascular diseases. Additional research indicates that RR intervals can be used to predict the cardiovascular age of a subject. This thesis investigates whether this assumption holds, based on two different datasets as well as simulated data based on Gaussian Processes. The datasets used are Holter recordings provided by the University of Gdańsk and a dataset provided by PhysioNet. The former is a balanced dataset of recordings made during nocturnal sleep of healthy subjects, whereas the latter is an imbalanced dataset of whole-day recordings of subjects who suffered from myocardial infarction. Feature-based models as well as a deep learning architecture called DeepSleep, based on a paper for sleep stage detection, are trained. The results show that predicting a subject's age based only on RR intervals is difficult. For the first dataset, the highest obtained test accuracy is 37.84 per cent, with a baseline of 18.23 per cent. For the second dataset, the highest obtained accuracy is 42.58 per cent, with a baseline of 39.14 per cent. Furthermore, data is simulated by fitting Gaussian Processes to the first dataset and following a Bayesian approach by assuming a distribution for all hyperparameters of the kernel function in use. The distributions for the hyperparameters are continuously updated by fitting a Gaussian Process to slices of around 2.5 minutes. Then, samples from the fitted Gaussian Process are taken as simulated data, handling impurity and padding. The results show that the highest accuracy achieved is 31.12 per cent, with a baseline of 18.23 per cent. In conclusion, cardiovascular age prediction based on RR intervals is a difficult problem, and complex handling of impurity does not necessarily improve the results. Statistics Machine Learning RR RR intervals ECG Gaussian Processes Health Care Convolutional Neural Networks Time Series Data Simulation Human Age Prediction Probability Theory and Statistics Sannolikhetsteori och statistik
249	Multi-Task Convolutional Learning for Flame Characterization Ur Rehman, Obaid January 2020 This thesis explores multi-task learning for combustion flame characterization, i.e., learning different characteristics of the combustion flame. We propose a multi-task convolutional neural network for two tasks, pilot fuel ratio (PFR) and fuel type classification, based on images of stable combustion. We utilize transfer learning and adopt VGG16 to develop a multi-task convolutional neural network that learns these tasks jointly. We also compare the performance of individual CNN models for the two tasks with the multi-task CNN, which learns the two tasks jointly by sharing visual knowledge among them. We demonstrate the effectiveness of our proposed approach on a private company's dataset. To the best of our knowledge, this is the first work on jointly learning different characteristics of the combustion flame. / This work was done with Siemens, and we have applied for a patent, which is still pending. Multi task learning multi task convolutional learning transfer learning VGG16 CNN convolutional neural networks MTL MTL CNN Computer Systems Datorsystem Probability Theory and Statistics Sannolikhetsteori och statistik
250	Using Mask R-CNN for Instance Segmentation of Eyeglass Lenses / Användning av Mask R-CNN för instanssegmentering av glasögonlinser Norrman, Marcus, Shihab, Saad January 2021 This thesis investigates the performance of Mask R-CNN when utilizing transfer learning on a small dataset. The aim was to instance segment eyeglass lenses as accurately as possible from self-portrait images. Five different models were trained, where the key difference was the types of eyeglasses the models were trained on. The eyeglasses were grouped into three types: fully rimmed, semi-rimless, and rimless glasses. 1550 images were used for training, validation, and testing. The models' performance was evaluated using TensorBoard training data and mean Intersection over Union scores (mIoU). No major differences in performance were found in four of the models, which grouped all three types of glasses into one class. Their mIoU scores range from 0.913 to 0.94, whereas the model with one class for each group of glasses performed worse, with a mIoU of 0.85. The thesis revealed that one can achieve great instance segmentation results using a limited dataset when taking advantage of transfer learning. / This thesis investigates the performance of Mask R-CNN when transfer learning is used on a small dataset. The aim of the work was to segment eyeglass lenses as precisely as possible from self-portrait images. Five models were trained, where the most important difference was the types of eyeglasses the models were trained on. The eyeglasses were divided into three types: full-rimmed, semi-rimless, and rimless. In total, 1550 training images were collected, annotated, and used to train the models. The models' performance was evaluated using TensorBoard training data and mean Intersection over Union (IoU). No major differences in performance were found between the models that were trained on only one class of eyeglasses. Their mean IoU ranges from 0.913 to 0.94. The model in which each eyeglass category was represented as its own class performed worse, with a mean IoU of 0.85. The results of the thesis show that good instance segmentation results can be achieved with a limited dataset if transfer learning is used. Machine Learning Computer Vision Instance Segmentation Mask R-CNN CNN Convolutional Neural Networks Transfer Learning Maskininlärning Datorseende Instanssegmentering Mask R-CNN CNN Konvolutionella neurala nätverk Överföringsinlärning Mathematics Matematik
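
Entries 245 and 250 report segmentation quality as mean Intersection over Union (IoU). As a rough illustration of what that metric measures, here is a minimal Python sketch for binary masks. The helper names `iou` and `mean_iou` and the toy masks are hypothetical and are not taken from either thesis. The sketch assumes the mean is taken over mask pairs, whereas the theses may average per class or per image.

```python
def iou(pred, target):
    """Intersection over Union of two equally sized binary masks (nested lists of 0/1)."""
    inter = union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0   # pixel set in both masks
            union += 1 if (p or t) else 0    # pixel set in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over an iterable of (prediction, ground truth) mask pairs."""
    scores = [iou(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(iou(pred, truth))  # 2 shared pixels / 4 pixels in the union = 0.5
```

On this reading, a mean IoU of 0.94 means that, on average, the predicted and annotated regions share 94% of their combined area.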
