Global ETD Search

251	Deep Active Learning for Image Classification using Different Sampling Strategies Saleh, Shahin January 2021 Convolutional Neural Networks (CNNs) have been shown to deliver strong results in computer vision. However, one fundamental bottleneck of CNNs is that they depend heavily on ground truth, that is, labeled training data. A labeled dataset is a group of samples that have been tagged with one or more labels. In this degree project, we mitigate the data-greedy behavior of CNNs by applying deep active learning with various sampling strategies. The main focus is on random sampling, least confidence sampling, margin sampling, entropy sampling, and K-means sampling. We study random sampling because it serves as a baseline for the other strategies. Least confidence sampling, margin sampling, and entropy sampling are uncertainty-based strategies, so it is interesting to study how they perform in comparison with the geometry-based K-means sampling strategy. These sampling strategies help find the most informative or representative samples among all unlabeled samples, allowing us to label fewer samples. The benchmark datasets MNIST and CIFAR10 are used to verify the performance of the sampling strategies, measured in terms of accuracy and the amount of labeled data needed. We conclude that least confidence sampling and margin sampling reduced the number of labeled samples by 79.25% in comparison with random sampling on the MNIST dataset, and that entropy sampling reduced the number of labeled samples by 67.92% on the CIFAR10 dataset.
Convolutional Neural Network Deep Active Learning Deep Learning Image Classification Sampling Strategies Semi-Supervised Learning. Computer and Information Sciences
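The three uncertainty-based strategies named in entry 251 can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: it assumes the model outputs a list of class probabilities per unlabeled sample, and the function names and the `select_batch` helper are hypothetical.

```python
import math


def least_confidence(probs):
    """One minus the top class probability; higher means more uncertain."""
    return 1.0 - max(probs)


def margin(probs):
    """Gap between the two most likely classes; smaller means more uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def entropy(probs):
    """Shannon entropy in nats; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_batch(pool_probs, k, strategy="entropy"):
    """Return the indices of the k pool samples to send for labeling.

    pool_probs is assumed to be a list of per-sample probability lists.
    """
    if strategy == "least_confidence":
        scores = [least_confidence(p) for p in pool_probs]
    elif strategy == "margin":
        # Negate so that a small margin ranks as a high uncertainty score.
        scores = [-margin(p) for p in pool_probs]
    elif strategy == "entropy":
        scores = [entropy(p) for p in pool_probs]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    ranked = sorted(range(len(pool_probs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy pool of three samples with three classes each (made-up numbers).
pool = [[0.9, 0.05, 0.05], [0.4, 0.35, 0.25], [0.5, 0.5, 0.0]]
print(select_batch(pool, 1, "least_confidence"))
print(select_batch(pool, 1, "margin"))
print(select_batch(pool, 2, "entropy"))
```

On the toy pool the strategies disagree: margin sampling prefers the sample whose top two classes are tied, while least confidence and entropy prefer the sample whose probability is spread over all three classes.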
252	A Framework for the Development and Validation of Phenomenologically Derived Cochlear Implant Stimulation Strategies Andres Felipe Llico Gallardo (11189976) 27 July 2021 Cochlear implants (CI) are sensory neuroprostheses capable of partially restoring hearing by electrically stimulating the auditory nerve to mimic normal hearing conditions. Despite their success and ongoing advances in both hardware and software, CI patients can still struggle to understand speech, most notably in complex auditory settings, a difficulty also referred to as the cocktail party problem. Efforts to develop new CI algorithms to overcome this challenge rely on CI simulators and vocoders for testing with normal hearing (NH) listeners. However, recent studies have suggested that these tools fail to reproduce the stimuli perceived by CI patients. It is therefore critical to develop tools capable of producing better representations of the stimuli as perceived by CI patients. This work proposes a framework that incorporates physiological models of the peripheral auditory nerve. Using these models, the framework generates stimulations that elicit a neural response at the auditory nerve closer to that observed in NH conditions. Stimulations generated by the framework were evaluated with a vowel identification task, performed by a classifier trained using deep learning techniques instead of by a CI patient. These results give insight into how the framework could be applied to the development and validation of CI stimulation strategies. Neuroscience Cochlear Implants auditory nerve modeling convolutional neural network stimulation strategies phenomenological model Hearing Loss Electrical Stimulation word recognition task
253	Deep Learning-Based Vehicle Recognition Schemes for Intelligent Transportation Systems Ma, Xiren 02 June 2021 With security concerns in Intelligent Transportation Systems (ITS) increasingly highlighted, Vision-based Automated Vehicle Recognition (VAVR) has attracted considerable attention recently. A comprehensive VAVR system contains three components: Vehicle Detection (VD), Vehicle Make and Model Recognition (VMMR), and Vehicle Re-identification (VReID). These components perform coarse-to-fine recognition tasks in three steps. The VAVR system can be widely used in suspicious vehicle recognition, urban traffic monitoring, and automated driving systems. Vehicle recognition is complicated by the subtle visual differences between vehicle models. Therefore, how to build a VAVR system that recognizes vehicle information quickly and accurately has gained tremendous attention. In this work, taking advantage of emerging deep learning methods, which have powerful feature extraction and pattern learning abilities, we propose several models for vehicle recognition. First, we propose a novel Recurrent Attention Unit (RAU) to expand the standard Convolutional Neural Network (CNN) architecture for VMMR. RAU learns to recognize the discriminative parts of a vehicle on multiple scales and builds up a connection with the prominent information in a recurrent way. The proposed ResNet101-RAU achieves recognition accuracy of 93.81% on the Stanford Cars dataset and 97.84% on the CompCars dataset. Second, to construct efficient vehicle recognition models, we simplify the structure of RAU and propose a Lightweight Recurrent Attention Unit (LRAU). The proposed LRAU extracts discriminative part features by generating attention masks that locate the keypoints of a vehicle (e.g., logo, headlight).
The attention mask is generated from the feature maps received by the LRAU and the attention state produced by the preceding LRAU. Then, by adding LRAUs to standard CNN architectures, we construct three efficient VMMR models. Our models achieve state-of-the-art results with 93.94% accuracy on the Stanford Cars dataset, 98.31% accuracy on the CompCars dataset, and 99.41% on the NTOU-MMR dataset. In addition, we construct a one-stage Vehicle Detection and Fine-grained Recognition (VDFG) model by combining our LRAU with a general object detection model. Results show the proposed VDFG model can achieve excellent performance with real-time processing speed. Third, to address the VReID task, we design the Compact Attention Unit (CAU). CAU has a compact structure and relies on a single attention map to extract the discriminative local features of a vehicle. We add two CAUs to a truncated ResNet to construct a small but efficient VReID model, ResNetT-CAU. Compared with the original ResNet, the model size of ResNetT-CAU is reduced by 60%. Extensive experiments on the VeRi and VehicleID datasets indicate that the proposed ResNetT-CAU achieves the best re-identification results on both datasets. In summary, the experimental results on the challenging benchmark VMMR and VReID datasets indicate that our models achieve the best VMMR and VReID performance while having a small model size and fast image processing speed. Deep Learning Computer Vision Vehicle Recognition Vehicle Make and Model Recognition Vehicle Detection Vehicle Re-identification Intelligent Transportation Systems Convolutional Neural Network Visual Attention
254	Hyperparameters relationship to the test accuracy of a convolutional neural network Lundh, Felix, Barta, Oscar January 2021 Machine learning for image classification is a hot topic that is increasing in popularity. The aim of this study is therefore to provide a better understanding of convolutional neural network hyperparameters by comparing the test accuracy of convolutional neural network models with different hyperparameter value configurations. The focus of the study is to see whether the hyperparameter values used influence the learning process. For the experiments, convolutional neural network models were developed in Python using the Keras library. The dataset used is CIFAR-10, which includes 60,000 colour images in 10 categories ranging from man-made objects to different animal species. Grid search is used to instantiate models with varying learning rate, momentum, width and depth values. Learning rate is tested only in combination with momentum, and width is tested only in combination with depth. Activation functions, convolutional layers and batch size are tested individually. Grid search is compared against Bayesian optimization to see which technique finds better learning rate and momentum values. Results illustrate that the impact of different hyperparameters on the overall test accuracy varies. Learning rate and momentum affect the test accuracy greatly, and suboptimal values for them can decrease the test accuracy severely. Activation function, width and depth, convolutional layer and batch size have a lesser impact on test accuracy. Regarding Bayesian optimization compared to grid search, results show that Bayesian optimization will not necessarily find better hyperparameter values.
Machine learning image classification hyperparameter convolutional neural network grid search Bayesian optimization cifar-10 Information Systems, Social aspects
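The grid search over learning rate and momentum described in entry 254 can be sketched as follows. This is only an illustration under stated assumptions: `grid_search` is a hypothetical helper, and `toy_accuracy` is a made-up stand-in for training a Keras model and returning its test accuracy, which the thesis does but this sketch does not.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every combination in the grid and return the best one.

    evaluate is any callable that takes the hyperparameters as keyword
    arguments and returns a score where higher is better.
    """
    names = list(grid)
    best_config, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(**config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_accuracy(learning_rate, momentum):
    """Hypothetical score surface peaking at lr=0.01, momentum=0.9.

    In a real experiment this function would train a CNN with these
    values and return its test accuracy.
    """
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(momentum - 0.9)


grid = {
    "learning_rate": [0.1, 0.01, 0.001],
    "momentum": [0.0, 0.5, 0.9],
}
best_config, best_score = grid_search(toy_accuracy, grid)
print(best_config, best_score)
```

Because the two hyperparameters are searched jointly, the cost grows with the product of the list lengths (nine evaluations here), which is one reason the abstract compares grid search with Bayesian optimization.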
255	Comparing machine learning methods for classification and generation of footprints of buildings from aerial imagery Jerkenhag, Joakim January 2019 Up-to-date mapping data is of great importance in social services and disaster relief as well as in city planning. The vast amounts of data and the constant increase in geographical changes lead to a heavy load of continuous manual analysis. This thesis takes the process of updating maps and breaks it down to the problem of discovering buildings, comparing different machine learning methods for automating the detection of buildings. The chosen methods, YOLOv3 and Mask R-CNN, are based on the Region Convolutional Neural Network (R-CNN) due to its capabilities in image analysis in both speed and accuracy. The image data supplied by Lantmäteriet makes up the training and testing data, which is then used by the chosen machine learning methods. The methods are trained under different time limits, the generated models are tested, and the results are analysed. The results lay the groundwork for deciding whether a model is reasonable to use in a fully or partly automated system for updating mapping data from aerial imagery. The tested methods showed volatile results through their first hour of training, YOLOv3 more so than Mask R-CNN. After the first hour and until the eighth hour, YOLOv3 shows a higher level of accuracy than Mask R-CNN. For YOLOv3, it seems that with more training the recall increases while precision decreases. For Mask R-CNN, however, there is some trade-off between recall and precision throughout the eight hours of training. While there is 90% confidence that the accuracy of YOLOv3 decreases for each hour of training after the first, the Mask R-CNN method shows accuracy increasing for every hour of training, but with low confidence, so this cannot be scientifically relied upon.
Due to differences in setup, the image size varies between the methods, but both train and test on the same areas to give a fair comparison. In these tests, YOLOv3 analyses one square kilometre 1.5 times faster than Mask R-CNN. Both methods show potential for automated generation of footprints. However, YOLOv3 generates only bounding boxes, leaving polygonization to manual work, while Mask R-CNN, as the name implies, creates a mask that encapsulates the object. This extra step is expected to further automate the manual process and, with viable results, speed up the updating of map data. Machine Learning Computer Vision Region Convolutional Neural Network Geographical Data Aerial Imagery
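Entry 255 compares detectors by precision and recall. One common way to compute these for bounding boxes is sketched below; it is an assumption for illustration, not the thesis's evaluation code. Matching predictions to ground truth by intersection over union (IoU) with a 0.5 threshold is a convention the abstract does not state, and the function names and box coordinates are hypothetical.

```python
def area(box):
    """Area of a box given as (x_min, y_min, x_max, y_max)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predicted, truth, threshold=0.5):
    """Greedily match each prediction to its best unmatched ground-truth box.

    A prediction counts as a true positive when its best remaining match
    has IoU at or above the threshold; each ground-truth box is used once.
    """
    unmatched = list(truth)
    true_positives = 0
    for box in predicted:
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= threshold:
            true_positives += 1
            unmatched.remove(best)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Two buildings on the ground, three detections (one is a false alarm).
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predicted = [(0, 0, 10, 9), (50, 50, 60, 60), (21, 21, 30, 30)]
precision, recall = precision_recall(predicted, truth)
print(precision, recall)
```

In this toy case both buildings are found, so recall is 1.0, but the extra detection lowers precision to two thirds. This is the pattern the abstract reports for YOLOv3 with longer training: recall rising while precision falls.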
256	A Self-policing Smart Parking Solution Dalkic, Yurdaer, Deknache, Hadi January 2019 With the exponential growth of vehicles on our streets, finding an unoccupied parking spot is often problematic today and will be even more so in the future. Smart parking solutions have proved to be a helpful approach to locating unoccupied parking spots. In many smart parking solutions, sensors are used to determine whether a parking spot is vacant. Sensors can determine the status of parking spots with high accuracy. However, they are not ideal from a scalability point of view, since installing and maintaining each sensor is not considered cost-effective. In recent years, vision-based solutions have received more consideration for smart parking, since cameras can easily be installed and cover a large parking area. Furthermore, cameras can be used to provide a more advanced solution for checking in at a parking spot and for reporting whether a vehicle is parked unlawfully. In our thesis, we developed a dynamic vision-based smart parking prototype with the aim of detecting vacant parking spots and illegally parked vehicles. IoT Smart Parking Smart City Deep Learning Convolutional Neural Network Machine Learning Vision-based solution Sensors Illegally parked vehicles Vacant parking detection Engineering and Technology
257	Estimation de profondeur à partir d'images monoculaires par apprentissage profond / Depth estimation from monocular images by deep learning Moukari, Michel 01 July 2019 Computer vision is a branch of artificial intelligence whose purpose is to enable a machine to analyze, process and understand the content of digital images. Scene understanding in particular is a major issue in computer vision. It requires both a semantic and a structural characterization of the image, on the one hand to describe its content and, on the other, to understand its geometry. However, while real space is three-dimensional, the image representing it is two-dimensional. Part of the 3D information is thus lost during image formation, which makes it all the more difficult to describe the geometry of a scene from 2D images of it. There are several ways to retrieve the depth information lost during image formation. In this thesis we are interested in estimating a depth map given a single image of the scene. In this case, the depth information corresponds, for each pixel, to the distance between the camera and the object represented at that pixel. The automatic estimation of a distance map of the scene from an image is a critical algorithmic building block in a very large number of domains, in particular that of autonomous vehicles (obstacle detection, navigation aids). Although estimating depth from a single image is a difficult and inherently ill-posed problem, we know that humans can judge distances with one eye. This capacity is not innate but acquired, and it is made possible largely by the identification of cues reflecting prior knowledge of the surrounding objects. Moreover, we know that learning algorithms can extract these cues directly from images.
We are particularly interested in statistical learning methods based on deep neural networks, which have recently led to major breakthroughs in many fields, and we study the case of monocular depth estimation. Statistical Learning Deep Learning Convolutional Neural Network Computer Vision Depth Estimation Monocular 3D Uncertainty Assessment Depth Completion
258	Klasifikace objektů zpracováním obrazu na základě změny topologie / Object clasification based on its topology change using image processing Zbavitel, Tomáš January 2021 The aim of this work is to select a suitable object classification method for recognizing characters of the one-handed finger alphabet. For this purpose, a sufficiently robust dataset was created and is included in this work; the dataset is needed to train the convolutional neural network. Furthermore, a suitable topology for data classification was found. The whole work is implemented in Python using the open-source library Keras.
259	Detecting Faulty Piles of Wood using Anomaly Detection Techniques Olsson, Jonathan January 2021 The forestry and sawmill industries handle many incoming and outgoing piles of wood, and it is important to maintain quality and efficiency. This motivates an examination of whether machine learning, or more specifically anomaly detection techniques, can be implemented and used to detect faulty shipments. This thesis presents and evaluates some computer vision techniques and some deep learning techniques. Deep learning can be divided into three groups: supervised, semi-supervised and unsupervised. All three groups were examined in this thesis: supervised methods such as Convolutional Neural Networks, semi-supervised methods such as a modified Convolutional Autoencoder (CAE), and an unsupervised technique, the Generative Adversarial Network (GAN), were tested and evaluated. A version of a GAN model performed best, detecting faulty shipments with an accuracy of 68.2%, and 79.8% overall, which was satisfactory given the problems discovered during the course of the thesis. Machine learning Deep learning Anomaly Detection Convolutional Neural Network Autoencoder Generative Adversarial Network Computer vision Electrical Engineering and Electronics
260	Towards Condition-Based Maintenance of Catenary wires using computer vision : Deep Learning applications on eMaintenance & Industrial AI for railway industry Moussallik, Laila January 2021 Railways are a main element of sustainable transport policy in several countries, as they are considered a safe, efficient and green mode of transportation. Owing to these advantages, there is growing demand for the railway industry to increase performance, capacity and availability while safely transporting goods and people at higher speeds. To meet this demand, large adjustments to the infrastructure and improvements to the maintenance process are required. Inspection activities are essential in establishing the required maintenance, and they are needed periodically to reduce unexpected failures and prevent dangerous consequences. Maintenance of railway catenary systems is a critical task for ensuring the safety of electrical railway operation. Usually, catenary inspection is performed manually by trained personnel. However, like all human-based inspections, it is slow and lacks objectivity, which brings a number of crucial disadvantages and can lead to dangerous consequences. With the rapid progress of artificial intelligence, it is appropriate for computer vision detection approaches to replace the traditional manual methods during inspections. In this thesis, a strategy for monitoring the health of catenary wires is developed, which includes the various steps needed to detect anomalies in this component. Moreover, a solution for detecting different types of wires in the railway catenary system was implemented, in which a deep learning framework is developed by combining a Convolutional Neural Network (CNN) and a Region Proposal Network (RPN).
eMaintenance computer vision Condition-Based Maintenance Industrial AI railway catenary system automatic visual detection health monitoring Deep learning Machine learning Convolutional Neural Network object detection. Civil Engineering

Search results