Global ETD Search

391	Cross-layer optimization for joint visual-inertial localization and object detection on resource-constrained devices
Baldassari, Elisa, January 2021
The expectations for running high-performance cyber-physical applications on resource-constrained devices are continuously increasing. The available hardware is still a main limitation in this context, in terms of both computation capability and energy limits. On the other hand, one must ensure the robust and accurate execution of the deployed applications, since their failure may entail risks for humans and the surrounding environment. The limits and risks are amplified when multiple applications are executed on the same device. This thesis aims to provide a trade-off between the required performance and power consumption for two fundamental applications in the mobile autonomous vehicles scenario: localization and object detection. The multi-objective optimization is performed in a cross-layer manner, exploring both application and platform configurable parameters with Design Space Exploration (DSE). The metrics considered are localization and detection accuracy, detection latency and power consumption. Predictive models are designed to estimate the metrics of interest and ensure robust execution, excluding potentially faulty configurations from the design space. The research is approached empirically, performing tests on the Nvidia Jetson AGX and NX platforms. Results show that optimal configurations for a single application are in general sub-optimal or faulty for the concurrent execution case, while the opposite is sometimes applicable. / Resource-constrained devices are expected to run increasingly demanding cyber-physical programs. The hardware is one of the main limitations, in terms of both computation speed and energy limits. At the same time, the programs that are run must be robust and accurate, since a failure can affect people and their surroundings. When several programs run on the same device, both the limitations and the risks grow. This thesis focuses on striking a balance between performance requirements and energy consumption for two applications in the field of autonomous vehicles: localization and object recognition. Using Design Space Exploration (DSE), parameters in both the applications and the platform are explored by performing cross-layer optimization with multiple objectives. Localization and detection accuracy, recognition latency and energy consumption are the properties in focus. Predictive models are designed to estimate the metrics of interest and guarantee robust execution by excluding potentially faulty configurations. Empirical research is carried out with tests on the Nvidia Jetson AGX and NX platforms. The results show that the optimal configurations for a single program are generally suboptimal or faulty when several programs run simultaneously, while the opposite is sometimes applicable.
Cross-layer optimization; Resource-constrained edge devices; Concurrent applications; Object detection; Localization; Computer and Information Sciences
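The trade-off in entry 391 amounts to keeping the non-faulty configurations that no other configuration beats on every metric at once. The Python sketch below illustrates that idea only; the `Config` fields, the sample values and the `faulty` flag (standing in for the thesis's predictive models) are assumptions for illustration, not the author's method or data.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    """Hypothetical design-space point; all metrics are 'lower is better'."""
    name: str
    error: float        # e.g. localization or detection error (assumed metric)
    latency_ms: float   # detection latency
    power_w: float      # power consumption
    faulty: bool        # assumed output of a predictive model


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on every metric and strictly better on one."""
    am = (a.error, a.latency_ms, a.power_w)
    bm = (b.error, b.latency_ms, b.power_w)
    no_worse = all(x <= y for x, y in zip(am, bm))
    better = any(x < y for x, y in zip(am, bm))
    return no_worse and better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Drop configurations flagged as faulty, then keep the non-dominated ones."""
    valid = [c for c in configs if not c.faulty]
    return [c for c in valid if not any(dominates(o, c) for o in valid)]


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    space = [
        Config("A", 0.10, 50.0, 20.0, False),
        Config("B", 0.20, 40.0, 15.0, False),
        Config("C", 0.30, 60.0, 25.0, False),
        Config("D", 0.05, 30.0, 10.0, True),
    ]
    print([c.name for c in pareto_front(space)])
```

Filtering faulty points before the dominance check matters: a configuration that looks best on paper but is predicted to fail should not push valid configurations off the front.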
392	Hybrid pool based deep active learning for object detection using intermediate network embeddings
Marbinah, Johan, January 2021
With the advancements in deep learning, object detection networks have become more robust. Nevertheless, a challenge with training deep networks is finding enough labelled training data for the model to perform well, due to constraints associated with acquiring relevant data. For this reason, active learning is used to minimize the cost by sampling the unlabeled samples that increase the performance the most. In the field of object detection, little work has explored effective hybrid active learning strategies that exploit the intermediate feature embeddings in neural networks. In this work, hybrid active learning methods are proposed and tested, using various uncertainty sampling techniques and the well-respected core-set method as the representativeness-based strategy. In addition, experiments are conducted with network embeddings to find a suitable strategy to model the representation of all available samples. Experiments show mixed outcomes as to whether hybrid methods perform better than the core-set method used separately. / With the advances in deep learning, neural networks for object detection have become more robust. One challenge in training deep neural networks is finding a sufficient amount of training data for a network to perform well, given the constraints associated with acquiring relevant data. For this reason, active learning is used to minimize the cost of acquiring new data points by continuously selecting the unlabeled images that increase performance the most. In object detection, little work has explored effective hybrid strategies that exploit the intermediate layers of a neural network. In this work, hybrid methods are proposed and tested in the context of active learning, using various techniques for selecting data based on uncertainty estimates as well as on representation (the core-set method). In addition, experiments are carried out with intermediate network embeddings to find a suitable strategy for modeling the representation of all available images in the dataset. The experiments show mixed results as to whether the hybrid methods perform better than separate active learning strategies in which the core-set method is not used.
Active learning; deep learning; object detection; intermediate feature embeddings; YOLOv5; Computer and Information Sciences
393	Dataset Evaluation Method for Vehicle Detection Using TensorFlow Object Detection API / Utvärderingsmetod för dataset inom fordonsigenkänning med användning av TensorFlow Object Detection API
Furundzic, Bojan; Mathisson, Fabian, January 2021
Recent developments in the field of object detection have highlighted a significant variation in quality between visual datasets. As a result, there is a need for a standardized approach to validating visual dataset features and their performance contribution. With a focus on vehicle detection, this thesis aims to develop an evaluation method for comparing visual datasets. This method was used to determine the dataset that contributed to the detection model with the greatest ability to detect vehicles. The visual datasets compared in this research were BDD100K, KITTI and Udacity, each used to train a separate model. Applying the developed evaluation method gave a strong indication of BDD100K's performance superiority. Further analysis and feature extraction of dataset size, label distribution and average labels per image was conducted. In addition, a real-world experiment was conducted in order to validate the developed evaluation method. It could be determined that all features and experimental results pointed to BDD100K's superiority over the other datasets, validating the developed evaluation method. Furthermore, the TensorFlow Object Detection API's ability to improve the performance gain from a visual dataset was studied. Through the use of augmentations, it was concluded that the TensorFlow Object Detection API serves as a great tool to increase performance gain for visual datasets. / Recent developments in object detection have demonstrated large variation in quality between visual datasets. As a result, there is a need for standardized validation methods for comparing visual datasets and their performance. With a focus on vehicle detection, this thesis aims to develop a reliable validation method that can be used to compare visual datasets. This validation method was then used to determine the dataset that contributed to the system with the best ability to detect vehicles. The datasets used in this study were BDD100K, KITTI and Udacity, each trained on an individual detection model. Applying the validation method established that BDD100K was the dataset that contributed to the system with the best-performing detection ability. An analysis of dataset size, label distribution and the average number of labels per image was also carried out. Together with an experiment conducted to test the models in real-world conditions, it could be determined that the validation method agreed with the established results. Finally, the ability of the TensorFlow Object Detection API to improve the performance obtained from a visual dataset was studied. Using a modified dataset, it could be established that the TensorFlow Object Detection API is a suitable modification tool that can be used to increase the performance of a visual dataset.
Deep Learning; Vehicle Detection; Machine Learning; Dataset Evaluation Method; Artificial Intelligence; TensorFlow; Object Detection; SSD; Faster R-CNN; CNN; Neural Networks; Engineering and Technology
394	Training of Object Detection Spiking Neural Networks for Event-Based Vision
Johansson, Olof, January 2021
Event-based vision offers higher dynamic range and time resolution, and lower latency, than conventional frame-based vision sensors. These attributes are useful in varying light conditions and fast motion. However, there are no neural network models and training protocols optimized for object detection with event data, and conventional artificial neural networks for frame-based data are not directly suitable for that task. Spiking neural networks are natural candidates, but further work is required to develop an efficient object detection architecture and end-to-end training protocol. For example, object detection in varying light conditions is identified as a challenging problem for the automation of construction equipment such as earth-moving machines, aiming to increase the safety of operators and make repetitive processes less tedious. This work focuses on the development and evaluation of a neural network for object detection with data from an event-based sensor. Furthermore, the strengths and weaknesses of an event-based vision solution are discussed in relation to the known challenges described in earlier work on automation of earth-moving machines. A solution for object detection with event data is implemented as a modified YOLOv3 network with spiking convolutional layers trained with a backpropagation algorithm adapted for spiking neural networks. The performance is evaluated on the N-Caltech101 dataset with classes for airplanes and motorbikes, resulting in a mAP of 95.8% for the combined network and 98.8% for the original YOLOv3 network with the same architecture. The solution is investigated as a proof of concept, and suggestions for further work are described based on a recurrent spiking neural network.
Event-Based Vision; YOLO; SNN; Spiking Neural Network; Neuromorphic; Computer Vision; Object Detection; Earth-Moving Machines
395	Advanced Data Augmentation : With Generative Adversarial Networks and Computer-Aided Design
Thaung, Ludwig, January 2020
CNN-based (Convolutional Neural Network) visual object detectors often reach a human level of accuracy but need to be trained with large amounts of manually annotated data. Collecting and annotating this data can be time-consuming and financially expensive. Using generative models to augment the data can help minimize the amount of data required and increase detection performance. Many state-of-the-art generative models are Generative Adversarial Networks (GANs). This thesis investigates if and how one can utilize image data to generate new data through GANs to train a YOLO-based (You Only Look Once) object detector, and how CAD (Computer-Aided Design) models can aid in this process. In the experiments, different GAN models are trained and evaluated by visual inspection or with the Fréchet Inception Distance (FID) metric. The data provided by Ericsson Research consists of images of antenna and baseband equipment along with annotations and segmentations. Ericsson Research supplied the YOLO detector, and no modifications are made to this detector. Finally, the YOLO detector is trained on data generated by the chosen model and evaluated by the Average Precision (AP). The results show that the generative models designed in this work can produce RGB images of high quality. However, the quality decreases if binary segmentation masks are to be generated as well. The experiments with CAD input data did not result in images that could be used for the training of the detector. The GAN designed in this work is able to successfully replace objects in images with the style of other objects. The results show that training the YOLO detector with GAN-modified data leads to the same detection performance as training with real data. The results also show that the shapes and backgrounds of the antennas contributed more to detection performance than their style and colour.
computer vision; machine learning; YOLO; GANs; data augmentation; object recognition; object detection; CAD
396	Object representation in local feature spaces : application to real-time tracking and detection / Représentation d'objets dans des espaces de caractéristiques locales : application à la poursuite de cibles temps-réel et à la détection
Tran, Antoine, 25 October 2017
Visual representation is a fundamental problem in computer vision. The goal is to reduce the information to what is strictly necessary for a desired task. Several types of representation exist, such as color features (histograms, color attributes...), shape features (derivatives, keypoints...) or others, such as filter banks. Low-level (local) features are fast to compute. They have limited representational power, but their genericity is of interest for autonomous and multi-task systems, since high-level features derive from them. The aim of this thesis is to build and then study the impact of representations based solely on low-level local features (colors, spatial derivatives) for two tasks: generic object tracking, which requires features robust to variations in the appearance of the object and of the context over time; and object detection, where the representation must describe a class of objects while accounting for intra-class variations. Rather than building dedicated global object descriptors, we rely entirely on local features and on flexible statistical mechanisms that estimate their distribution (histograms) and their co-occurrences (Generalized Hough Transform). The Generalized Hough Transform (GHT), created for detecting arbitrary shapes, consists in creating a data structure representing an object, a class... This structure, first indexed by gradient orientation, has been extended to other features. Working with local features, we want to stay close to the original GHT. In object tracking, after presenting our first work, which combines the GHT with a particle filter (using a color histogram), we present a lighter and faster (100 fps) algorithm that is more accurate and robust. We present a qualitative evaluation and study the impact of the features used (color space, formulation of the partial derivatives...). In detection, we used Gall's algorithm known as Hough forests. Our goal is to reduce the feature space used by Gall by removing the HOG-type features and keeping only the partial derivatives and the color features. To compensate for this reduction, we improved two training steps: the support of the local descriptors (patches) is partially produced according to a geometric measure, and node training is done by generating a specific probability map that takes into account the patches used for that step. With the reduced feature space, the detector is not more accurate. With the same features as Gall, for the same training time, our work yielded identical results, but with lower variance and therefore better repeatability. / Visual representation is a fundamental problem in computer vision. The aim is to reduce the information to what is strictly necessary for a given task. Many types of representation exist, such as color features (histograms, color attributes...), shape features (derivatives, keypoints...) or filter banks. Low-level (and local) features are fast to compute. Their power of representation is limited, but their genericity is of interest for autonomous or multi-task systems, as higher-level features derive from them. We aim to build, then study the impact of, low-level and local feature spaces (color and derivatives only) for two tasks: generic object tracking, which requires features robust to changes in the appearance of the object and its environment over time; and object detection, for which the representation should describe an object class and cope with intra-class variations. Rather than using global object descriptors, we rely entirely on local features and statistical mechanisms to estimate their distribution (histograms) and their co-occurrences (Generalized Hough Transform). The Generalized Hough Transform (GHT), created for the detection of arbitrary shapes, consists in building a codebook that models an object or a class; originally indexed by gradient orientation, it was later extended to diverse features. As we work on local features, we aim to remain close to the original GHT. In tracking, after presenting preliminary work combining the GHT with a particle filter (using color histograms), we present a lighter and faster (100 fps) tracker that is more accurate and robust. We present a qualitative evaluation and study the impact of the features used (color space, spatial derivative formulation). In detection, we used Gall's Hough Forest. We aim to reduce Gall's feature space by discarding HOG features and keeping only derivatives and color features. To compensate for the reduction, we enhanced two steps: the support of local descriptors (patches) is partially chosen using a geometrical measure, and node training uses a specific probability map based on the patches used at this step. With the reduced feature space, the detector is less accurate than with Gall's feature space, but with Gall's features and for the same training time, our work leads to identical results with higher stability and therefore better repeatability.
Computer vision; Local feature space; Object tracking; Object detection; Hough Transform; 006.4
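The Generalized Hough Transform described in entry 396 can be sketched as a lookup table keyed by quantized gradient orientation, whose entries are displacements to a reference point that later cast votes. This is a minimal, translation-only illustration in Python; the function names, the bin count and the toy edge points are assumptions, and it is not the tracker or detector from the thesis.

```python
import math
from collections import defaultdict


def quantize(theta, n_bins=8):
    """Map a gradient orientation in radians to one of n_bins orientation bins."""
    return int((theta % (2 * math.pi)) / (2 * math.pi) * n_bins) % n_bins


def build_r_table(edge_points, reference, n_bins=8):
    """Learning step: for each (x, y, theta) edge point of the template,
    store the displacement to the reference point under its orientation bin."""
    table = defaultdict(list)
    rx, ry = reference
    for x, y, theta in edge_points:
        table[quantize(theta, n_bins)].append((rx - x, ry - y))
    return table


def vote(edge_points, table, n_bins=8):
    """Detection step: every scene edge point votes for the reference
    positions suggested by the displacements stored in its bin."""
    accumulator = defaultdict(int)
    for x, y, theta in edge_points:
        for dx, dy in table.get(quantize(theta, n_bins), []):
            accumulator[(x + dx, y + dy)] += 1
    return accumulator


if __name__ == "__main__":
    # Toy template: four edge points around the reference point (1, 1).
    template = [(0, 0, 0.1), (2, 0, 1.7), (2, 2, 3.3), (0, 2, 4.9)]
    table = build_r_table(template, reference=(1, 1))
    # Same shape translated by (5, 3); the vote peak marks the new reference.
    scene = [(x + 5, y + 3, t) for x, y, t in template]
    acc = vote(scene, table)
    print(max(acc, key=acc.get))
```

Indexing by another local feature (for example a quantized color) instead of orientation only changes `quantize`; the voting step stays the same.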
397	Vytěžování snímků z panoramatické kamery mobilního mapování / Exploitation of images from panoramatic camera of mobile mapping system
Belanis, Pavel, January 2019
This diploma thesis deals with the automated detection of vertical traffic signs in images from the panoramic camera Ladybug5. A GIS data set is automatically created from the signs detected with the help of a classifier; it is usable, for example, for the passportisation (inventory) of traffic signs. The first part of the thesis describes the theoretical basis needed to understand the problem. The second part is devoted to the specific procedure leading to a reliable classifier, its testing on an independent set of images and the automated creation of the GIS data set. The outputs of the work are the trained classifiers and the GIS data sets containing vertical traffic signs.
398	Towards Condition-Based Maintenance of Catenary wires using computer vision : Deep Learning applications on eMaintenance & Industrial AI for railway industry
Moussallik, Laila, January 2021
Railways are a main element of a sustainable transport policy in several countries, as they are considered a safe, efficient and green mode of transportation. Owing to these advantages, there is growing demand on the railway industry to increase performance, capacity and availability while safely transporting goods and people at higher speeds. To meet the demand, major adjustments to the infrastructure and improvements to the maintenance process are required. Inspection activities are essential in establishing the required maintenance, and they are required periodically to reduce unexpected failures and to prevent dangerous consequences. Maintenance of railway catenary systems is a critical task for ensuring the safety of electrical railway operation. Usually, the catenary inspection is performed manually by trained personnel. However, like all human-based inspections, it is slow and lacks objectivity, which brings a number of crucial disadvantages and can lead to dangerous consequences. With the rapid progress of artificial intelligence, it is appropriate for computer vision detection approaches to replace the traditional manual methods during inspections. In this thesis, a strategy for monitoring the health of catenary wires is developed, which includes the various steps needed to detect anomalies in this component. Moreover, a solution for detecting different types of wires in the railway catenary system was implemented, in which a deep learning framework is developed by combining the Convolutional Neural Network (CNN) and the Region Proposal Network (RPN).
eMaintenance; computer vision; Condition-Based Maintenance; Industrial AI; railway catenary system; automatic visual detection; health monitoring; Deep learning; Machine learning; Convolutional Neural Network; object detection; Civil Engineering
399	Implementation of an Approach for 3D Vehicle Detection in Monocular Traffic Surveillance Videos
Mishra, Abhinav, 19 February 2021
Recent advancements in the field of Computer Vision are a by-product of breakthroughs in the domain of Artificial Intelligence. Object detection in monocular images is now realized by an amalgamation of Computer Vision and Deep Learning. While most approaches detect objects as a mere two-dimensional (2D) bounding box, there are a few that exploit a rather traditional representation of the 3D object. Such approaches detect an object either as a 3D bounding box or exploit its shape primitives using active shape models, which results in a wireframe-like detection. Such a wireframe detection is represented as combinations of detected keypoints (or landmarks) of the desired object. Apart from a faithful retrieval of the object's true shape, wireframe-based approaches are relatively robust in handling occlusions. The central task of this thesis was to find such an approach and to implement it with the goal of evaluating its performance. The object of interest is the vehicle class (cars, mini vans, trucks etc.) and the evaluation data is monocular traffic surveillance videos collected by the supervising chair. A wireframe-type detection can aid several facets of traffic analysis through improved (compared to a 2D bounding box) estimation of the detected object's ground plane. The thesis encompasses the process of implementing the chosen approach, called Occlusion-Net [40], including its design details and a qualitative evaluation on traffic surveillance videos. The implementation reproduces most of the published results across several occlusion categories except the truncated car category. Occlusion-Net's erratic detections are mostly caused by incorrect detection of the initial region of interest. It employs three instances of Graph Neural Networks for occlusion reasoning and localization.
The thesis also provides a didactic introduction to the field of Machine and Deep Learning, including intuitions of mathematical concepts required to understand the two disciplines and the implemented approach.
Contents
1 Introduction
2 Technical Background
  2.1 AI, Machine Learning and Deep Learning
    2.1.1 But what is AI ?
    2.1.2 Representational composition by Deep Learning
  2.2 Essential Mathematics for ML
    2.2.1 Linear Algebra
    2.2.2 Probability and Statistics
    2.2.3 Calculus
  2.3 Mathematical Introduction to ML
    2.3.1 Ingredients of a Machine Learning Problem
    2.3.2 The Perceptron
    2.3.3 Feature Transformation
    2.3.4 Logistic Regression
    2.3.5 Artificial Neural Networks: ANN
    2.3.6 Convolutional Neural Network: CNN
    2.3.7 Graph Neural Networks
  2.4 Specific Topics in Computer Vision
  2.5 Previous work
3 Design of Implemented Approach
  3.1 Training Dataset
  3.2 Keypoint Detection : MaskRCNN
  3.3 Occluded Edge Prediction : 2D-KGNN Encoder
  3.4 Occluded Keypoint Localization : 2D-KGNN Decoder
  3.5 3D Shape Estimation: 3D-KGNN Encoder
4 Implementation
  4.1 Open-Source Tools and Libraries
    4.1.1 Code Packaging: NVIDIA-Docker
    4.1.2 Data Processing Libraries
    4.1.3 Libraries for Neural Networks
    4.1.4 Computer Vision Library
  4.2 Dataset Acquisition and Training
    4.2.1 Acquiring Dataset
    4.2.2 Training Occlusion-Net
  4.3 Refactoring
    4.3.1 Error in Docker File
    4.3.2 Image Directories as Input
    4.3.3 Frame Extraction in Parallel
    4.3.4 Video as Input
  4.4 Functional changes
    4.4.1 Keypoints In Output
    4.4.2 Mismatched BB and Keypoints
    4.4.3 Incorrect Class Labels
    4.4.4 Bounding Box Overlay
5 Evaluation
  5.1 Qualitative Evaluation
    5.1.1 Evaluation Across Occlusion Categories
    5.1.2 Performance on Moderate and Heavy Vehicles
  5.2 Verification of Failure Analysis
    5.2.1 Truncated Cars
    5.2.2 Overlapping Cars
  5.3 Analysis of Missing Frames
  5.4 Test Performance
6 Conclusion
7 Future Work
Bibliography
info:eu-repo/classification/ddc/380 ddc:380 info:eu-repo/classification/ddc/620 ddc:620 info:eu-repo/classification/ddc/006 ddc:006
400	Design and implementation of an affordable reversing camera system with object detection and OBD-2 integration for commercial vehicles / Design och implementering av ett prisvärt backkamerasystem med objektdetektering och OBD-2-integration för kommersiella fordon
Ebrahimi, Alireza; Akbari, Esmatullah, January 2023
This thesis is about designing and implementing an affordable reversing camera system with object detection and OBD-2 integration for commercial vehicles. The aim is to improve the safety and efficiency of these vehicles by giving drivers a clear view of their surroundings behind the vehicle and alerting them to the presence of nearby obstacles. Ultrasonic sensors are used for object detection; they give the driver oversight of the area behind the vehicle and warn of obstacles that are present. The system is also integrated with the vehicle's on-board diagnostics system (OBD-2), which provides important information on speed and engine performance, among other things. This project contributes to making safety systems more accessible to commercial vehicles and reduces the risk of accidents and collisions. / This thesis is about designing and implementing an affordable reversing camera system with object detection and integration with On-Board Diagnostics 2 for commercial vehicles. The purpose is to improve the safety and efficiency of these vehicles by giving drivers a clear view of their surroundings behind the vehicle and warning them of the presence of nearby obstacles. Ultrasonic sensors are used for object detection; they give the driver oversight of the area behind the vehicle and warn of obstacles that are present. The system is also integrated with the vehicle's on-board diagnostics system (OBD-2), which provides important information on, among other things, speed and engine performance. This project contributes to making safety systems more accessible for commercial vehicles and reduces the risk of accidents and collisions.
Reversing camera system; object detection; OBD-2-integration; ultrasonic sensors; accident risk; driver assistance; Embedded Systems
