Global ETD Search

391	Advanced Data Augmentation : With Generative Adversarial Networks and Computer-Aided Design
Thaung, Ludwig. January 2020.
CNN-based (Convolutional Neural Network) visual object detectors often reach human-level accuracy but need to be trained with large amounts of manually annotated data. Collecting and annotating this data is often time-consuming and expensive. Using generative models to augment the data can help reduce the amount of data required and increase detection performance. Many state-of-the-art generative models are Generative Adversarial Networks (GANs). This thesis investigates if and how image data can be used to generate new data through GANs to train a YOLO-based (You Only Look Once) object detector, and how CAD (Computer-Aided Design) models can aid in this process. In the experiments, different GAN models are trained and evaluated by visual inspection or with the Fréchet Inception Distance (FID) metric. The data provided by Ericsson Research consists of images of antenna and baseband equipment along with annotations and segmentations. Ericsson Research supplied the YOLO detector, which is used without modification. Finally, the YOLO detector is trained on data generated by the chosen model and evaluated by Average Precision (AP). The results show that the generative models designed in this work can produce high-quality RGB images. However, quality drops if binary segmentation masks are generated as well. The experiments with CAD input data did not produce images usable for training the detector. The GAN designed in this work can successfully replace objects in images with the style of other objects. Training the YOLO detector with GAN-modified data leads to the same detection performance as training with real data. The results also show that the shapes and backgrounds of the antennas contributed more to detection performance than their style and colour.
Keywords: computer vision; machine learning; YOLO; GANs; data augmentation; object recognition; object detection; CAD
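For reference, the Fréchet Inception Distance mentioned in the abstract is usually written as below. This is the standard definition, not a formula taken from the thesis; the symbols are assumptions introduced here: $\mu_r, \Sigma_r$ are the mean and covariance of Inception features of real images, and $\mu_g, \Sigma_g$ those of generated images. Lower values indicate that the two feature distributions are closer.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```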
392	Object representation in local feature spaces : application to real-time tracking and detection / Représentation d'objets dans des espaces de caractéristiques locales : application à la poursuite de cibles temps-réel et à la détection
Tran, Antoine. 25 October 2017.
Visual representation is a fundamental problem in computer vision. The aim is to reduce the information to what is strictly necessary for a given task. Many types of representation exist, such as color features (histograms, color attributes...), shape features (derivatives, keypoints...) and others such as filter banks. Low-level (local) features are fast to compute. Their representational power is limited, but their genericity is of interest for autonomous and multi-task systems, since higher-level features derive from them. The aim of this thesis is to build representations based only on low-level local features (colors, spatial derivatives) and to study their impact on two tasks: generic object tracking, which requires features robust to changes in the appearance of the object and its context over time; and object detection, where the representation must describe an object class while coping with intra-class variations. Rather than building dedicated global object descriptors, we rely entirely on local features and on flexible statistical mechanisms that estimate their distribution (histograms) and their co-occurrences (Generalized Hough Transform). The Generalized Hough Transform (GHT), created for the detection of arbitrary shapes, consists in building a data structure (codebook) that models an object or a class. This structure, originally indexed by gradient orientation, has since been extended to other features. As we work on local features, we aim to remain close to the original GHT. In tracking, after presenting preliminary work combining the GHT with a particle filter (using a color histogram), we present a lighter and faster (100 fps) tracker that is also more accurate and robust. We present a qualitative evaluation and study the impact of the features used (color space, formulation of the partial derivatives...). In detection, we used Gall's algorithm, Hough Forests. Our aim is to reduce the feature space used by Gall by discarding the HOG features and keeping only the partial derivatives and color features. To compensate for this reduction, we improved two training steps: the support of the local descriptors (patches) is partially chosen using a geometric measure, and node training uses a specific probability map that takes into account the patches used at that step. With the reduced feature space, the detector is not more accurate. With the same features as Gall and the same training time, our work gives identical results, but with lower variance and therefore better repeatability.
Keywords: Computer vision; Local feature space; Object tracking; Object detection; Hough Transform; 006.4
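The codebook idea behind the original Generalized Hough Transform can be sketched as follows. This is a minimal illustration of the classic R-table indexed by gradient orientation, not the tracker or detector from the thesis; the bin count `N_BINS`, the function names and the toy point sets are all assumptions made for the example. Translation is the only transformation handled.

```python
import math
from collections import defaultdict

N_BINS = 8  # hypothetical number of gradient-orientation bins


def orientation_bin(theta):
    """Quantize a gradient angle (radians) into one of N_BINS bins."""
    return int((theta % (2 * math.pi)) / (2 * math.pi) * N_BINS) % N_BINS


def build_r_table(template_points, reference):
    """Map each orientation bin to displacements from edge point to reference."""
    table = defaultdict(list)
    rx, ry = reference
    for x, y, theta in template_points:
        table[orientation_bin(theta)].append((rx - x, ry - y))
    return table


def vote(scene_points, table):
    """Accumulate votes for candidate reference-point positions."""
    accumulator = defaultdict(int)
    for x, y, theta in scene_points:
        for dx, dy in table.get(orientation_bin(theta), []):
            accumulator[(x + dx, y + dy)] += 1
    return accumulator


# Toy template: four edge points (x, y, gradient angle) around the origin.
template = [(1, 0, 0.1), (0, 1, 1.7), (-1, 0, 3.2), (0, -1, 4.8)]
table = build_r_table(template, reference=(0, 0))

# The same shape translated by (5, 3) in the scene.
scene = [(x + 5, y + 3, theta) for x, y, theta in template]
votes = vote(scene, table)
best = max(votes, key=votes.get)
```

Here every scene point votes for the same location, so `best` is `(5, 3)`, the translated reference point. Real images produce noisier accumulators, which is one reason the choice of local features matters.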
393	Vytěžování snímků z panoramatické kamery mobilního mapování / Exploitation of images from panoramatic camera of mobile mapping system
Belanis, Pavel. January 2019.
This diploma thesis deals with the automated detection of vertical traffic signs in images from the Ladybug5 panoramic camera. From the signs detected with the help of a classifier, a GIS data set is created automatically, usable for example for the inventory (passportisation) of traffic signs. The first part of the thesis describes the theoretical basis needed to understand the problem. The second part is devoted to the specific procedure leading to a reliable classifier, its testing on an independent set of images, and the automated creation of the GIS data set. The outputs of the work are the trained classifiers and the GIS data sets containing vertical traffic signs.
394	Towards Condition-Based Maintenance of Catenary wires using computer vision : Deep Learning applications on eMaintenance & Industrial AI for railway industry
Moussallik, Laila. January 2021.
Railways are a main element of sustainable transport policy in several countries, as they are considered a safe, efficient and green mode of transportation. Owing to these advantages, there is a growing demand for the railway industry to increase performance, capacity and availability, and to safely transport goods and people at higher speeds. To meet this demand, large adjustments to the infrastructure and improvements to the maintenance process are required. Inspection activities are essential in establishing the required maintenance and must be carried out periodically to reduce unexpected failures and prevent dangerous consequences. Maintenance of railway catenary systems is a critical task for ensuring the safety of electrical railway operation. Usually, catenary inspection is performed manually by trained personnel. However, like all human-based inspection, it is slow and lacks objectivity, which is a crucial disadvantage and can potentially lead to dangerous consequences. With the rapid progress of artificial intelligence, it is appropriate for computer vision detection approaches to replace the traditional manual methods during inspections. In this thesis, a strategy for monitoring the health of catenary wires is developed, which includes the various steps needed to detect anomalies in this component. Moreover, a solution for detecting different types of wires in the railway catenary system was implemented, in which a deep learning framework is developed by combining a Convolutional Neural Network (CNN) and a Region Proposal Network (RPN).
Keywords: eMaintenance; computer vision; Condition-Based Maintenance; Industrial AI; railway catenary system; automatic visual detection; health monitoring; Deep learning; Machine learning; Convolutional Neural Network; object detection
Subject: Civil Engineering
395	Implementation of an Approach for 3D Vehicle Detection in Monocular Traffic Surveillance Videos
Mishra, Abhinav. 19 February 2021.
Recent advancements in the field of Computer Vision are a by-product of breakthroughs in the domain of Artificial Intelligence. Object detection in monocular images is now realized by an amalgamation of Computer Vision and Deep Learning. While most approaches detect objects as a mere two-dimensional (2D) bounding box, a few exploit a more traditional representation of the 3D object. Such approaches detect an object either as a 3D bounding box or exploit its shape primitives using active shape models, which results in a wireframe-like detection. Such a wireframe detection is represented as a combination of detected keypoints (or landmarks) of the desired object. Apart from faithfully retrieving the object's true shape, wireframe-based approaches are relatively robust in handling occlusions. The central task of this thesis was to find such an approach and to implement it in order to evaluate its performance. The object of interest is the vehicle class (cars, minivans, trucks, etc.) and the evaluation data consists of monocular traffic surveillance videos collected by the supervising chair. A wireframe-type detection can aid several facets of traffic analysis through improved (compared to a 2D bounding box) estimation of the detected object's ground plane. The thesis covers the implementation of the chosen approach, called Occlusion-Net [40], including its design details and a qualitative evaluation on traffic surveillance videos. The implementation reproduces most of the published results across several occlusion categories, except the truncated car category. Occlusion-Net's erratic detections are mostly caused by incorrect detection of the initial region of interest. It employs three instances of Graph Neural Networks for occlusion reasoning and localization.
The thesis also provides a didactic introduction to the field of Machine and Deep Learning including intuitions of mathematical concepts required to understand the two disciplines and the implemented approach.
Contents:
1 Introduction
2 Technical Background
  2.1 AI, Machine Learning and Deep Learning
    2.1.1 But what is AI?
    2.1.2 Representational composition by Deep Learning
  2.2 Essential Mathematics for ML
    2.2.1 Linear Algebra
    2.2.2 Probability and Statistics
    2.2.3 Calculus
  2.3 Mathematical Introduction to ML
    2.3.1 Ingredients of a Machine Learning Problem
    2.3.2 The Perceptron
    2.3.3 Feature Transformation
    2.3.4 Logistic Regression
    2.3.5 Artificial Neural Networks: ANN
    2.3.6 Convolutional Neural Network: CNN
    2.3.7 Graph Neural Networks
  2.4 Specific Topics in Computer Vision
  2.5 Previous work
3 Design of Implemented Approach
  3.1 Training Dataset
  3.2 Keypoint Detection: MaskRCNN
  3.3 Occluded Edge Prediction: 2D-KGNN Encoder
  3.4 Occluded Keypoint Localization: 2D-KGNN Decoder
  3.5 3D Shape Estimation: 3D-KGNN Encoder
4 Implementation
  4.1 Open-Source Tools and Libraries
    4.1.1 Code Packaging: NVIDIA-Docker
    4.1.2 Data Processing Libraries
    4.1.3 Libraries for Neural Networks
    4.1.4 Computer Vision Library
  4.2 Dataset Acquisition and Training
    4.2.1 Acquiring Dataset
    4.2.2 Training Occlusion-Net
  4.3 Refactoring
    4.3.1 Error in Docker File
    4.3.2 Image Directories as Input
    4.3.3 Frame Extraction in Parallel
    4.3.4 Video as Input
  4.4 Functional changes
    4.4.1 Keypoints In Output
    4.4.2 Mismatched BB and Keypoints
    4.4.3 Incorrect Class Labels
    4.4.4 Bounding Box Overlay
5 Evaluation
  5.1 Qualitative Evaluation
    5.1.1 Evaluation Across Occlusion Categories
    5.1.2 Performance on Moderate and Heavy Vehicles
  5.2 Verification of Failure Analysis
    5.2.1 Truncated Cars
    5.2.2 Overlapping Cars
  5.3 Analysis of Missing Frames
  5.4 Test Performance
6 Conclusion
7 Future Work
Bibliography
Classification: ddc:380; ddc:620; ddc:006
396	Design and implementation of an affordable reversing camera system with object detection and OBD-2 integration for commercial vehicles / Design och implementering av ett prisvärt backkamerasystem med objektdetektering och OBD-2-integration för kommersiella fordon
Ebrahimi, Alireza; Akbari, Esmatullah. January 2023.
This thesis covers the design and implementation of an affordable reversing camera system with object detection and OBD-2 integration for commercial vehicles. The aim is to improve the safety and efficiency of these vehicles by giving drivers a clear view of the area behind the vehicle and alerting them to nearby obstacles. Ultrasonic sensors are used for object detection; they let the driver monitor the area behind the vehicle and warn of obstacles present. The system is also integrated with the vehicle's on-board diagnostics system (OBD-2), which provides important information on speed and engine performance, among other things. This project contributes to making safety systems more accessible for commercial vehicles and reduces the risk of accidents and collisions.
Keywords: Reversing camera system; object detection; OBD-2 integration; ultrasonic sensors; accident risk; driver assistance
Subject: Embedded Systems
397	Multi-site Organ Detection in CT Images using Deep Learning / Regionsoberoende organdetektion i CT-bilder med djupinlärning
Jacobzon, Gustaf. January 2020.
When optimizing a controlled dose in radiotherapy, high-resolution spatial information about healthy organs in close proximity to the malignant cells is necessary in order to minimize the radiation reaching these organs-at-risk. This information can be provided by deep volumetric segmentation networks, such as 3D U-Net. However, due to memory limitations in modern graphics processing units, it is not feasible to train a volumetric segmentation network on full image volumes, and subsampling the volume gives too coarse a segmentation. An alternative is to sample a region of interest from the image volume and train an organ-specific network. This approach requires knowing which region of the image volume should be sampled, which can be provided by a 3D object detection network. Typically the detection network is also region-specific, albeit for a larger region such as the thorax, and requires human assistance in choosing the appropriate network for a given region of the body. Instead, we propose a multi-site object detection network based on YOLOv3, trained on 43 different organs, which can operate on arbitrarily chosen axial patches of the body. Our model identifies the organs present (whole or truncated) in the image volume and can automatically sample a region from the input and feed it to the appropriate volumetric segmentation network. We train our model on four small (as few as 20 images) site-specific datasets in a weakly supervised manner in order to handle the partially unlabeled nature of site-specific datasets. Our model is able to generate organ-specific regions of interest that enclose 92% of the organs present in the test set.
Keywords: Organ Detection; Organs-at-risk; 3D Object Detection; Segmentation; Deep Learning; Machine Learning; Weakly-supervised Learning; YOLOv3; 3D U-Net
Subject: Electrical Engineering and Electronics
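One plausible way to compute an enclosure figure like the 92% reported above is sketched here. The abstract does not say how "enclose" is defined, so this assumes axis-aligned 3D boxes and full containment; the box layout, function names and sample boxes are hypothetical.

```python
def encloses(roi, organ):
    """True if the axis-aligned box `roi` fully contains the box `organ`.

    Each box is ((x_min, y_min, z_min), (x_max, y_max, z_max)).
    """
    roi_min, roi_max = roi
    organ_min, organ_max = organ
    return (all(r <= o for r, o in zip(roi_min, organ_min))
            and all(r >= o for r, o in zip(roi_max, organ_max)))


def enclosure_rate(pairs):
    """Fraction of (roi, organ) pairs in which the ROI encloses the organ."""
    if not pairs:
        return 0.0
    return sum(encloses(roi, organ) for roi, organ in pairs) / len(pairs)


# Two made-up (predicted ROI, ground-truth organ box) pairs.
pairs = [
    (((0, 0, 0), (10, 10, 10)), ((2, 2, 2), (8, 8, 8))),    # enclosed
    (((0, 0, 0), (10, 10, 10)), ((5, 5, 5), (12, 8, 8))),   # sticks out in x
]
rate = enclosure_rate(pairs)
```

With these two pairs `rate` is `0.5`. A stricter or looser criterion (for example a minimum overlap fraction) would change the number, so the definition should be checked against the thesis itself.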
398	Supervision : Object motion interpretation using hyperdimensional computing based on object detection run on the edge
Andersson Svensson, Albin. January 2022.
This thesis demonstrates a technique for developing efficient applications that interpret spatial deep learning output using Hyperdimensional Computing (HDC), also known as Vector Symbolic Architecture (VSA). As part of the application demonstration, a novel preprocessing technique for motion, using state machines and spatial semantic pointers, is explained. The application is evaluated and run on a Google Coral Edge TPU, interpreting real-time inference from a compressed object detection model.
Keywords: Machine learning; Object detection; Motion Interpretation; Hyperdimensional Computing; Vector Symbolic Architecture
Subject: Computer Systems; Embedded Systems; Robotics; Signal Processing
399	SENSOR FUSION IN NEURAL NETWORKS FOR OBJECT DETECTION
Sheetal Prasanna (12447189). 12 July 2022.
Object detection is an increasingly popular tool used in many fields, especially in the development of autonomous vehicles. The task of object detection involves localizing objects in an image, constructing a bounding box to determine the presence and location of each object, and classifying each object into its appropriate class. Object detection applications are commonly implemented using convolutional neural networks along with the construction of feature pyramid networks to extract data.
Another commonly used technique in the automotive industry is sensor fusion. Each automotive sensor (camera, radar, and lidar) has its own advantages and disadvantages. Fusing two or more sensors and using the combined information is a popular method of balancing the strengths and weaknesses of each independent sensor. Using sensor fusion within an object detection network has been found to be an effective method of obtaining accurate models. Accurate detection and classification of objects in images is a vital step in the development of autonomous vehicles or self-driving cars.
Many studies have proposed methods to improve neural networks or object detection networks. Some of these techniques involve data augmentation and hyperparameter optimization. This thesis improves a camera and radar fusion network by implementing various techniques within these areas. Additionally, a novel idea of integrating a third sensor, the lidar, into an existing camera and radar fusion network is explored in this research work.
The models were trained on the Nuscenes dataset, one of the biggest automotive datasets available today. Using the concepts of augmentation, hyperparameter optimization, sensor fusion, and annotation filters, the CRF-Net was trained to achieve an accuracy score that was 69.13% higher than the baseline.
Keywords: Digital processor architectures; Computer vision; Object detection; Sensor Fusion; Nuscenes; Machine Learning; Autonomous Vehicles; Radar; Lidar; Camera; Computer Engineering
400	Evaluation and Analysis of Perception Systems for Autonomous Driving
Sharma, Devendra. January 2020.
For safe mobility, an autonomous vehicle must perceive its surroundings accurately. There are many perception tasks associated with understanding the local environment, such as object detection, localization, and lane analysis. Object detection, in particular, plays a vital role in determining an object's location and classifying it correctly, and is one of the challenging tasks in the self-driving research area. Before employing an object detection module in autonomous vehicle testing, an organization needs a precise analysis of the module. Hence, it becomes crucial for a company to have a framework for evaluating an object detection algorithm's performance. This thesis develops a comprehensive framework for evaluating and analyzing object detection algorithms, both 2D (camera image-based) and 3D (LiDAR point cloud-based). The pipeline developed in this thesis makes it easy to evaluate multiple models using the key performance metrics Average Precision, F-score, and Mean Average Precision. A 40-point interpolation method is used to calculate the Average Precision.
Keywords: Object detection; CARLA; evaluation; simulator; autonomous vehicles; KITTI; LiDAR; stereo camera
Subject: Computer and Information Sciences
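A 40-point interpolated Average Precision can be sketched as below. The abstract does not give the exact procedure, so this assumes the KITTI-style convention of sampling recall at 1/40, 2/40, ..., 1 and taking, at each level, the highest precision achieved at that recall or beyond. The function name and the sample curve are hypothetical.

```python
def average_precision_40(recalls, precisions):
    """Interpolated AP over 40 recall levels (1/40, 2/40, ..., 1).

    `recalls` and `precisions` are parallel lists of points on the
    precision-recall curve of one detector for one class.
    """
    total = 0.0
    for i in range(1, 41):
        level = i / 40
        # Interpolated precision: best precision at recall >= level,
        # or 0 if the detector never reaches this recall.
        reachable = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(reachable, default=0.0)
    return total / 40


# Made-up curve: precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0.
ap = average_precision_40([0.5, 1.0], [1.0, 0.5])
```

For this curve `ap` is `0.75`: the first 20 recall levels contribute 1.0 each and the last 20 contribute 0.5 each. Mean Average Precision would then be the mean of such per-class values.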
