481 |
Computer Vision for Camera Trap Footage: Comparing classification with object detection
Örn, Fredrik, January 2021 (has links)
Monitoring wildlife is of great interest to ecologists and is arguably even more important in the Arctic, the region in focus for the research network INTERACT, where the effects of climate change are greater than on the rest of the planet. This master thesis studies how artificial intelligence (AI) and computer vision can be used together with camera traps to monitor populations effectively. The study uses an image data set containing both humans and animals. The images were taken by camera traps from ECN Cairngorms, a station in the INTERACT network. The goal of the project is to classify these images into one of three categories: "Empty", "Animal" and "Human". Three different methods are compared: a DenseNet201 classifier, a YOLOv3 object detector, and the pre-trained MegaDetector, developed by Microsoft. The classifier did not achieve sufficient results, but YOLOv3 performed well on human detection, with an average precision (AP) of 0.8 on both training and validation data. The animal detections for YOLOv3 did not reach as high an AP, likely because of the smaller number of training examples. The best results were achieved by MegaDetector combined with an added method to determine whether the detected animals were dogs, reaching an average precision of 0.85 for animals and 0.99 for humans. This is the method recommended for future use, but there is potential to improve all the models and reach even better results.
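The pipeline described in the abstract, reducing MegaDetector-style detections to a single per-image label, can be illustrated with a minimal post-processing sketch. The function name, the confidence threshold, and the priority given to human detections are illustrative assumptions, not the thesis code; only the category convention (1 = animal, 2 = person) follows MegaDetector's documented output format.

```python
# Hypothetical sketch: map per-image MegaDetector-style detections onto the
# three image-level classes used in the thesis ("Empty", "Animal", "Human").

def classify_image(detections, conf_threshold=0.8):
    """Reduce a list of (category, confidence) detections to one label.

    category follows the MegaDetector convention: 1 = animal, 2 = person.
    """
    confident = [(cat, conf) for cat, conf in detections if conf >= conf_threshold]
    # Assumed priority: a confident person detection outweighs animal
    # detections, since human presence is the rarer event to flag.
    if any(cat == 2 for cat, _ in confident):
        return "Human"
    if any(cat == 1 for cat, _ in confident):
        return "Animal"
    return "Empty"

print(classify_image([(1, 0.92), (2, 0.95)]))  # Human
print(classify_image([(1, 0.91)]))             # Animal
print(classify_image([(1, 0.30)]))             # Empty
```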
|
482 |
Radar-based Application of Pedestrian and Cyclist Micro-Doppler Signatures for Automotive Safety Systems
Held, Patrick, 12 May 2022 (links)
Sensor-based detection of the near field in the context of highly automated driving is experiencing a noticeable trend in the integration of radar sensor technology. Advances in
microelectronics allow the use of high-resolution radar sensors that continuously increase measurement accuracy through efficient processes in angle as well as distance and Doppler.
This opens up novel possibilities in determining the geometric and kinematic nature of extended targets in the vehicle environment, which can be used for the specific development
of automotive safety systems.
In this work, vulnerable road users such as pedestrians and cyclists are analyzed using a high-resolution automotive radar. The focus is on the appearance of the micro-Doppler
effect, caused by the objects’ high kinematic degree of freedom. The characteristic radar signatures produced by the micro-Doppler effect allow a clearer perception of the objects
and can be directly related to their current state of motion. Novel methods are presented that consider the geometric and kinematic extents of the objects and realize real-time
approaches to classification and behavioral indication.
When a radar sensor detects an extended target (e.g., bicyclist), its motion state’s fundamental properties can be captured from its micro-Doppler signature within a measurement
cycle. The spinning wheels' velocity distributions allow an adaptive containment of the pedaling motion, whose behavior exhibits essential characteristics with regard to anticipatory accident prediction. Furthermore, extended radar targets are subject to orientation dependence, which directly affects their geometric and kinematic profiles. This can negatively affect both the classification performance and the usability of parameters constituting the radar target's indication of intent. To address this, using the cyclist as an example, a method is presented that normalizes the orientation-dependent parameters in range and Doppler and compensates for the measured ambiguities.
Furthermore, this thesis presents a methodology that estimates a pedestrian's leg motion over time (tracking) based on the pedestrian's micro-Doppler profile and reveals valuable object information regarding its motion behavior. To this end, a motion model is developed that approximates the leg's nonlinear locomotion and represents its high degree of biomechanical variability. By incorporating probabilistic data association, radar detections are assigned to their respective evoking sources (left and right leg), and limb separation is
realized. In contrast to previous tracking methods, the presented methodology shows an increase in the object information’s accuracy. It thus represents a decisive advantage for
future driver assistance systems in order to be able to react significantly faster to critical traffic situations.

1 Introduction 1
1.1 Automotive environmental perception 2
1.2 Contributions of this work 4
1.3 Thesis overview 6
2 Automotive radar 9
2.1 Physical fundamentals 9
2.1.1 Radar cross section 9
2.1.2 Radar equation 10
2.1.3 Micro-Doppler effect 11
2.2 Radar measurement model 15
2.2.1 FMCW radar 15
2.2.2 Chirp sequence modulation 17
2.2.3 Direction-of-arrival estimation 22
2.3 Signal processing 25
2.3.1 Target properties 26
2.3.2 Target extraction 28
Power detection 28
Clustering 30
2.3.3 Real radar data example 31
2.4 Conclusion 33
3 Micro-Doppler applications of a cyclist 35
3.1 Physical fundamentals 35
3.1.1 Micro-Doppler signatures of a cyclist 35
3.1.2 Orientation dependence 36
3.2 Cyclist feature extraction 38
3.2.1 Adaptive pedaling extraction 38
Ellipticity constraints 38
Ellipse fitting algorithm 39
3.2.2 Experimental results 42
3.3 Normalization of the orientation dependence 44
3.3.1 Geometric correction 44
3.3.2 Kinematic correction 45
3.3.3 Experimental results 45
3.4 Conclusion 47
3.5 Discussion and outlook 47
4 Micro-Doppler applications of a pedestrian 49
4.1 Pedestrian detection 49
4.1.1 Human kinematics 49
4.1.2 Micro-Doppler signatures of a pedestrian 51
4.1.3 Experimental results 52
Radially moving pedestrian 52
Crossing pedestrian 54
4.2 Pedestrian feature extraction 57
4.2.1 Frequency-based limb separation 58
4.2.2 Extraction of body parts 60
4.2.3 Experimental results 62
4.3 Pedestrian tracking 64
4.3.1 Probabilistic state estimation 65
4.3.2 Gaussian filters 67
4.3.3 The Kalman filter 67
4.3.4 The extended Kalman filter 69
4.3.5 Multiple-object tracking 71
4.3.6 Data association 74
4.3.7 Joint probabilistic data association 80
4.4 Kinematic-based pedestrian tracking 84
4.4.1 Kinematic modeling 84
4.4.2 Tracking motion model 87
4.4.3 4-D radar point cloud 91
4.4.4 Tracking implementation 92
4.4.5 Experimental results 96
Longitudinal trajectory 96
Crossing trajectory with sudden turn 98
4.5 Conclusion 102
4.6 Discussion and outlook 103
5 Summary and outlook 105
5.1 Developed algorithms 105
5.1.1 Adaptive pedaling extraction 105
5.1.2 Normalization of the orientation dependence 105
5.1.3 Model-based pedestrian tracking 106
5.2 Outlook 106
Bibliography 109
List of Acronyms 119
List of Figures 124
List of Tables 125
Appendix 127
A Derivation of the rotation matrix 2.26 127
B Derivation of the mixed radar signal 2.52 129
C Calculation of the marginal association probabilities 4.51 131
Curriculum Vitae 135
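As an illustrative aside (not code from the thesis), the chirp-sequence processing listed in Section 2.2.2 of the table of contents can be sketched in a few lines: a frame of beat signals, one row per chirp, is turned into a range-Doppler map by a 2D FFT. A point target appears as a peak at its range/Doppler bin; the micro-Doppler components of limbs or wheels discussed above would spread energy along the Doppler axis. All quantities below are in normalized bins, an assumption for the sketch.

```python
# Toy chirp-sequence radar frame: fast-time FFT gives range, slow-time FFT
# across chirps gives Doppler. A single moving point target is simulated.
import numpy as np

n_chirps, n_samples = 64, 128

# Beat signal of one target: fast-time frequency at range bin 20 plus a
# slow-time phase rotation across chirps at Doppler bin 8.
t_fast = np.arange(n_samples)
t_slow = np.arange(n_chirps)[:, None]
frame = np.exp(2j * np.pi * (20 * t_fast / n_samples + 8 * t_slow / n_chirps))

# Range FFT along fast time, then Doppler FFT along slow time.
range_doppler = np.fft.fft(np.fft.fft(frame, axis=1), axis=0)
doppler_bin, range_bin = np.unravel_index(np.argmax(np.abs(range_doppler)),
                                          range_doppler.shape)
print(doppler_bin, range_bin)  # peak at Doppler bin 8, range bin 20
```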
|
483 |
Segmentation and structuring of video documents for indexing applications / Segmentation et structuration de documents video pour l'indexation
Tapu, Ruxandra Georgina, 07 December 2012 (has links)
Recent advances in telecommunications, combined with the development of image and video processing and acquisition devices, have led to a spectacular growth in the amount of visual content stored, transmitted and exchanged over the Internet.
Within this context, elaborating efficient tools to access, browse and retrieve video content has become a crucial challenge. In Chapter 2 we introduce and validate a novel shot boundary detection algorithm able to identify abrupt and gradual transitions. The technique is based on an enhanced graph partition model, combined with a multi-resolution analysis and a non-linear filtering operation. The global computational complexity is reduced by implementing a two-pass strategy. In Chapter 3 the video abstraction problem is considered. In our case, we have developed a keyframe representation system that extracts a variable number of images from each detected shot, depending on the visual content variation. Chapter 4 deals with the issue of high-level semantic segmentation into scenes. Here, a novel scene/DVD chapter detection method is introduced and validated. Spatio-temporally coherent shots are clustered into the same scene based on a set of temporal constraints, adaptive thresholds and neutralized shots. Chapter 5 considers the issue of object detection and segmentation. Here we introduce a novel spatio-temporal visual saliency system based on region contrast, interest point correspondence, geometric transforms, motion-class estimation and regions' temporal consistency. The proposed technique is extended to 3D videos by representing the stereoscopic perception as a 2D video and its associated depth.
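The core idea behind shot-boundary detection as summarized above can be sketched with a minimal illustration (not the author's algorithm): consecutive frames are compared by a histogram distance, and a cut is declared where the distance spikes above an adaptive threshold derived from the mean inter-frame distance. The graph-partition model, multi-resolution analysis and non-linear filtering of Chapter 2 refine this basic scheme; the sensitivity factor `k` below is an assumption.

```python
# Minimal adaptive-threshold cut detector over per-frame histograms.

def hist_distance(h1, h2):
    """L1 distance between two normalized histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_cuts(histograms, k=3.0):
    """Return frame indices where a cut is detected.

    histograms: one normalized histogram per frame.
    k: sensitivity; the threshold adapts to the mean inter-frame distance.
    """
    dists = [hist_distance(histograms[i], histograms[i + 1])
             for i in range(len(histograms) - 1)]
    threshold = k * (sum(dists) / len(dists))
    return [i + 1 for i, d in enumerate(dists) if d > threshold]

# Two stable shots with an abrupt transition between frames 2 and 3.
shot_a = [0.5, 0.5, 0.0, 0.0]
shot_b = [0.0, 0.0, 0.5, 0.5]
frames = [shot_a, shot_a, shot_a, shot_b, shot_b]
print(detect_cuts(frames))  # [3]
```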
|
484 |
3D Object Detection based on Unsupervised Depth Estimation
Manoharan, Shanmugapriyan, 25 January 2022 (links)
Estimating depth and detection of object instances in 3D space is fundamental in autonomous navigation, localization, and mapping, robotic object manipulation, and
augmented reality. RGB-D images and LiDAR point clouds are the most illustrative formats of depth information. However, depth sensors suffer from many shortcomings, such as low effective spatial resolution and capturing a scene from only a single perspective.
This thesis focuses on reconstructing a denser and more comprehensive 3D scene structure from monocular RGB images using depth estimation and 3D object detection.
The first contribution of this thesis is a pipeline for depth estimation based on an unsupervised learning framework. Two architectures are proposed to analyze structure-from-motion and 3D geometric constraint methods. The proposed architectures are trained and evaluated using only RGB images, without any ground-truth depth data, and achieve better results than state-of-the-art methods.
The second contribution of this thesis is the application of the estimated depth map, which includes two algorithms: point cloud generation and collision avoidance. The predicted depth map and the RGB image are used to generate point cloud data using the proposed point cloud algorithm. The collision avoidance algorithm predicts the possibility of a collision and provides a collision warning message by decoding the color in the estimated depth map. This algorithm design is adaptable to different color maps with slight changes and perceives collision information across a sequence of frames.
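The point-cloud generation step described above can be sketched under standard pinhole-camera assumptions (the intrinsics, function name, and toy scene are illustrative, not taken from the thesis): each pixel (u, v) with estimated depth Z back-projects to X = (u - cx) * Z / fx and Y = (v - cy) * Z / fy.

```python
# Back-project a depth map into a point cloud using pinhole intrinsics.
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an H x W depth map into an (H*W, 3) array of XYZ points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# A flat wall 2 m away, seen by a toy 4x4 camera.
depth = np.full((4, 4), 2.0)
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
print(cloud.shape)   # (16, 3)
print(cloud[0])      # pixel (0, 0): X, Y, Z = -1.5, -1.5, 2.0
```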
The third contribution is a two-stage pipeline to detect 3D objects from a monocular image. The first stage detects 2D objects and crops the corresponding image patches, which are provided as input to the second stage. In the second stage, a 3D regression network is trained to estimate 3D bounding boxes for the target objects; two architectures are proposed for this network. This approach achieves better average precision than the state of the art for truncation of up to 15% or fully visible objects, and lower but comparable results for truncation of more than 30% or partly/fully occluded objects.
|
485 |
3D Object Detection Using Virtual Environment Assisted Deep Network Training
Dale, Ashley S., 12 1900 (links)
Indiana University-Purdue University Indianapolis (IUPUI) / An RGBZ synthetic dataset consisting of five object classes in a variety of virtual environments and orientations was combined with a small sample of real-world
image data and used to train the Mask R-CNN (MR-CNN) architecture in a variety
of configurations. When the MR-CNN architecture was initialized with MS COCO
weights and the heads were trained with a mix of synthetic data and real world data,
F1 scores improved in four of the five classes: The average maximum F1-score of
all classes and all epochs for the networks trained with synthetic data is F1∗ = 0.91,
compared to F1 = 0.89 for the networks trained exclusively with real data, and the
standard deviation of the maximum mean F1-score for synthetically trained networks
is σ∗ = 0.015, compared to σ_F1 = 0.020 for the networks trained exclusively with real F1
data. Various backgrounds in synthetic data were shown to have negligible impact on F1 scores, opening the door to abstract backgrounds and minimizing the need for intensive synthetic data fabrication. When the MR-CNN architecture was initialized with MS COCO weights and depth data was included in the training data, the net- work was shown to rely heavily on the initial convolutional input to feed features into the network, the image depth channel was shown to influence mask generation, and the image color channels were shown to influence object classification. A set of latent variables for a subset of the synthetic datatset was generated with a Variational Autoencoder then analyzed using Principle Component Analysis and Uniform Manifold Projection and Approximation (UMAP). The UMAP analysis showed no meaningful distinction between real-world and synthetic data, and a small bias towards clustering based on image background.
|
486 |
Využití GPU pro algoritmy grafiky a zpracování obrazu / Exploitation of GPU in graphics and image processing algorithms
Jošth, Radovan, January 2015 (links)
This thesis describes several selected algorithms that were primarily developed for CPUs; however, given the high demand for their improvement, we decided to exploit them on GPGPUs (general-purpose graphics processing units). Modifying these algorithms was also the goal of our research, which was carried out using the CUDA interface. The thesis is organized according to the three groups of algorithms we addressed: real-time object detection, spectral image analysis, and real-time line detection. For the real-time object detection research we chose LRD and LRP features. The spectral image analysis research was carried out using PCA and NTF algorithms. For the real-time line detection research we used two different modifications of the accumulation scheme of the Hough transform. Before the part of the thesis devoted to the specific algorithms and the subject of research, the introductory chapters, immediately after the chapter explaining the motivation for studying the selected problems, give a brief overview of GPU and GPGPU architecture. The final chapters specify the author's own contribution, its focus, the achieved results, and the approach chosen to reach them. The results include several developed products.
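An illustrative CPU sketch (not the thesis implementation) of the Hough-transform accumulation scheme the thesis accelerates: each edge point votes for all (theta, rho) line parameterizations passing through it, and line detection reduces to finding accumulator maxima. On a GPU, the voting loop is what gets parallelized, e.g. one CUDA thread per edge point; the quantization choices below are assumptions.

```python
# Naive Hough-transform voting: accumulate (theta, rho) votes per edge point.
import math

def hough_lines(points, n_theta=180, rho_max=150):
    """Accumulate votes in (theta, rho) space; rho is quantized to integers."""
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            if abs(rho) <= rho_max:
                acc[(t, rho)] = acc.get((t, rho), 0) + 1
    return acc

# Points on the vertical line x = 10: the strongest accumulator cell is
# theta index 0 (theta = 0), rho = 10, with one vote per point.
points = [(10, y) for y in range(100)]
acc = hough_lines(points)
best = max(acc, key=acc.get)
print(best)  # (0, 10)
```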
|
487 |
Imaging and Object Detection under Extreme Lighting Conditions and Real World Adversarial Attacks
Xiangyu Qu (16385259), 22 June 2023 (links)
Imaging and computer vision systems deployed in real-world environments face the challenge of accommodating a wide range of lighting conditions. However, the cost, the demand for high resolution, and the miniaturization of imaging devices impose physical constraints on sensor design, limiting both the dynamic range and effective aperture size of each pixel. Consequently, conventional CMOS sensors fail to deliver satisfactory capture in high-dynamic-range scenes or under photon-limited conditions, thereby impacting the performance of downstream vision tasks. In this thesis, we address two key problems: 1) exploring the utilization of spatial multiplexing, specifically spatially varying exposure tiling, to extend sensor dynamic range and optimize scene capture, and 2) developing techniques to enhance the robustness of object detection systems under photon-limited conditions.
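A toy sketch of the spatially varying exposure (SVE) idea mentioned above, under assumptions not taken from the thesis: alternating pixels are captured with a short and a long exposure, and an HDR estimate is recovered by normalizing each pixel by its exposure gain while discarding saturated long-exposure pixels. The tile pattern, gains, and fill-in strategy are all illustrative.

```python
# Recover radiance from a single spatially varying exposure capture.
import numpy as np

def reconstruct_sve(raw, gains, sat=1.0):
    """raw: H x W capture; gains: H x W per-pixel exposure multipliers."""
    valid = raw < sat                       # saturated pixels carry no signal
    radiance = np.where(valid, raw / gains, 0.0)
    # For this sketch, fill saturated pixels with the mean of valid pixels
    # in the same row (a real method would interpolate more carefully).
    for i in range(raw.shape[0]):
        row_valid = valid[i]
        if row_valid.any():
            radiance[i, ~row_valid] = radiance[i, row_valid].mean()
    return radiance

# Checkerboard of short (gain 1) and long (gain 4) exposures of a flat scene
# with true radiance 0.3: the long-exposure pixels saturate at 1.0.
gains = np.indices((4, 4)).sum(axis=0) % 2 * 3 + 1   # values 1 or 4
raw = np.clip(0.3 * gains, 0, 1.0)
print(np.allclose(reconstruct_sve(raw, gains), 0.3))  # True
```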
In addition to challenges imposed by natural environments, real-world vision systems are susceptible to adversarial attacks in the form of artificially added digital content. Therefore, this thesis presents a comprehensive pipeline for constructing a robust and scalable system to counter such attacks.
|
488 |
Anchor-free object detection in surveillance applications
Magnusson, Peter, January 2023 (links)
Computer vision object detection is the task of detecting and identifying objects present in an image or a video sequence. Models based on artificial convolutional neural networks are commonly used as detector models. Object detection precision and inference efficiency are crucial for surveillance-based applications. A decrease in the detector model complexity, as well as in the complexity of the post-processing computations, promotes increased inference efficiency. Modern object detectors for surveillance applications usually make use of a regression algorithm and bounding box priors referred to as anchor boxes to compute bounding box proposals, and the proposal selection algorithm contributes to the computational cost at inference. In this study, an anchor-free, low-complexity deep learning detector model was implemented in a surveillance applications setting, and was evaluated and compared against a state-of-the-art anchor-based object detector as the baseline. A key-point-based detector model (CenterNet), predicting Gaussian-distribution-based object centers, was selected for the evaluation against the baseline. The surveillance-adapted anchor-free detector exhibited a factor of 2.4 lower complexity than the baseline detector. Further, a significant redistribution towards shorter post-processing times was demonstrated at inference for the anchor-free surveillance-adapted CenterNet detector, with modal post-processing times a factor of 0.6 of those of the baseline detector. Furthermore, the surveillance-adapted CenterNet model was shown to outperform the baseline in terms of detection precision for several classes relevant to surveillance applications and for objects of smaller spatial scale.
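The key-point decoding step used by CenterNet-style detectors, as described above, can be sketched in a few lines: object centers are local maxima of a predicted heatmap, so the anchor-box proposal selection is replaced by a cheap peak extraction. The window size and threshold below are illustrative assumptions, not the thesis configuration.

```python
# Extract confident local maxima (object centers) from a center heatmap
# using a 3x3 local-maximum test plus a confidence threshold.
import numpy as np

def extract_centers(heatmap, threshold=0.5):
    """Return (row, col, score) for confident local maxima of the heatmap."""
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, constant_values=-np.inf)
    centers = []
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 3, j:j + 3]   # 3x3 neighborhood of (i, j)
            if heatmap[i, j] >= threshold and heatmap[i, j] == window.max():
                centers.append((i, j, float(heatmap[i, j])))
    return centers

hm = np.zeros((5, 5))
hm[1, 1] = 0.9   # one confident object center
hm[3, 4] = 0.4   # below threshold: suppressed
print(extract_centers(hm))  # [(1, 1, 0.9)]
```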
|
489 |
Utveckling av stöd för synskadade med hjälp av AI och datorseende: Designprinciper för icke-visuella gränssnitt
Schill, William; Berngarn, Philip, January 2022 (links)
This study aims to examine and identify appropriate design principles for interactive systems with non-visual interfaces. By developing an aid for the visually impaired with the help of AI and computer vision, it is possible to identify and evaluate important design principles. Theories on interactive systems, design principles, AI and computer vision have been collected in order to develop an artifact and to understand existing design principles. Design Science Research Methodology has been used to develop an aid that can detect objects in real time. Through an iterative process, the method has identified and evaluated different requirements for the artifact, resulting in a design proposal.
In order to identify the requirements, qualitative data was collected from five people with a visual impairment through semi-structured interviews. Finally, the connection between the requirements identified in the interviews and existing design principles for interactive systems with graphical user interfaces is presented. A proposal for further research in the area is also discussed.
|
490 |
Optical Inspection for Soldering Fault Detection in a PCB Assembly using Convolutional Neural Networks
Bilal Akhtar, Muhammad, January 2019 (links)
Convolutional Neural Network (CNN) has been established as a powerful tool to automate various computer vision tasks without requiring any a priori knowledge. Printed Circuit Board (PCB) manufacturers want to improve their product quality by employing vision-based automatic optical inspection (AOI) systems in PCB assembly manufacturing. An AOI system employs classic computer vision and image processing techniques to detect various manufacturing faults in a PCB assembly. Recently, CNNs have been used successfully at various stages of automatic optical inspection. However, none has used the 2D image of a PCB assembly directly as input to a CNN. Currently, all available systems are specific to a PCB assembly and require many preprocessing steps or a complex illumination system to improve the accuracy. This master thesis attempts to design an effective soldering fault detection system using a CNN applied to images of a PCB assembly, with the Raspberry Pi PCB assembly as the case in point.
Soldering fault detection is treated as an object detection problem. YOLO (short for "You Only Look Once") is a state-of-the-art fast object detection CNN. Although it is designed for object detection in images from publicly available datasets, we use YOLO as a benchmark to define the performance metrics for the proposed CNN. Besides accuracy, the effectiveness of a trained CNN also depends on memory requirements and inference time. The accuracy of a CNN increases by adding a convolutional layer, at the expense of increased memory requirement and inference time. The prediction layer of the proposed CNN is inspired by the YOLO algorithm, while the feature extraction layer is customized to our application and is a combination of classical CNN components with residual connections, inception modules and bottleneck layers.
Experimental results show that state-of-the-art object detection algorithms are not efficient when used on a new and different dataset for object detection. Our proposed CNN detection algorithm predicts more accurately than the YOLO algorithm, with an increase in average precision of 3.0%, is less complex, requiring 50% fewer parameters, and infers in half the time taken by YOLO. The experimental results also show that a CNN can be an effective means of performing AOI (given that plenty of data is available for training the CNN).
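The comparison above is stated in terms of average precision (AP). As an illustrative aside (not the thesis evaluation code), AP can be computed from detections ranked by confidence as a sum of the precision values at the ranks where true positives occur, divided by the number of ground-truth objects; the toy numbers below are assumptions.

```python
# Interpolation-free average precision from a confidence-ranked hit list.

def average_precision(ranked_hits, n_positives):
    """ranked_hits: booleans for detections sorted by descending confidence
    (True = matched a ground-truth fault). n_positives: total ground truth."""
    ap, tp = 0.0, 0
    for rank, hit in enumerate(ranked_hits, start=1):
        if hit:
            tp += 1
            ap += tp / rank          # precision at this recall point
    return ap / n_positives

# 4 ground-truth faults; the detector finds 3 of them in its top 5 detections.
print(average_precision([True, True, False, True, False], 4))  # 0.6875
```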
|