41

Detecting small and fast objects using image processing techniques : A project study within sport analysis

Gustafsson, Simon, Persson, Andreas January 2021
This study put three object detection techniques to the test. The goal was to investigate small, fast-moving objects and determine which technique performs best for the sport of padel. The study aims to cover and explain the conditions that can improve or degrade the detection of small, fast objects. The three techniques take different approaches to detecting one or multiple objects and could serve as a guideline for future object detection development. The proposed techniques use background histogram calculation, HSV masking with edge detection, and DNN frameworks together with the COCO dataset. All techniques were tested on outdoor video footage to generate data, which indicates that Canny edge detection is a promising candidate for further research given its high detection rate. However, YOLO shows excellent potential for multiple object detection at very high confidence, providing reliable and accurate detection of a targeted object. The study concludes that, depending on the end purpose, both Canny and YOLO have potential for future small and fast object detection.
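As a rough illustration of the HSV masking step named in the abstract, the sketch below thresholds pixels in HSV space using only the Python standard library. The frame, the hue window, and the function name `hsv_mask` are hypothetical and are not taken from the thesis, which may well have used a different library and different thresholds.

```python
import colorsys

def hsv_mask(pixels, h_range, s_min, v_min):
    """Return a binary mask: 1 where an RGB pixel falls inside the HSV window."""
    mask = []
    for row in pixels:
        mask_row = []
        for r, g, b in row:
            # colorsys works on floats in the range 0..1 for all channels.
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            inside = h_range[0] <= h <= h_range[1] and s >= s_min and v >= v_min
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask

# Hypothetical 2x2 frame: one yellow-green "ball" pixel on a dark blue court.
frame = [
    [(20, 40, 120), (20, 40, 120)],
    [(200, 230, 40), (20, 40, 120)],
]
# The hue window and the saturation/value floors are assumed values.
ball_mask = hsv_mask(frame, h_range=(0.15, 0.25), s_min=0.5, v_min=0.5)
```

In a full pipeline the resulting mask would typically be passed on to an edge detector; that stage is left out here.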
42

Object and Anomaly Detection

Klarin, Kristofer, Larsson, Daniel January 2022
This project aims to contribute to the discussion regarding reproducibility of machine learning research. This is done by using the methods specified in the report Improving Reproducibility in Machine Learning Research [30] to select an appropriate object detection machine learning research paper for reproduction. Furthermore, this report explains fundamental concepts of object detection. The chosen machine learning research paper, You Only Look Once (YOLO) [40], is then explained, implemented and trained with various hyperparameters and pre-processing steps. While the reproduction did not achieve the results presented by the original machine learning paper, some key insights were established. Firstly, the results of the project demonstrate the importance of pretraining. Secondly, the checklist provided by NeurIPS [30] should be adjusted such that it is applicable in more situations.
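One concept that usually appears among the fundamentals of object detection is intersection over union (IoU), the overlap score between a predicted and a reference box. The abstract does not say which concepts the report covers, so the sketch below is only an assumed example; the `(x1, y1, x2, y2)` box layout is likewise an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 2x2 boxes that share a 1x1 corner: 1 / (4 + 4 - 1).
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```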
43

A Study on Fault Tolerance of Object Detector Implemented on FPGA / En studie om feltolerans för objektdetektor Implementerad på FPGA

Yang, Tiancheng January 2023
Object detection has attracted great research interest in recent years, as it serves as the eyes of machines and is a fundamental task in computer vision that aims at identifying and locating objects of interest. Hardware accelerators usually aim at boosting throughput to meet real-time requirements while lowering power consumption. Studies on fault tolerance ensure that the algorithm is performed correctly even in the presence of errors. This thesis covers these topics and provides a Field-Programmable Gate Array (FPGA) implementation of an object detection algorithm, You Only Look Once (YOLO), while investigating the fault tolerance of the implementation. A baseline implementation on FPGA is provided first, and then two fault-tolerant implementations, one with triple-modular redundancy and one with time redundancy, are applied, implemented, and tested. Stuck-at faults are injected into the implementations to study the fault tolerance. Our FPGA implementation of YOLO provides a high-speed, low-power, and highly configurable hardware accelerator for object detection. In this thesis, the design combines self-designed VHDL modules with Xilinx-provided Intellectual Property (IP). Compared to other research or open-source versions using High-Level Synthesis (HLS), this design is more configurable for future reference and removes unnecessary hardware black boxes. Compared to other studies on hardware accelerators, this thesis focuses on fault tolerance. This thesis creates space for more work on exploring fault tolerance, e.g., creating a more fault-tolerant implementation or investigating how certain faults could affect the result. It is also possible to use the implementation from this thesis as a baseline for other research purposes, as the implementation is stand-alone and highly configurable.
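The idea behind triple-modular redundancy with injected stuck-at faults can be sketched in software as follows. This is a simplified model assuming integer-valued results and a single faulty replica; the thesis itself works in VHDL on an FPGA, and the helper names `stuck_at` and `tmr_vote` are hypothetical.

```python
def stuck_at(value, bit, stuck_value):
    """Force one bit of an integer to 0 or 1, modelling a stuck-at fault."""
    if stuck_value:
        return value | (1 << bit)
    return value & ~(1 << bit)

def tmr_vote(a, b, c):
    """Bitwise majority vote over three redundant results."""
    return (a & b) | (a & c) | (b & c)

golden = 0b1011_0010                               # fault-free result
faulty = stuck_at(golden, bit=0, stuck_value=1)    # one replica is corrupted
voted = tmr_vote(golden, faulty, golden)           # the voter masks the fault
```

A voter of this kind masks a fault in any one replica, but not identical faults in two replicas at the same bit position.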
44

Object Based Image Retrieval Using Feature Maps of a YOLOv5 Network / Objektbaserad bildhämtning med hjälp av feature maps från ett YOLOv5-nätverk

Essinger, Hugo, Kivelä, Alexander January 2022
As Machine Learning (ML) methods have gained traction in recent years, some problems regarding the construction of such methods have arisen. One such problem is the collection and labeling of data sets. Specifically, many applications of Computer Vision (CV) need a set of images, each labeled as either belonging to some class or not. Creating such data sets can be very time consuming. This project sets out to tackle this problem by constructing an end-to-end system for searching for objects in images (i.e. an Object Based Image Retrieval (OBIR) method) using an object detection framework (You Only Look Once (YOLO) [16]). The goal of the project was to create a method that, given an image q of an object of interest, searches for the same or similar objects in a set of other images S. The core concept is to pass the image q through an object detection model (in this case YOLOv5 [16]), create a "fingerprint" (which can be seen as a sort of identity for an object) from a set of feature maps extracted from the YOLOv5 [16] model, and look for corresponding similar parts of a set of feature maps extracted from other images. An investigation regarding which values to select for a few different parameters was conducted, including a comparison of performance for a couple of different similarity metrics. The parameter combination that resulted in the highest F_Top_300 score (a measure indicating the amount of relevant images retrieved among the top 300 recommended images) in the parameter selection phase was:

Layer: 23
Pool method: max
Similarity metric: euc
FP kernel size: 4

Evaluation of the method with these parameters resulted in the following F_Top_300 scores:

Mouse: 0.820
Duck: 0.640
Coin: 0.770
Jet ski: 0.443
Handgun: 0.807
Average: 0.696
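The reported parameter choice (max pooling and a Euclidean similarity metric) suggests a comparison along the following lines. This is a heavily simplified sketch: it flattens a feature map to a 1D list, and the function names, the activations, and the candidate fingerprints are all hypothetical rather than taken from the thesis.

```python
import math

def max_pool_1d(values, kernel):
    """Non-overlapping max pooling over a flat list of activations."""
    return [max(values[i:i + kernel]) for i in range(0, len(values), kernel)]

def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_by_similarity(query_fp, candidates):
    """Sort candidate names by distance to the query fingerprint, nearest first."""
    return sorted(candidates, key=lambda name: euclidean(query_fp, candidates[name]))

# Hypothetical activations for the query image q, pooled with kernel size 4.
query_fp = max_pool_1d([0.1, 0.9, 0.3, 0.2, 0.5, 0.4, 0.8, 0.1], kernel=4)

# Hypothetical fingerprints already extracted from two images in the set S.
candidates = {"img_a": [0.9, 0.7], "img_b": [0.2, 0.1]}
ranking = rank_by_similarity(query_fp, candidates)
```

Taking the first 300 entries of such a ranking would give the list of recommended images that the F_Top_300 score is computed over.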
45

Visual tracking systém pro UAV

KOLÁŘ, Michal January 2018
This master's thesis analyzes the current possibilities for tracking objects in images and, based on that analysis, designs a procedure for creating a system capable of tracking an object of interest. Part of the work is the design of a virtual reality environment for implementing the tracking system, which is finally deployed and tested on a real prototype of an unmanned vehicle.
46

You Only Gesture Once (YouGo): American Sign Language Translation using YOLOv3

Mehul Nanda (8786558) 01 May 2020
The study focused on creating and proposing a model that could accurately and precisely predict the occurrence of an American Sign Language (ASL) gesture for a letter of the English alphabet using the You Only Look Once (YOLOv3) algorithm. The training dataset used for this study was custom created and was further divided into clusters based on the uniqueness of the ASL sign. Three diverse clusters were created. Each cluster was trained with the network known as darknet. Testing was conducted using images and videos for fully trained models of each cluster, and the Average Precision for each letter in each cluster and the Mean Average Precision for each cluster were noted. In addition, a Word Builder script was created. This script combined the trained models of all three clusters to create a comprehensive system that builds words when the trained models are supplied with images of letters of the English alphabet as depicted in ASL.
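One way a word-building step could combine three cluster models is to keep, for each input image, the letter predicted with the highest confidence. The abstract does not describe how the Word Builder script actually merges the models, so the rule, the data layout, and the function names below are assumptions for illustration only.

```python
def best_letter(predictions):
    """Pick the (letter, confidence) pair with the highest confidence."""
    return max(predictions, key=lambda p: p[1])

def build_word(frames):
    """Join the most confident letter of each frame into a word."""
    return "".join(best_letter(preds)[0] for preds in frames)

# Hypothetical output: one (letter, confidence) pair per cluster model per image.
frames = [
    [("H", 0.91), ("N", 0.40), ("M", 0.10)],
    [("A", 0.30), ("I", 0.88), ("E", 0.12)],
]
word = build_word(frames)
```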
47

VISUAL DETECTION OF PERSONAL PROTECTIVE EQUIPMENT & SAFETY GEAR ON INDUSTRY WORKERS

Strand, Fredrik, Karlsson, Jonathan January 2022
Workplace injuries are common in today's society because safety equipment is not worn or is worn incorrectly. A system that only admits appropriately equipped personnel can be created to improve working conditions and worker safety. The goal is thus to develop a robust system that will improve construction workers' safety. Building such a system necessitates computer vision, which entails object recognition, facial recognition, and human recognition, among other things. The basic idea is first to detect the human and remove the background to speed up the process and avoid potential interference. After that, the cropped image is subjected to facial and object recognition. The code is written in Python and includes libraries such as OpenCV, face_recognition, and CVZone. The algorithms chosen include YOLOv4 and Histogram of Oriented Gradients. The results were measured at distances of three and five meters. The system's pipeline, algorithms, and software achieved a mean average precision across all classes of 99% at three meters and 89% at five meters. At both distances, the model achieved a precision rate of 100%. The recall rates were 96% - 100% at 3 m and 54% - 100% at 5 m. Finally, the frame rate was measured at 1.2 fps on a system without a GPU.
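Precision and recall are computed from counts of true positives, false positives, and false negatives, which is how a detector can reach 100% precision while its recall is much lower. The counts in the sketch below are hypothetical and chosen only to reproduce that pattern; they are not the thesis's data.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical counts: no false alarms, but 23 of 50 items were missed.
precision, recall = precision_recall(tp=27, fp=0, fn=23)
```

With these counts every reported detection is correct, yet only 27 of the 50 items present are found.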
48

3D YOLO: End-to-End 3D Object Detection Using Point Clouds / 3D YOLO: Objektdetektering i 3D med LiDAR-data

Al Hakim, Ezeddin January 2018
For safe and reliable driving, it is essential that an autonomous vehicle can accurately perceive the surrounding environment. Modern sensor technologies used for perception, such as LiDAR and RADAR, deliver a large set of 3D measurement points known as a point cloud. There is a great need to interpret the point cloud data to detect other road users, such as vehicles and pedestrians. Many research studies have proposed image-based models for 2D object detection. This thesis takes it a step further and aims to develop a LiDAR-based 3D object detection model that operates in real time, with emphasis on autonomous driving scenarios. We propose 3D YOLO, an extension of YOLO (You Only Look Once), which is one of the fastest state-of-the-art 2D object detectors for images. The proposed model takes point cloud data as input and outputs 3D bounding boxes with class scores in real time. Most existing 3D object detectors use hand-crafted features, while our model follows an end-to-end learning approach, which removes manual feature engineering. The 3D YOLO pipeline consists of two networks: (a) the Feature Learning Network, an artificial neural network that transforms the input point cloud to a new feature space; (b) 3DNet, a novel convolutional neural network architecture based on YOLO that learns the shape description of the objects. Our experiments on the KITTI dataset show that 3D YOLO has high accuracy and outperforms the state-of-the-art LiDAR-based models in efficiency. This makes it a suitable candidate for deployment in autonomous vehicles.
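To make the input and output of such a model concrete, the sketch below represents a point cloud as a list of (x, y, z) tuples and selects the points that fall inside one axis-aligned 3D box. The axis-aligned box format and all coordinates are assumptions for illustration; the boxes produced by 3D YOLO may be parameterized differently, for example with a heading angle.

```python
def points_in_box(points, box_min, box_max):
    """Return the points lying inside an axis-aligned 3D box (bounds inclusive)."""
    return [
        p for p in points
        if all(lo <= c <= hi for c, lo, hi in zip(p, box_min, box_max))
    ]

# Hypothetical point cloud in metres and one hypothetical detected box.
cloud = [(1.0, 2.0, 0.5), (10.0, 0.0, 0.0), (1.5, 2.5, 1.0)]
inside = points_in_box(cloud, box_min=(0.0, 1.0, 0.0), box_max=(2.0, 3.0, 2.0))
```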
49

Implementation of Bolt Detection and Visual-Inertial Localization Algorithm for Tightening Tool on SoC FPGA / Implementering av bultdetektering och visuell tröghetslokaliseringsalgoritm för åtdragningsverktyg på SoC FPGA

Al Hafiz, Muhammad Ihsan January 2023
With the emergence of Industry 4.0, there is a pronounced emphasis on the need for greater flexibility in assembly processes. In the domain of bolt tightening, this transition is evident. Tools are now required to handle a variety of bolts and unpredictable tightening methodologies. Each bolt has distinct tightening parameters and requires a specific sequence to prevent issues like bolt cross-talk or unbalanced force. This thesis introduces an approach that integrates advanced computing techniques with machine learning to address these challenges in bolt tightening. The primary objective is to offer edge computation for bolt detection and precise localization of the tightening tool. This is realized by leveraging visual-inertial data, all encapsulated within a System-on-Chip (SoC) Field Programmable Gate Array (FPGA). The chosen approach combines visual information and motion detection, enabling the tool to be localized quickly and precisely. All the computing is done inside the SoC FPGA. The key element for identifying different bolts is the YOLOv3-Tiny-3L model, run using the Deep-learning Processor Unit (DPU) that is implemented in the FPGA. In parallel, the thesis employs the Error-State Extended Kalman Filter (ESEKF) algorithm to fuse the visual and motion data effectively. The ESEKF is accelerated via a full implementation in Register Transfer Level (RTL) in the FPGA fabric. The empirical results show that the visual-inertial localization had a position Root Mean Square Error (RMSE) of 39.69 mm with a standard deviation of 9.9 mm. The orientation estimate has a mean error of 4.8 degrees with a standard deviation of 5.39 degrees. Notably, the entire computational process, from the initial bolt detection to its final localization, is executed in 113.1 milliseconds. This thesis demonstrates the feasibility of executing bolt detection and visual-inertial localization using edge computing within the SoC FPGA framework. The computation is significantly streamlined by harnessing the adaptability of programmable logic within the FPGA. This development is a step towards a more adaptable and error-resistant bolt-tightening procedure in industrial settings.
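The abstract reports both an RMSE and a standard deviation for the position error. The sketch below shows how the two relate for a list of errors, using the population standard deviation from the Python standard library. The error values are hypothetical, and the abstract does not say which variant of the standard deviation the thesis uses.

```python
import math
import statistics

def rmse(errors):
    """Root mean square of a list of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical position errors in millimetres.
errors_mm = [3.0, 4.0, 3.0, 4.0]

position_rmse = rmse(errors_mm)
mean_error = statistics.fmean(errors_mm)
std_error = statistics.pstdev(errors_mm)   # population standard deviation
```

For the population standard deviation, the squared RMSE equals the squared mean error plus the variance, so an RMSE far above the standard deviation points to a large mean error.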
50

Object Detection in Domain Specific Stereo-Analysed Satellite Images

Grahn, Fredrik, Nilsson, Kristian January 2019
Given satellite images with accompanying pixel classifications and elevation data, we propose different solutions to object detection. The first method uses hierarchical clustering for segmentation and then employs different methods of classification. One of these classification methods used domain knowledge to classify objects, while the other used Support Vector Machines. Additionally, a combination of three Support Vector Machines was used in a hierarchical structure, which outperformed the regular Support Vector Machine method in most of the evaluation metrics. The second approach is more conventional and uses different types of Convolutional Neural Networks. A segmentation network was used, as well as a few detection networks and different fusions between these. The Convolutional Neural Network approach proved to be the better of the two in terms of precision and recall, but the clustering approach was not far behind. This work was done using a relatively small amount of data, which could have negatively affected the results of the Machine Learning models.
