About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

You Only Gesture Once (YouGo): American Sign Language Translation using YOLOv3

Mehul Nanda (8786558) 01 May 2020
The study focused on creating and proposing a model that could accurately and precisely predict the occurrence of an American Sign Language gesture for an alphabet in the English language, using the You Only Look Once (YOLOv3) algorithm. The training dataset used for this study was custom created and further divided into clusters based on the uniqueness of the ASL sign. Three diverse clusters were created. Each cluster was trained with the network known as Darknet. Testing was conducted using images and videos for the fully trained models of each cluster, and the Average Precision for each alphabet in each cluster and the Mean Average Precision for each cluster were noted. In addition, a Word Builder script was created. This script combined the trained models of all three clusters into a comprehensive system that forms words when the trained models are supplied with images of English-language alphabets as depicted in ASL.
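For illustration, a minimal sketch of the Word Builder idea: run a letter detector on each image and concatenate the highest-confidence letters. The `detector` callable and its return format are assumptions; the abstract does not publish the models' API.

```python
def build_word(image_paths, detector, min_conf=0.5):
    """Concatenate the best letter detection from each image into a word.

    detector(path) is a stand-in for the three trained cluster models and
    must return a list of (letter, confidence) candidates."""
    word = []
    for path in image_paths:
        candidates = [(l, c) for l, c in detector(path) if c >= min_conf]
        if candidates:
            letter, _ = max(candidates, key=lambda lc: lc[1])
            word.append(letter)
    return "".join(word)

# Example with a dummy detector:
# build_word(["a.jpg", "s.jpg", "l.jpg"], lambda p: [(p[0].upper(), 0.9)])  # -> "ASL"
```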
52

VISUAL DETECTION OF PERSONAL PROTECTIVE EQUIPMENT & SAFETY GEAR ON INDUSTRY WORKERS

Strand, Fredrik, Karlsson, Jonathan January 2022
Workplace injuries are common in today's society due to a lack of adequately worn safety equipment. A system that only admits appropriately equipped personnel can be created to improve working conditions and worker safety. The goal is thus to develop a system that improves construction workers' safety. Building such a system requires computer vision, which entails object recognition, facial recognition, and human recognition, among other things. The basic idea is to first detect the human and remove the background, to speed up the process and avoid potential interference. After that, the cropped image is subjected to facial and object recognition. The code is written in Python and uses libraries such as OpenCV, face_recognition, and CVZone. Among the algorithms chosen were YOLOv4 and Histogram of Oriented Gradients. Results were measured at distances of three and five meters. With the system's pipeline, algorithms, and software, a mean average precision of 99% and 89% was achieved at the respective distances. At both three and five meters, the model achieved a precision rate of 100%. The recall rates were 96-100% at 3 m and 54-100% at 5 m. Finally, the frame rate was measured at 1.2 fps on a system without a GPU.
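As a hedged sketch of the detect-then-crop step described above, the following uses OpenCV's built-in HOG person detector; the thesis's actual parameters and model files are not given in the abstract, so the settings here are illustrative defaults.

```python
import cv2

# People detector: OpenCV's stock HOG descriptor with its default person SVM.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def crop_people(image):
    """Detect people and return cropped regions for downstream
    face_recognition and PPE (e.g., YOLOv4) models."""
    rects, _weights = hog.detectMultiScale(image, winStride=(8, 8), padding=(8, 8))
    return [image[y:y + h, x:x + w] for (x, y, w, h) in rects]

# crops = crop_people(cv2.imread("worker.jpg"))
```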
53

3D YOLO: End-to-End 3D Object Detection Using Point Clouds

Al Hakim, Ezeddin January 2018
For safe and reliable driving, it is essential that an autonomous vehicle can accurately perceive the surrounding environment. Modern sensor technologies used for perception, such as LiDAR and RADAR, deliver a large set of 3D measurement points known as a point cloud. There is a huge need to interpret point cloud data to detect other road users, such as vehicles and pedestrians. Many research studies have proposed image-based models for 2D object detection. This thesis takes it a step further and aims to develop a LiDAR-based 3D object detection model that operates in real time, with emphasis on autonomous driving scenarios. We propose 3D YOLO, an extension of YOLO (You Only Look Once), one of the fastest state-of-the-art 2D object detectors for images. The proposed model takes point cloud data as input and outputs 3D bounding boxes with class scores in real time. Most existing 3D object detectors use hand-crafted features, while our model follows the end-to-end learning fashion, which removes manual feature engineering. The 3D YOLO pipeline consists of two networks: (a) the Feature Learning Network, an artificial neural network that transforms the input point cloud into a new feature space; and (b) 3DNet, a novel convolutional neural network architecture based on YOLO that learns the shape description of the objects. Our experiments on the KITTI dataset show that 3D YOLO achieves high accuracy and outperforms the state-of-the-art LiDAR-based models in efficiency. This makes it a suitable candidate for deployment in autonomous vehicles.
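As a rough sketch of the first stage, the following groups raw LiDAR points into voxels, the kind of preprocessing a feature learning network of this sort consumes; the grid ranges and voxel sizes are illustrative, not the thesis's values.

```python
import numpy as np

def voxelize(points, voxel_size=(0.2, 0.2, 0.4),
             limits=((0, 70), (-40, 40), (-3, 1))):
    """Group an (N, 3) array of LiDAR points into a sparse voxel grid."""
    mins = np.array([lo for lo, _ in limits], dtype=float)
    maxs = np.array([hi for _, hi in limits], dtype=float)
    pts = points[np.all((points >= mins) & (points < maxs), axis=1)]
    idx = np.floor((pts - mins) / np.asarray(voxel_size)).astype(np.int64)
    voxels = {}
    for coord, p in zip(map(tuple, idx), pts):
        voxels.setdefault(coord, []).append(p)
    return voxels  # each non-empty voxel is then fed to the feature network
```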
54

A Multi-Fidelity Approach to Testing and Evaluation of AI-Enabled Systems

Robert Joseph Seif (19206790) 27 July 2024
Approaches to system testing and evaluation (T&E) are becoming increasingly relevant as artificial intelligence (AI)/machine learning (ML) technology expands across the industry's current landscape. As the AI/ML landscape continues to develop, greater amounts of data are required to build the next generation of technology. Multiple communities have worked to create frameworks for interacting with data at such scales, yet a gap persists in the ability to use data generated throughout the development process to support a T&E program. The objective of this thesis is to address this gap through a multi-fidelity approach to the test and evaluation of AI-enabled systems. This approach is constructed using a space of models to visualize similarities and differences between individual models. Once requirements, and the potential tests that models can be employed to fulfill, are organized, a method for sequentially selecting models for testing is applied. Models are selected to maximize utility, which depends on model performance and cost to the T&E team. Experimentation was conducted on the case of an autonomous vehicle (AV) perception system, where models were constructed using a simulation of the Purdue University campus for AVs to drive around. Results show that the proposed approach, when paired with Bayesian optimization for sequential test selection through an expected-improvement acquisition function, can effectively select models in a manner that minimizes uncertainty and cost for the test team. Through computational experiments, the proposed approach can be used to develop test combinations that minimize cost while maximizing both utility and the information a T&E team has on how well a system can meet a set of testing requirements in operational conditions.
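The abstract names an expected-improvement acquisition function for sequential test selection; a standard implementation under a Gaussian posterior is sketched below. The thesis's actual utility (trading off performance against cost) is not reproduced here.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_f, xi=0.01):
    """EI for maximization: E[max(f - best_f - xi, 0)] under N(mu, sigma^2).

    mu, sigma: posterior mean/std at candidate test points (arrays);
    best_f: best objective value observed so far; xi: exploration margin."""
    sigma = np.maximum(np.asarray(sigma, dtype=float), 1e-12)
    z = (mu - best_f - xi) / sigma
    return (mu - best_f - xi) * norm.cdf(z) + sigma * norm.pdf(z)
```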
55

Implementation of Bolt Detection and Visual-Inertial Localization Algorithm for Tightening Tool on SoC FPGA

Al Hafiz, Muhammad Ihsan January 2023
With the emergence of Industry 4.0, there is a pronounced emphasis on the need for greater flexibility in assembly processes. In the domain of bolt tightening, this transition is evident: tools are now required to handle a variety of bolts and unpredictable tightening methodologies. Each bolt, with its own tightening parameters, requires a specific sequence to prevent issues such as bolt cross-talk or unbalanced force. This thesis introduces an approach that integrates advanced computing techniques with machine learning to address these challenges in bolt tightening. The primary objective is to offer edge computation for bolt detection and precise localization of the tightening tool, realized by leveraging visual-inertial data, all encapsulated within a System-on-Chip (SoC) Field-Programmable Gate Array (FPGA). The chosen approach combines visual information and motion data, enabling the tool to be localized quickly and precisely; all the computing is done inside the SoC FPGA. The key element for identifying different bolts is the YOLOv3-Tiny-3L model, run on the Deep-learning Processor Unit (DPU) implemented in the FPGA. In parallel, the thesis employs the Error-State Extended Kalman Filter (ESEKF) algorithm to fuse the visual and motion data effectively. The ESEKF is accelerated via a full Register Transfer Level (RTL) implementation in the FPGA fabric. The empirical results show that the visual-inertial localization exhibited a position Root Mean Square Error (RMSE) of 39.69 mm with a standard deviation of 9.9 mm, and orientation was determined with a mean error of 4.8 degrees and a standard deviation of 5.39 degrees. Notably, the entire computational process, from initial bolt detection to final localization, executes in 113.1 milliseconds. This thesis demonstrates the feasibility of performing bolt detection and visual-inertial localization with edge computing within the SoC FPGA framework. The computation path is significantly streamlined by harnessing the adaptability of the FPGA's programmable logic, a step towards a more adaptable and error-resistant bolt-tightening procedure in industrial settings.
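The fusion step can be illustrated with a plain linear Kalman filter predict/update pair; the thesis's error-state formulation and its RTL implementation are considerably more involved, so treat this only as a sketch of the idea.

```python
import numpy as np

def kf_predict(x, P, F, Q):
    """Propagate the state with the (IMU-driven) motion model F."""
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    """Correct the state with a (vision-derived) measurement z."""
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```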
56

Object Detection in Domain Specific Stereo-Analysed Satellite Images

Grahn, Fredrik, Nilsson, Kristian January 2019
Given satellite images with accompanying pixel classifications and elevation data, we propose different solutions to object detection. The first method uses hierarchical clustering for segmentation and then employs different methods of classification: one uses domain knowledge to classify objects, while the other uses Support Vector Machines. Additionally, a combination of three Support Vector Machines was used in a hierarchical structure, which outperformed the regular Support Vector Machine method on most of the evaluation metrics. The second approach is more conventional, using different types of Convolutional Neural Networks: a segmentation network as well as a few detection networks, and different fusions between these. The Convolutional Neural Network approach proved to be the better of the two in terms of precision and recall, but the clustering approach was not far behind. This work was done using a relatively small amount of data, which could have negatively impacted the results of the machine learning models.
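A hedged sketch of the hierarchical Support Vector Machine idea, with a root SVM routing samples to one of two leaf SVMs; the actual grouping of classes and kernel settings in the thesis are not specified in the abstract.

```python
import numpy as np
from sklearn.svm import SVC

class HierarchicalSVM:
    def __init__(self, group_of):        # group_of: class label -> 0 or 1
        self.group_of = group_of
        self.root = SVC()                # decides which leaf handles a sample
        self.leaves = {0: SVC(), 1: SVC()}

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        g = np.array([self.group_of[c] for c in y])
        self.root.fit(X, g)
        for gid, clf in self.leaves.items():
            clf.fit(X[g == gid], y[g == gid])
        return self

    def predict(self, X):
        X = np.asarray(X)
        g = self.root.predict(X)
        out = np.empty(len(X), dtype=object)
        for gid, clf in self.leaves.items():
            mask = g == gid
            if mask.any():
                out[mask] = clf.predict(X[mask])
        return out
```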
57

System for People Detection and Localization Using Thermal Imaging Cameras

Charvát, Michal January 2020
In today's world there is an ever-increasing demand for reliable automated mechanisms for detecting and localizing people, for purposes ranging from analyzing visitor movement in museums, through controlling smart homes, to guarding dangerous areas such as railway station platforms. We present a method for detecting and localizing people using low-cost FLIR Lepton 3.5 thermal cameras and small Raspberry Pi 3B+ computers. This project, which follows up on the earlier bachelor's project "Detecting people in a room using a low-cost thermal camera", newly supports modeling complex scenes with polygonal boundaries and multiple thermal cameras. In this work we present an improved control and capture library for the Lepton 3.5 camera; a new people-detection technique using the state-of-the-art YOLO (You Only Look Once) real-time object detector based on deep neural networks; a new automatically configurable thermal unit protected by a 3D-printed enclosure for safe handling; and, last but not least, a detailed guide for installing the detection system in a new environment, together with further supporting tools and improvements. We demonstrate the results of the new system on an example analysis of the movement of people in the National Museum in Prague.
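As a small sketch of the thermal preprocessing such a system needs, the following contrast-stretches a raw 16-bit Lepton frame to the 8-bit, 3-channel input a YOLO detector expects; the camera capture API itself is not shown here.

```python
import cv2
import numpy as np

def thermal_to_bgr(raw16):
    """Normalize a 16-bit thermal frame to 8-bit and replicate to 3 channels."""
    frame = cv2.normalize(raw16, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)

# Example with a synthetic frame (the Lepton 3.5 resolution is 160x120):
# bgr = thermal_to_bgr(np.random.randint(0, 65535, (120, 160), dtype=np.uint16))
```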
58

Detection of Traffic Signs and Lights

Oškera, Jan January 2020
The thesis focuses on modern methods for detecting traffic signs and traffic lights, both directly in traffic and in retrospective analysis. The main subject is convolutional neural networks (CNNs); the solution uses YOLO-type convolutional neural networks. The main goal of this thesis is to optimize the speed and accuracy of the models as far as possible. Suitable datasets are examined, and a number of them, composed of both real and synthetic data, are used for training and testing. The data were preprocessed using the Yolo_mark tool. Model training was carried out at a computer center belonging to the virtual organization MetaCentrum VO. To quantify detector quality, a program was created that reports the detector's success statistically and graphically, using ROC curves and the COCO evaluation protocol. The resulting model achieved an average success rate of up to 81%. The thesis also identifies the best choice of threshold across model versions, input sizes, and IoU thresholds. An extension for mobile phones using TensorFlow Lite and Flutter has also been created.
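Since the evaluation hinges on Intersection over Union thresholds, a reference IoU computation for two axis-aligned boxes is sketched below (a generic definition, not code from the thesis).

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # intersection height
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```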
59

Uncertainty Estimation and Confidence Calibration in YOLO5Face

Savinainen, Oskar January 2024
This thesis investigates predicting the Intersection over Union (IoU) of detections made by the face detector YOLO5Face, with the aim of using the predicted IoU as a new uncertainty measure. Detections are made on the WIDER FACE dataset, and the IoU is predicted by adding a parallel head to the existing YOLO5Face architecture. Experiments show that the methodology for predicting the IoU used in this thesis does not work: the parallel prediction head fails to predict the IoU and instead resorts to predicting common IoU values. The localisation and classification confidences of YOLO5Face are then investigated to determine which confidence measure is least uncertain and most suitable for identifying faces. Experiments show that the localisation confidence is consistently better calibrated than the classification confidence. Calibrating the classification confidence with respect to the localisation confidence reduces its Expected Calibration Error (ECE) from 0.17 to 0.01.
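The ECE figures quoted above follow the standard binned definition, sketched here; the bin count and binning scheme are assumptions, as the abstract does not state them.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - mean confidence| over confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap   # bin weight times calibration gap
    return ece
```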
60

From Pixels to Predators: Wildlife Monitoring with Machine Learning

Eriksson, Max January 2024
This master's thesis investigates the application of advanced machine learning models to the identification and classification of Swedish predators in camera-trap images. With growing threats to biodiversity, there is an urgent need for innovative, non-intrusive monitoring techniques. The study focuses on the development and evaluation of object detection models, including YOLOv5, YOLOv8, YOLOv9, and Faster R-CNN, aiming to enhance the surveillance of Swedish predator species such as bears, wolves, lynxes, foxes, and wolverines. The research leverages a dataset from the NINA database, applying data preprocessing and augmentation techniques to ensure robust model training. The models were trained and evaluated using various dataset sizes and conditions, including day and night images. Notably, YOLOv8 and YOLOv9 underwent extended training for 300 epochs, leading to significant improvements in performance metrics. Model performance was evaluated using metrics such as mean Average Precision (mAP), precision, recall, and F1-score. YOLOv9, with its Programmable Gradient Information (PGI) and GELAN architecture, demonstrated superior accuracy and reliability, achieving an F1-score of 0.98 on the expanded dataset. Training models on day and night images jointly rather than separately resulted in only minor differences in performance, although models trained exclusively on daytime images performed slightly better, owing to more consistent and favorable lighting conditions. The study also revealed a positive correlation between training-set size and model performance, with larger datasets yielding better results across all metrics; the marginal gains decreased as the dataset size grew, suggesting diminishing returns. Among the species studied, foxes were the least challenging for the models to detect and identify, while wolves presented greater challenges, likely because their complex fur patterns and coloration blend with the background.
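For reference, the per-class metrics reported above derive from true/false positives and false negatives as follows (a generic definition, not code from the thesis):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard detection metrics from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```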
