51 |
A Study on Fault Tolerance of Object Detector Implemented on FPGA. Yang, Tiancheng. January 2023.
Object detection has attracted great research interest in recent years: it serves as the eyes of machines and is a fundamental task in computer vision, aiming to identify and locate objects of interest. Hardware accelerators typically aim to boost throughput to meet real-time requirements while lowering power consumption. Studies on fault tolerance ensure that an algorithm performs correctly even in the presence of errors. This thesis covers these topics and provides a Field-Programmable Gate Array (FPGA) implementation of an object detection algorithm, You Only Look Once (YOLO), while investigating the fault tolerance of the implementation. A baseline implementation on FPGA is provided first; then two fault-tolerant implementations, one with triple-modular redundancy and one with time redundancy, are applied, implemented, and tested. Stuck-at faults are injected into the implementations to study their fault tolerance. Our FPGA implementation of YOLO provides a high-speed, low-power, and highly configurable hardware accelerator for object detection. In this thesis, the implementation is designed as a combination of self-designed VHDL modules and Xilinx-provided Intellectual Property (IP). Compared to other research or open-source versions using High-Level Synthesis (HLS), this design is more configurable for future reference and removes unnecessary hardware black boxes. Compared to other studies on hardware accelerators, this thesis focuses on fault tolerance. This thesis opens space for more work exploring fault tolerance, e.g., creating a more fault-tolerant implementation or investigating how specific faults affect the result. The implementation can also serve as a baseline for other research, as it is stand-alone and highly configurable.
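To make the redundancy schemes concrete, the following is a minimal Python sketch of the triple-modular-redundancy idea under an injected stuck-at fault; the `module` computation, fault position, and 8-bit width are invented stand-ins, not the thesis's VHDL design.

```python
def stuck_at(value: int, bit: int, stuck: int) -> int:
    """Force one bit of `value` to 0 or 1, mimicking a stuck-at fault."""
    mask = 1 << bit
    return (value | mask) if stuck else (value & ~mask)

def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote across three redundant module outputs."""
    return (a & b) | (a & c) | (b & c)

def module(x: int) -> int:
    """Stand-in for one replicated hardware module (illustrative only)."""
    return (x * 3 + 7) & 0xFF

good = module(42)                          # two healthy replicas agree
faulty = stuck_at(good, bit=4, stuck=1)    # third replica has a stuck-at-1 bit
assert tmr_vote(good, good, faulty) == good
print("voted output masks the fault:", tmr_vote(good, good, faulty))
```

Time redundancy repeats the computation on the same hardware over time and votes on the repeated results in the same way, trading latency instead of area.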
|
52 |
Study of the problem of classification of crushed stone fractions based on neural networks (master's thesis). Dyuzhev, A. K. January 2024.
Topic of the work: study of the problem of classifying crushed stone fractions based on neural networks. Relevance: developing a neural network model to classify the fractions of crushed stone hauled out of a quarry is motivated by the need to automate this process in order to improve the quality and speed of analysis and to reduce the load on the operator. The object of the study is the problem of classifying digital images of crushed stone fractions in the bed of a truck. The subject of the study is neural network architectures for detecting and classifying images using computer vision methods. Objective: to study the problem of classifying crushed stone fractions with convolutional neural networks, using images from an external camera.
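As a hedged illustration of the kind of classifier the abstract describes, here is a transfer-learning sketch in PyTorch/torchvision; the fraction labels are hypothetical, since the abstract does not list the actual label set.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# Hypothetical label set; the abstract does not list the actual fractions.
CLASSES = ["0-5 mm", "5-20 mm", "20-40 mm", "40-70 mm"]

# Standard ImageNet preprocessing for a transfer-learning baseline.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# ResNet-18 backbone with a new classification head for the fractions.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(CLASSES))

model.eval()
with torch.no_grad():
    dummy = torch.randn(1, 3, 224, 224)   # stand-in for a camera image
    print("predicted:", CLASSES[model(dummy).argmax(dim=1).item()])
```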
|
53 |
Object Based Image Retrieval Using Feature Maps of a YOLOv5 Network. Essinger, Hugo; Kivelä, Alexander. January 2022.
As Machine Learning (ML) methods have gained traction in recent years, some problems regarding the construction of such methods have arisen. One such problem is the collection and labeling of data sets. Specifically, many applications of Computer Vision (CV) need a set of images, each labeled as either belonging to some class or not. Creating such data sets can be very time-consuming. This project sets out to tackle this problem by constructing an end-to-end system for searching for objects in images (i.e., an Object Based Image Retrieval (OBIR) method) using an object detection framework (You Only Look Once (YOLO) [16]). The goal of the project was to create a method that, given an image q of an object of interest, searches for the same or similar objects in a set of other images S. The core concept is to pass the image q through an object detection model (in this case YOLOv5 [16]), create a ”fingerprint” (a sort of identity for an object) from a set of feature maps extracted from the YOLOv5 [16] model, and look for corresponding similar parts of feature maps extracted from other images. An investigation was conducted into which values to select for a few different parameters, including a comparison of performance for a couple of different similarity metrics. The parameter combination that resulted in the highest F_Top_300 score (a measure indicating the number of relevant images retrieved among the top 300 recommended images) in the parameter selection phase was: Layer: 23, Pooling method: max, Similarity metric: Euclidean, Fingerprint kernel size: 4. Evaluation of the method with these parameter values resulted in the following F_Top_300 scores: Mouse: 0.820, Duck: 0.640, Coin: 0.770, Jet ski: 0.443, Handgun: 0.807, Average: 0.696.
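A small NumPy sketch of the fingerprint-and-compare idea under the reported parameter choices (max pooling, Euclidean metric, kernel size 4); the feature-map shapes here are invented stand-ins for real layer-23 YOLOv5 activations.

```python
import numpy as np

def fingerprint(fmap: np.ndarray, kern: int = 4) -> np.ndarray:
    """Max-pool a C x H x W feature map down to a C x kern x kern fingerprint.
    Assumes H and W are divisible by `kern` for simplicity."""
    c, h, w = fmap.shape
    hs, ws = h // kern, w // kern
    out = np.zeros((c, kern, kern), dtype=fmap.dtype)
    for i in range(kern):
        for j in range(kern):
            out[:, i, j] = fmap[:, i*hs:(i+1)*hs, j*ws:(j+1)*ws].max(axis=(1, 2))
    return out

def euclidean_score(a: np.ndarray, b: np.ndarray) -> float:
    """Negated Euclidean distance, so larger means more similar."""
    return -float(np.linalg.norm(a.ravel() - b.ravel()))

query = np.random.rand(256, 32, 32)                    # stand-in query feature map
library = [np.random.rand(256, 32, 32) for _ in range(5)]
scores = [euclidean_score(fingerprint(query), fingerprint(m)) for m in library]
print("best match index:", int(np.argmax(scores)))
```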
|
54 |
Visual tracking system for UAV. Kolář, Michal. January 2018.
This master's thesis analyzes the current possibilities for tracking objects in images, on the basis of which a procedure is designed for creating a system capable of tracking an object of interest. Part of this work is the design of a virtual reality environment for implementing the tracking system, which is finally deployed and tested on a real prototype of an unmanned vehicle.
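The abstract does not name its tracking algorithm; as one hedged possibility, an off-the-shelf OpenCV tracker (CSRT, from the opencv-contrib-python package) can drive the kind of tracking loop described.

```python
import cv2

# In some OpenCV builds the factory is cv2.legacy.TrackerCSRT_create instead.
tracker = cv2.TrackerCSRT_create()

cap = cv2.VideoCapture(0)                     # stand-in for the UAV video feed
ok, frame = cap.read()
bbox = cv2.selectROI("select object", frame)  # operator marks the target once
tracker.init(frame, bbox)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    ok, bbox = tracker.update(frame)          # follow the target frame-to-frame
    if ok:
        x, y, w, h = map(int, bbox)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("tracking", frame)
    if cv2.waitKey(1) & 0xFF == 27:           # Esc exits
        break
cap.release()
cv2.destroyAllWindows()
```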
|
55 |
You Only Gesture Once (YouGo): American Sign Language Translation using YOLOv3. Mehul Nanda (8786558). 01 May 2020.
The study focused on creating and proposing a model that could accurately and precisely predict the occurrence of an American Sign Language (ASL) gesture for each letter of the English alphabet, using the You Only Look Once (YOLOv3) algorithm. The training dataset used for this study was custom created and further divided into clusters based on the uniqueness of the ASL signs. Three diverse clusters were created. Each cluster was trained with the network known as Darknet. Testing was conducted using images and videos on the fully trained models of each cluster, and the Average Precision for each letter in each cluster and the Mean Average Precision for each cluster were noted. In addition, a Word Builder script was created. This script combined the trained models of all three clusters to create a comprehensive system that builds words when the trained models are supplied with images of English-alphabet letters as depicted in ASL.
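A minimal sketch of what such a Word Builder step could look like, assuming detections arrive as per-image (letter, confidence) pairs; the data format and threshold are illustrative assumptions, not the study's actual script.

```python
def build_word(detections, min_conf=0.5):
    """Concatenate per-image letter predictions into a word,
    dropping detections below the confidence threshold."""
    return "".join(letter for letter, conf in detections if conf >= min_conf)

# Four ASL images classified by the trained models (hypothetical outputs).
frames = [("C", 0.91), ("A", 0.88), ("T", 0.42), ("T", 0.79)]
print(build_word(frames))  # -> "CAT"; the 0.42 detection is discarded
```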
|
56 |
VISUAL DETECTION OF PERSONAL PROTECTIVE EQUIPMENT & SAFETY GEAR ON INDUSTRY WORKERS. Strand, Fredrik; Karlsson, Jonathan. January 2022.
Workplace injuries are common in today's society due to a lack of adequately worn safety equipment. A system that only admits appropriately equipped personnel can be created to improve working conditions and worker safety. The goal is thus to develop a system that improves construction workers' safety. Building such a system requires computer vision, which entails object recognition, facial recognition, and human recognition, among other things. The basic idea is to first detect the human and remove the background, to speed up the process and avoid potential interference. After that, the cropped image is subjected to facial and object recognition. The code is written in Python and uses libraries such as OpenCV, face_recognition, and CVZone. Among the algorithms chosen were YOLOv4 and Histogram of Oriented Gradients. The results were measured at distances of three and five meters. As a result of the system's pipeline, algorithms, and software, a mean average precision of 99% and 89% was achieved at the respective distances. At both three and five meters, the model achieved a precision rate of 100%. The recall rates were 96%-100% at 3 m and 54%-100% at 5 m. Finally, the frame rate was measured at 1.2 fps on a system without a GPU.
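As a sketch of the first pipeline stage described above, detecting the person before any PPE checks, here is OpenCV's built-in HOG person detector producing crops; the later YOLOv4 and face_recognition stages are deliberately omitted.

```python
import cv2
import numpy as np

# Stage 1: locate people with OpenCV's built-in HOG + linear-SVM person
# detector, then crop each person so later stages (PPE detection, face
# recognition) can inspect the crop without background interference.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def person_crops(image: np.ndarray) -> list:
    rects, _weights = hog.detectMultiScale(image, winStride=(8, 8))
    return [image[y:y + h, x:x + w] for (x, y, w, h) in rects]

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in camera frame
print(len(person_crops(frame)), "person(s) detected")
```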
|
57 |
3D YOLO: End-to-End 3D Object Detection Using Point Clouds. Al Hakim, Ezeddin. January 2018.
For safe and reliable driving, it is essential that an autonomous vehicle accurately perceives its surrounding environment. Modern sensor technologies used for perception, such as LiDAR and RADAR, deliver a large set of 3D measurement points known as a point cloud. There is a huge need to interpret point cloud data to detect other road users, such as vehicles and pedestrians. Many research studies have proposed image-based models for 2D object detection. This thesis takes it a step further and aims to develop a LiDAR-based 3D object detection model that operates in real time, with emphasis on autonomous driving scenarios. We propose 3D YOLO, an extension of YOLO (You Only Look Once), one of the fastest state-of-the-art 2D object detectors for images. The proposed model takes point cloud data as input and outputs 3D bounding boxes with class scores in real time. Most existing 3D object detectors use hand-crafted features, while our model follows the end-to-end learning fashion, which removes manual feature engineering. The 3D YOLO pipeline consists of two networks: (a) the Feature Learning Network, an artificial neural network that transforms the input point cloud into a new feature space; (b) 3DNet, a novel convolutional neural network architecture based on YOLO that learns the shape description of the objects. Our experiments on the KITTI dataset show that 3D YOLO has high accuracy and outperforms state-of-the-art LiDAR-based models in efficiency. This makes it a suitable candidate for deployment in autonomous vehicles.
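A hedged sketch of the usual first step when a feature learning network consumes a point cloud, bucketing raw LiDAR points into voxels (VoxelNet-style); the voxel sizes and synthetic cloud are illustrative, not the thesis's actual configuration.

```python
import numpy as np

def voxelize(points: np.ndarray, voxel_size=(0.2, 0.2, 0.4)) -> dict:
    """Group an N x 3 point cloud by voxel grid index; each voxel's points
    would then be encoded by the Feature Learning Network."""
    idx = np.floor(points / np.asarray(voxel_size)).astype(np.int32)
    voxels = {}
    for point, key in zip(points, map(tuple, idx)):
        voxels.setdefault(key, []).append(point)
    return voxels

cloud = np.random.uniform(0.0, 10.0, size=(1000, 3))  # stand-in LiDAR scan
grid = voxelize(cloud)
print(len(grid), "occupied voxels")
```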
|
58 |
INTEGRATION OF UAV AND LLM IN AGRICULTURAL ENVIRONMENT. Sudeep Reddy Angamgari (20431028). 16 December 2024.
<p dir="ltr">Unmanned Aerial Vehicles (UAVs) are increasingly applied in agricultural tasks such as crop monitoring, especially with AI-driven enhancements significantly increasing their autonomy and ability to execute complex operations without human interventions. However, existing UAV systems lack efficiency, intuitive user interfaces using natural language processing for command input, and robust security which is essential for real-time operations in dynamic environments. In this paper, we propose a novel solution to create a secure, efficient, and user-friendly interface for UAV control by integrating Large Language Model (LLM) with the case study on agricultural environment. In particular, we designed a four-stage approach that allows only authorized user to issue voice commands to the UAV. The command is issued to the LLM controller processed by LLM using API and generates UAV control code. Additionally, we focus on optimizing UAV battery life and enhancing scene interpretation of the environment. We evaluate our approach using AirSim and an agricultural setting built in Unreal Engine, testing under various conditions, including variable weather and wind factors. Our experimental results confirm our method's effectiveness, demonstrating improved operational efficiency and adaptability in diverse agricultural scenarios.</p>
|
59 |
A Multi-Fidelity Approach to Testing and Evaluation of AI-Enabled Systems. Robert Joseph Seif (19206790). 27 July 2024.
<p dir="ltr">Approaches to system testing and evaluation (T&E) are becoming increasingly relevant as artificial intelligence (AI)/machine learning (ML) technology expands across the industry’s current landscape. As the AI/ML landscape continues to develop, greater amounts of data are required to build the next generation of technology. Multiple communities have worked to create frameworks to interact with such scales of data, yet a gap persists in the ability to utilize data generated throughout the development process to support the for use in a T&E program. The objective of this thesis is to address this gap through a multi-fidelity approach to the test and evaluation of AI-enabled systems. This approach is constructed using a space of models to visualize similarities and differences between each individual model. Once requirements and potential tests that models can be employed to fulfill are organized, a method to sequentially select models for testing is utilized. Models are selected to maximize utility, dependent on model performance and cost to the T&E team. Experimentation was conducted through the case of an autonomous vehicle (AV) perception system, where models were constructed using a simulation of the Purdue University campus for AVs to drive around. Results show that the proposed approach, when paired with Bayesian Optimization for sequential test selection through an expected improvement acquisition function, can effectively select models in a manner that works to minimize uncertainty and cost for the test team. Through computational experiments, the proposed approach can be used to develop test combinations that minimize costs and maximize utility while maximizing the information a T&E team has on how well a system can meet a set of testing requirements in operational conditions.</p>
|
60 |
Implementation of Bolt Detection and Visual-Inertial Localization Algorithm for Tightening Tool on SoC FPGA. Al Hafiz, Muhammad Ihsan. January 2023.
With the emergence of Industry 4.0, there is a pronounced emphasis on the need for greater flexibility in assembly processes. In the domain of bolt tightening, this transition is evident: tools are now required to handle a variety of bolts and unpredictable tightening methodologies. Each bolt, with its distinct tightening parameters, requires a specific sequence to prevent issues like bolt cross-talk or unbalanced force. This thesis introduces an approach that integrates advanced computing techniques with machine learning to address these challenges in tightening applications. The primary objective is to provide edge computation for bolt detection and precise localization of the tightening tool. This is realized by leveraging visual-inertial data, all encapsulated within a System-on-Chip (SoC) Field-Programmable Gate Array (FPGA). The chosen approach combines visual information and motion detection, enabling the tool to localize itself quickly and precisely; all of the computing is done inside the SoC FPGA. The key element for identifying different bolts is the YOLOv3-Tiny-3L model, run on the Deep-learning Processor Unit (DPU) implemented in the FPGA. In parallel, the thesis employs the Error-State Extended Kalman Filter (ESEKF) algorithm to fuse the visual and motion data effectively. The ESEKF is accelerated via a full implementation at Register Transfer Level (RTL) in the FPGA fabric. We examined the empirical outcomes and found that the visual-inertial localization exhibited a position Root Mean Square Error (RMSE) of 39.69 mm with a standard deviation of 9.9 mm. Orientation determination yields a mean error of 4.8 degrees, with a standard deviation of 5.39 degrees. Notably, the entire computational process, from initial bolt detection to final localization, executes in 113.1 milliseconds. This thesis demonstrates the feasibility of executing bolt detection and visual-inertial localization using edge computing within the SoC FPGA framework. The computation trajectory is significantly streamlined by harnessing the adaptability of programmable logic within the FPGA. This development is a step towards a more adaptable and error-resistant bolt-tightening procedure in industrial settings.
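As a simplified illustration of the fusion step, here is the core Kalman measurement update an error-state EKF applies when a visual pose residual corrects the inertial error state; this toy three-state version is a sketch under strong simplifications, not the thesis's RTL implementation.

```python
import numpy as np

def kalman_update(x_err, P, z, H, R):
    """One ESEKF-style measurement update: fuse a visual residual z
    into the error state x_err with covariance P."""
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    x_err = x_err + K @ (z - H @ x_err)     # corrected error state
    P = (np.eye(P.shape[0]) - K @ H) @ P    # covariance update
    return x_err, P

# Toy 3-state position error, observed directly by the vision pipeline.
x_err = np.zeros(3)
P = np.eye(3) * 0.1
z = np.array([0.04, -0.01, 0.02])           # visual residual in meters
x_err, P = kalman_update(x_err, P, z, np.eye(3), np.eye(3) * 0.01)
print("corrected position error:", x_err)
```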
|