About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

You Only Gesture Once (YouGo): American Sign Language Translation using YOLOv3

Mehul Nanda (8786558) 01 May 2020
The study focused on creating and proposing a model that could accurately and precisely predict the occurrence of an American Sign Language gesture for an alphabet in the English language, using the You Only Look Once (YOLOv3) algorithm. The training dataset used for this study was custom created and further divided into clusters based on the uniqueness of the ASL sign. Three diverse clusters were created. Each cluster was trained with the network known as Darknet. Testing was conducted using images and videos for the fully trained models of each cluster, and the Average Precision for each alphabet in each cluster and the Mean Average Precision for each cluster were noted. In addition, a Word Builder script was created. This script combined the trained models of all three clusters into a comprehensive system that forms words when the trained models are supplied with images of English-language alphabets as depicted in ASL.
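For illustration, a minimal sketch of the Word Builder idea: run a letter detector on each image and concatenate the highest-confidence letters. The `detector` callable and its return format are assumptions; the abstract does not publish the models' API.

```python
def build_word(image_paths, detector, min_conf=0.5):
    """Concatenate the best letter detection from each image into a word.

    detector(path) is a stand-in for the three trained cluster models and
    must return a list of (letter, confidence) candidates."""
    word = []
    for path in image_paths:
        candidates = [(l, c) for l, c in detector(path) if c >= min_conf]
        if candidates:
            letter, _ = max(candidates, key=lambda lc: lc[1])
            word.append(letter)
    return "".join(word)

# Example with a dummy detector:
# build_word(["a.jpg", "s.jpg", "l.jpg"], lambda p: [(p[0].upper(), 0.9)])  # -> "ASL"
```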
52

VISUAL DETECTION OF PERSONAL PROTECTIVE EQUIPMENT & SAFETY GEAR ON INDUSTRY WORKERS

Strand, Fredrik, Karlsson, Jonathan January 2022
Workplace injuries are common in today's society due to a lack of adequately worn safety equipment. A system that only admits appropriately equipped personnel can be created to improve working conditions and worker safety. The goal is thus to develop a system that improves construction workers' safety. Building such a system requires computer vision, which entails object recognition, facial recognition, and human recognition, among other things. The basic idea is to first detect the human and remove the background, to speed up the process and avoid potential interference. After that, the cropped image is subjected to facial and object recognition. The code is written in Python and uses libraries such as OpenCV, face_recognition, and CVZone. Among the algorithms chosen were YOLOv4 and Histogram of Oriented Gradients. Results were measured at distances of three and five meters. With the system's pipeline, algorithms, and software, a mean average precision of 99% and 89% was achieved at the respective distances. At both three and five meters, the model achieved a precision rate of 100%. The recall rates were 96-100% at 3 m and 54-100% at 5 m. Finally, the frame rate was measured at 1.2 fps on a system without a GPU.
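As a hedged sketch of the detect-then-crop step described above, the following uses OpenCV's built-in HOG person detector; the thesis's actual parameters and model files are not given in the abstract, so the settings here are illustrative defaults.

```python
import cv2

# People detector: OpenCV's stock HOG descriptor with its default person SVM.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def crop_people(image):
    """Detect people and return cropped regions for downstream
    face_recognition and PPE (e.g., YOLOv4) models."""
    rects, _weights = hog.detectMultiScale(image, winStride=(8, 8), padding=(8, 8))
    return [image[y:y + h, x:x + w] for (x, y, w, h) in rects]

# crops = crop_people(cv2.imread("worker.jpg"))
```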
53

3D YOLO: End-to-End 3D Object Detection Using Point Clouds

Al Hakim, Ezeddin January 2018
For safe and reliable driving, it is essential that an autonomous vehicle can accurately perceive the surrounding environment. Modern sensor technologies used for perception, such as LiDAR and RADAR, deliver a large set of 3D measurement points known as a point cloud. There is a huge need to interpret point cloud data to detect other road users, such as vehicles and pedestrians. Many research studies have proposed image-based models for 2D object detection. This thesis takes it a step further and aims to develop a LiDAR-based 3D object detection model that operates in real time, with emphasis on autonomous driving scenarios. We propose 3D YOLO, an extension of YOLO (You Only Look Once), one of the fastest state-of-the-art 2D object detectors for images. The proposed model takes point cloud data as input and outputs 3D bounding boxes with class scores in real time. Most existing 3D object detectors use hand-crafted features, while our model follows the end-to-end learning fashion, which removes manual feature engineering. The 3D YOLO pipeline consists of two networks: (a) the Feature Learning Network, an artificial neural network that transforms the input point cloud into a new feature space; and (b) 3DNet, a novel convolutional neural network architecture based on YOLO that learns the shape description of the objects. Our experiments on the KITTI dataset show that 3D YOLO achieves high accuracy and outperforms the state-of-the-art LiDAR-based models in efficiency. This makes it a suitable candidate for deployment in autonomous vehicles.
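As a rough sketch of the first stage, the following groups raw LiDAR points into voxels, the kind of preprocessing a feature learning network of this sort consumes; the grid ranges and voxel sizes are illustrative, not the thesis's values.

```python
import numpy as np

def voxelize(points, voxel_size=(0.2, 0.2, 0.4),
             limits=((0, 70), (-40, 40), (-3, 1))):
    """Group an (N, 3) array of LiDAR points into a sparse voxel grid."""
    mins = np.array([lo for lo, _ in limits], dtype=float)
    maxs = np.array([hi for _, hi in limits], dtype=float)
    pts = points[np.all((points >= mins) & (points < maxs), axis=1)]
    idx = np.floor((pts - mins) / np.asarray(voxel_size)).astype(np.int64)
    voxels = {}
    for coord, p in zip(map(tuple, idx), pts):
        voxels.setdefault(coord, []).append(p)
    return voxels  # each non-empty voxel is then fed to the feature network
```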
54

A Multi-Fidelity Approach to Testing and Evaluation of AI-Enabled Systems

Robert Joseph Seif (19206790) 27 July 2024
Approaches to system testing and evaluation (T&E) are becoming increasingly relevant as artificial intelligence (AI)/machine learning (ML) technology expands across the industry's current landscape. As the AI/ML landscape continues to develop, greater amounts of data are required to build the next generation of technology. Multiple communities have worked to create frameworks for interacting with data at such scales, yet a gap persists in the ability to use data generated throughout the development process to support a T&E program. The objective of this thesis is to address this gap through a multi-fidelity approach to the test and evaluation of AI-enabled systems. This approach is constructed using a space of models to visualize similarities and differences between individual models. Once requirements, and the potential tests that models can be employed to fulfill, are organized, a method for sequentially selecting models for testing is applied. Models are selected to maximize utility, which depends on model performance and cost to the T&E team. Experimentation was conducted on the case of an autonomous vehicle (AV) perception system, where models were constructed using a simulation of the Purdue University campus for AVs to drive around. Results show that the proposed approach, when paired with Bayesian optimization for sequential test selection through an expected-improvement acquisition function, can effectively select models in a manner that minimizes uncertainty and cost for the test team. Through computational experiments, the proposed approach can be used to develop test combinations that minimize cost while maximizing both utility and the information a T&E team has on how well a system can meet a set of testing requirements in operational conditions.
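The abstract names an expected-improvement acquisition function for sequential test selection; a standard implementation under a Gaussian posterior is sketched below. The thesis's actual utility (trading off performance against cost) is not reproduced here.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best_f, xi=0.01):
    """EI for maximization: E[max(f - best_f - xi, 0)] under N(mu, sigma^2).

    mu, sigma: posterior mean/std at candidate test points (arrays);
    best_f: best objective value observed so far; xi: exploration margin."""
    sigma = np.maximum(np.asarray(sigma, dtype=float), 1e-12)
    z = (mu - best_f - xi) / sigma
    return (mu - best_f - xi) * norm.cdf(z) + sigma * norm.pdf(z)
```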
55

Implementation of Bolt Detection and Visual-Inertial Localization Algorithm for Tightening Tool on SoC FPGA

Al Hafiz, Muhammad Ihsan January 2023
With the emergence of Industry 4.0, there is a pronounced emphasis on the need for greater flexibility in assembly processes. In the domain of bolt tightening, this transition is evident: tools are now required to handle a variety of bolts and unpredictable tightening methodologies. Each bolt, with its own tightening parameters, requires a specific sequence to prevent issues such as bolt cross-talk or unbalanced force. This thesis introduces an approach that integrates advanced computing techniques with machine learning to address these challenges in bolt tightening. The primary objective is to offer edge computation for bolt detection and precise localization of the tightening tool, realized by leveraging visual-inertial data, all encapsulated within a System-on-Chip (SoC) Field-Programmable Gate Array (FPGA). The chosen approach combines visual information and motion data, enabling the tool to be localized quickly and precisely; all the computing is done inside the SoC FPGA. The key element for identifying different bolts is the YOLOv3-Tiny-3L model, run on the Deep-learning Processor Unit (DPU) implemented in the FPGA. In parallel, the thesis employs the Error-State Extended Kalman Filter (ESEKF) algorithm to fuse the visual and motion data effectively. The ESEKF is accelerated via a full Register Transfer Level (RTL) implementation in the FPGA fabric. The empirical results show that the visual-inertial localization exhibited a position Root Mean Square Error (RMSE) of 39.69 mm with a standard deviation of 9.9 mm, and orientation was determined with a mean error of 4.8 degrees and a standard deviation of 5.39 degrees. Notably, the entire computational process, from initial bolt detection to final localization, executes in 113.1 milliseconds. This thesis demonstrates the feasibility of performing bolt detection and visual-inertial localization with edge computing within the SoC FPGA framework. The computation path is significantly streamlined by harnessing the adaptability of the FPGA's programmable logic, a step towards a more adaptable and error-resistant bolt-tightening procedure in industrial settings.
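The fusion step can be illustrated with a plain linear Kalman filter predict/update pair; the thesis's error-state formulation and its RTL implementation are considerably more involved, so treat this only as a sketch of the idea.

```python
import numpy as np

def kf_predict(x, P, F, Q):
    """Propagate the state with the (IMU-driven) motion model F."""
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    """Correct the state with a (vision-derived) measurement z."""
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P
```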
56

Object Detection in Domain Specific Stereo-Analysed Satellite Images

Grahn, Fredrik, Nilsson, Kristian January 2019
Given satellite images with accompanying pixel classifications and elevation data, we propose different solutions to object detection. The first method uses hierarchical clustering for segmentation and then employs different methods of classification: one uses domain knowledge to classify objects, while the other uses Support Vector Machines. Additionally, a combination of three Support Vector Machines was used in a hierarchical structure, which outperformed the regular Support Vector Machine method on most of the evaluation metrics. The second approach is more conventional, using different types of Convolutional Neural Networks: a segmentation network as well as a few detection networks, and different fusions between these. The Convolutional Neural Network approach proved to be the better of the two in terms of precision and recall, but the clustering approach was not far behind. This work was done using a relatively small amount of data, which could have negatively impacted the results of the machine learning models.
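A hedged sketch of the hierarchical Support Vector Machine idea, with a root SVM routing samples to one of two leaf SVMs; the actual grouping of classes and kernel settings in the thesis are not specified in the abstract.

```python
import numpy as np
from sklearn.svm import SVC

class HierarchicalSVM:
    def __init__(self, group_of):        # group_of: class label -> 0 or 1
        self.group_of = group_of
        self.root = SVC()                # decides which leaf handles a sample
        self.leaves = {0: SVC(), 1: SVC()}

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        g = np.array([self.group_of[c] for c in y])
        self.root.fit(X, g)
        for gid, clf in self.leaves.items():
            clf.fit(X[g == gid], y[g == gid])
        return self

    def predict(self, X):
        X = np.asarray(X)
        g = self.root.predict(X)
        out = np.empty(len(X), dtype=object)
        for gid, clf in self.leaves.items():
            mask = g == gid
            if mask.any():
                out[mask] = clf.predict(X[mask])
        return out
```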
57

System for People Detection and Localization Using Thermal Imaging Cameras

Charvát, Michal January 2020
In today's world there is an ever-increasing demand for reliable automated mechanisms for detecting and localizing people, for purposes ranging from analyzing visitor movement in museums, through controlling smart homes, to guarding dangerous areas such as railway station platforms. We present a method for detecting and localizing people using low-cost FLIR Lepton 3.5 thermal cameras and small Raspberry Pi 3B+ computers. This project, which follows up on the earlier bachelor's project "Detecting people in a room using a low-cost thermal camera", newly supports modeling complex scenes with polygonal boundaries and multiple thermal cameras. In this work we present an improved control and capture library for the Lepton 3.5 camera; a new people-detection technique using the state-of-the-art YOLO (You Only Look Once) real-time object detector based on deep neural networks; a new automatically configurable thermal unit protected by a 3D-printed enclosure for safe handling; and, last but not least, a detailed guide for installing the detection system in a new environment, together with further supporting tools and improvements. We demonstrate the results of the new system on an example analysis of the movement of people in the National Museum in Prague.
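As a small sketch of the thermal preprocessing such a system needs, the following contrast-stretches a raw 16-bit Lepton frame to the 8-bit, 3-channel input a YOLO detector expects; the camera capture API itself is not shown here.

```python
import cv2
import numpy as np

def thermal_to_bgr(raw16):
    """Normalize a 16-bit thermal frame to 8-bit and replicate to 3 channels."""
    frame = cv2.normalize(raw16, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)

# Example with a synthetic frame (the Lepton 3.5 resolution is 160x120):
# bgr = thermal_to_bgr(np.random.randint(0, 65535, (120, 160), dtype=np.uint16))
```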
58

Detection of Traffic Signs and Lights

Oškera, Jan January 2020
The thesis focuses on modern methods for detecting traffic signs and traffic lights, both directly in traffic and in retrospective analysis. The main subject is convolutional neural networks (CNNs); the solution uses YOLO-type convolutional neural networks. The main goal of this thesis is to optimize the speed and accuracy of the models as far as possible. Suitable datasets are examined, and a number of them, composed of both real and synthetic data, are used for training and testing. The data were preprocessed using the Yolo_mark tool. Model training was carried out at a computer center belonging to the virtual organization MetaCentrum VO. To quantify detector quality, a program was created that reports the detector's success statistically and graphically, using ROC curves and the COCO evaluation protocol. The resulting model achieved an average success rate of up to 81%. The thesis also identifies the best choice of threshold across model versions, input sizes, and IoU thresholds. An extension for mobile phones using TensorFlow Lite and Flutter has also been created.
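Since the evaluation hinges on Intersection over Union thresholds, a reference IoU computation for two axis-aligned boxes is sketched below (a generic definition, not code from the thesis).

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # intersection height
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```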
59

Uncertainty Estimation and Confidence Calibration in YOLO5Face

Savinainen, Oskar January 2024
This thesis investigates predicting the Intersection over Union (IoU) of detections made by the face detector YOLO5Face, with the aim of using the predicted IoU as a new uncertainty measure. Detections are made on the WIDER FACE dataset, and the IoU is predicted by adding a parallel head to the existing YOLO5Face architecture. Experiments show that the methodology for predicting the IoU used in this thesis does not work: the parallel prediction head fails to predict the IoU and instead resorts to predicting common IoU values. The localisation and classification confidences of YOLO5Face are then investigated to determine which confidence measure is least uncertain and most suitable for identifying faces. Experiments show that the localisation confidence is consistently better calibrated than the classification confidence. Calibrating the classification confidence with respect to the localisation confidence reduces its Expected Calibration Error (ECE) from 0.17 to 0.01.
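The ECE figures quoted above follow the standard binned definition, sketched here; the bin count and binning scheme are assumptions, as the abstract does not state them.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - mean confidence| over confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap   # bin weight times calibration gap
    return ece
```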
60

From Pixels to Predators: Wildlife Monitoring with Machine Learning

Eriksson, Max January 2024
This master's thesis investigates the application of advanced machine learning models to the identification and classification of Swedish predators in camera-trap images. With growing threats to biodiversity, there is an urgent need for innovative, non-intrusive monitoring techniques. The study focuses on the development and evaluation of object detection models, including YOLOv5, YOLOv8, YOLOv9, and Faster R-CNN, aiming to enhance the surveillance of Swedish predator species such as bears, wolves, lynxes, foxes, and wolverines. The research leverages a dataset from the NINA database, applying data preprocessing and augmentation techniques to ensure robust model training. The models were trained and evaluated using various dataset sizes and conditions, including day and night images. Notably, YOLOv8 and YOLOv9 underwent extended training for 300 epochs, leading to significant improvements in performance metrics. Model performance was evaluated using metrics such as mean Average Precision (mAP), precision, recall, and F1-score. YOLOv9, with its Programmable Gradient Information (PGI) and GELAN architecture, demonstrated superior accuracy and reliability, achieving an F1-score of 0.98 on the expanded dataset. Training models on day and night images jointly rather than separately resulted in only minor differences in performance, although models trained exclusively on daytime images performed slightly better, owing to more consistent and favorable lighting conditions. The study also revealed a positive correlation between training-set size and model performance, with larger datasets yielding better results across all metrics; the marginal gains decreased as the dataset size grew, suggesting diminishing returns. Among the species studied, foxes were the least challenging for the models to detect and identify, while wolves presented greater challenges, likely because their complex fur patterns and coloration blend with the background.
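For reference, the per-class metrics reported above derive from true/false positives and false negatives as follows (a generic definition, not code from the thesis):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard detection metrics from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```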
