1 |
A Method of Combining GANs to Improve the Accuracy of Object Detection on Autonomous Vehicles
Ye, Fanjie 12 1900
As computer vision technology matures, autonomous vehicles have developed rapidly in recent years. However, camera-based object detection and classification on autonomous vehicles can run into problems when the vehicle travels at relatively high speed. First, the camera captures blurred images at high speed, which may reduce the accuracy of deep neural networks. Second, small objects far from the vehicle are difficult for networks to recognize. In this paper, we present a method that combines two kinds of GANs to address these problems. We choose DeblurGAN as the base model for removing blur from images, and SRGAN for the small-object detection problem. Because the combined running time of the two networks is too long, we also apply model compression to make the pipeline lighter. We then use YOLOv4 for object detection. Finally, we evaluate the whole model architecture and propose a second version of the model, based on DeblurGAN and ESPCN, which is faster than the first but may be less accurate.
|
2 |
2D object detection and semantic segmentation in the Carla simulator / 2D-objekt detektering och semantisk segmentering i Carla-simulatorn
Wang, Chen January 2020
Self-driving car technology has drawn growing interest in recent years. Many companies, such as Baidu and Tesla, have already introduced automated driving functions in their newest cars for use in specific areas. However, many challenges remain on the way to fully autonomous driving. Several severe accidents involving Tesla's autonomous driving functions have made the public doubt self-driving car technology. It is therefore necessary to use a simulator environment to verify and refine the perception, planning, and decision-making algorithms of autonomous vehicles before deploying them in real-world cars. This project aims to build a benchmark for implementing the whole self-driving car system in software. The autonomous driving system has three main components: perception, planning, and control. This thesis focuses on two sub-tasks of the perception component: 2D object detection and semantic segmentation. All experiments are run in a simulator environment called Car Learning to Act (Carla), an open-source platform for autonomous car research. The Carla simulator is built on the Unreal Engine 4 game engine and has a server-client architecture with a flexible Python API. 2D object detection uses the You Only Look Once (YOLOv4) algorithm, which incorporates recent deep learning techniques in network structure and data augmentation to strengthen the network's ability to learn objects. YOLOv4 achieves higher accuracy and shorter inference time than other popular object detection algorithms. Semantic segmentation uses ESPNetv2, a lightweight and power-efficient network that matches the performance of other semantic segmentation algorithms with fewer network parameters and FLOPs. In this project, YOLOv4 and ESPNetv2 are integrated into the Carla simulator, and the two modules work together to help the autonomous car understand the world. A minimal-distance awareness application is also implemented in the Carla simulator to detect the distance to the vehicles ahead; it can serve as a basic collision-avoidance function. Experiments were run on a single Nvidia GPU (RTX 2060) under Ubuntu 18.0.
|
3 |
Use of improved Deep Learning and DeepSORT for Vehicle estimation / Användning av förbättrad djupinlärning och DeepSORT för fordonsuppskattning
Zheng, Danna January 2022
Intelligent Traffic Systems (ITS) have high application value in today's vehicle surveillance and in future applications such as automated driving. The crucial part of an ITS is detecting and tracking vehicles in a real-time video stream with high accuracy and low GPU consumption. In this project, we select the one-stage YOLO version 4 (YOLOv4) deep learning detector to generate bounding boxes with vehicle class, location, and confidence value, and the Simple Online and Realtime Tracking with a Deep Association Metric (DeepSORT) tracker to track vehicles using the output of the YOLOv4 detector. Furthermore, to make the detector better suited to practical use, especially when the vehicle is small or occluded, we improve the detector's structure by adding attention mechanisms and reducing the number of parameters, so that it detects vehicles with relatively high accuracy and low GPU memory usage. With the baseline model, YOLOv4 and DeepSORT vehicle detection achieves 82.4% mean average precision over three vehicle classes with 63.945 MB of parameters at 19.98 frames per second. After optimization, the improved model achieves 85.84% mean average precision over the three classes with 44.158 MB of parameters at 18.65 frames per second. Compared with the original YOLOv4, the improved detector raises mean average precision by 3.44 percentage points and reduces the parameters by 30.94% while maintaining a high detection speed. This demonstrates the validity and applicability of the proposed improved YOLOv4 detector.
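The comparison quoted in this abstract can be re-derived from the reported figures. The short sketch below only redoes that arithmetic; the variable and function names are illustrative and do not come from the thesis. It suggests that the mAP gain is an absolute difference in percentage points, while the parameter reduction is a relative change.

```python
# Figures as reported in the abstract above; names are illustrative only.
baseline = {"map_pct": 82.4, "params_mb": 63.945, "fps": 19.98}
improved = {"map_pct": 85.84, "params_mb": 44.158, "fps": 18.65}


def relative_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Absolute difference between two percentages (percentage points).
map_gain_points = improved["map_pct"] - baseline["map_pct"]
# Relative reductions, reported as positive numbers.
param_reduction_pct = -relative_change(baseline["params_mb"], improved["params_mb"])
fps_drop_pct = -relative_change(baseline["fps"], improved["fps"])

print(f"mAP gain: {map_gain_points:.2f} percentage points")
print(f"parameter reduction: {param_reduction_pct:.2f}%")
print(f"frame-rate drop: {fps_drop_pct:.2f}%")
```

The frame-rate drop is not stated in the abstract; it is computed here from the two reported frame rates to put "maintaining a high detection speed" into numbers.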
|
4 |
Classification and localization of extreme outliers in computer vision tasks in surveillance scenarios / Klassificering och lokalisering av extremvärden för datorseende i övervakningsscenarion
Daoud, Tariq, Zere Goitom, Emanuel January 2022
Convolutional neural networks (CNNs) have come a long way and can be trained to classify many of the objects around us. Despite this, researchers do not fully understand how CNN models learn features (edges, shapes, contours, etc.) from data. It is therefore reasonable to investigate whether a CNN model can learn to classify objects under extreme conditions. One example of such a condition is a car driving towards the camera at night, which lacks distinct features because the light from its headlights covers large parts of the car. The aim of this thesis is to investigate how the performance of a CNN model is affected when it is trained on objects under extreme conditions. A YOLOv4 model is trained on three extreme cases: light-polluted vehicles, nighttime objects, and snow-covered vehicles. The model is then validated on a test dataset to see whether performance decreases or improves compared with a model trained on normal conditions. In general, training was stable for all extreme cases, and the results show improved or similar performance compared with the normal cases. This indicates that models can be trained on all of the extreme cases. Snow-covered vehicles with mosaic data augmentation and an IoU threshold of 0.25 gave the best overall performance relative to the normal cases, with differences of +14.95% in AP for cars, −0.73% in AP for persons, +8.08% in AP for trucks, 0 in precision, and +9% in recall.
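The IoU threshold mentioned in this abstract refers to intersection over union, the overlap ratio between a predicted and a ground-truth bounding box. The sketch below is a generic illustration of that measure, assuming axis-aligned boxes given as corner coordinates; the function names and the box format are assumptions, not details taken from the thesis.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.25) -> bool:
    """Count a prediction as a hit when its IoU reaches the threshold."""
    return iou(pred, truth) >= threshold


# Two 2x2 boxes overlapping in a 1x1 corner: IoU = 1 / 7, below 0.25.
print(is_match((0, 0, 2, 2), (1, 1, 3, 3)))
# A box covering half of the ground truth: IoU = 0.5, above 0.25.
print(is_match((0, 0, 2, 1), (0, 0, 2, 2)))
```

A lower threshold such as 0.25 accepts looser localizations as correct than the more common 0.5, which may matter for objects whose outlines are obscured by glare or snow.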
|
5 |
Image Augmentation to Create Lower Quality Images for Training a YOLOv4 Object Detection Model
Melcherson, Tim January 2020
Research in the Arctic is of ever-growing importance, and modern technology is used in new ways to map and understand this very complex region and how it is affected by climate change. Here, animals and vegetation are tightly coupled with their environment in a fragile ecosystem, and rapid environmental change risks damaging these ecosystems severely. Understanding what kind of data has the potential to be used in artificial intelligence can be important, as many research stations hold data archives from decades of work in the Arctic. In this thesis, a YOLOv4 object detection model was trained on two classes of images to investigate how disturbances in the training data set affect performance. An expanded data set was created by augmenting the initial data with various disturbances. A model was successfully trained on the augmented data set, and a correlation between worse performance and the presence of noise was detected, but changes in saturation and altered colour levels seemed to have less impact than expected. Reducing noise in gathered data therefore appears to be more important than enhancing images with poor colour levels. Further investigation with a larger and more thoroughly processed data set is required to gain a clearer picture of the impact of the various disturbances.
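The two kinds of disturbance named in this abstract, noise and reduced saturation, can be illustrated on a handful of RGB pixels. This is a minimal pure-Python sketch under stated assumptions: the Gaussian noise model, the mean-based grey value, and all function names are illustrative choices, not the augmentation pipeline used in the thesis.

```python
import random
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def clamp(value: float) -> int:
    """Round to the nearest integer and keep it inside 0..255."""
    return max(0, min(255, int(round(value))))


def add_gaussian_noise(pixels: List[Pixel], sigma: float,
                       rng: random.Random) -> List[Pixel]:
    """Add zero-mean Gaussian noise to every channel (assumed noise model)."""
    return [tuple(clamp(c + rng.gauss(0.0, sigma)) for c in p) for p in pixels]


def reduce_saturation(pixels: List[Pixel], factor: float) -> List[Pixel]:
    """Blend each pixel towards grey: factor 1.0 keeps it, 0.0 removes colour."""
    out = []
    for r, g, b in pixels:
        grey = (r + g + b) / 3.0  # plain channel mean as a simple grey value
        out.append(tuple(clamp(grey + factor * (c - grey)) for c in (r, g, b)))
    return out


image = [(255, 0, 0), (0, 128, 255), (40, 200, 90)]
noisy = add_gaussian_noise(image, sigma=25.0, rng=random.Random(0))
washed_out = reduce_saturation(image, factor=0.3)
print(noisy)
print(washed_out)
```

Applying such transforms to copies of the original images, while keeping the bounding-box labels unchanged, is one way an expanded training set of lower-quality images could be produced.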
|
6 |
Počítání unikátních aut ve snímcích / Unique Car Counting
Uhrín, Peter January 2021
Current systems for counting cars in parking lots usually rely on specialized equipment, such as barriers at the parking lot entrance. Such equipment is not suitable for free or residential parking areas. Even in these car parks, however, it is useful to keep track of occupancy and other data. The system designed in this thesis uses the YOLOv4 model to detect cars visually in photos. It then calculates an embedding vector for each vehicle, which describes the car and is used to determine whether the car in a given parking spot has changed over time. This information is stored in a database and used to calculate statistics such as total car count, average occupancy, and average stay time. These values can be retrieved through a REST API or viewed in the web application.
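One common way to compare embedding vectors is cosine similarity. The abstract does not say which measure or threshold the thesis uses, so the sketch below is only an assumption-laden illustration: the function names, the threshold of 0.8, and the toy vectors are all hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def same_car(previous: Sequence[float], current: Sequence[float],
             threshold: float = 0.8) -> bool:
    """Treat two embeddings as the same car when they are similar enough.

    The threshold is a hypothetical value, not one taken from the thesis.
    """
    return cosine_similarity(previous, current) >= threshold


# Toy embeddings for one parking spot observed in two consecutive photos.
earlier = [0.9, 0.1, 0.3]
later_same = [0.85, 0.15, 0.32]
later_other = [0.1, 0.9, 0.2]
print(same_car(earlier, later_same))
print(same_car(earlier, later_other))
```

With a check like this, a stay-time counter for a spot could keep running while consecutive photos match and restart when they do not.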
|
7 |
VISUAL DETECTION OF PERSONAL PROTECTIVE EQUIPMENT & SAFETY GEAR ON INDUSTRY WORKERS
Strand, Fredrik, Karlsson, Jonathan January 2022
Workplace injuries are common in today's society because safety equipment is not worn or is worn incorrectly. A system that admits only appropriately equipped personnel can improve working conditions and worker safety. The goal is thus to develop a system that improves construction workers' safety. Building such a system requires computer vision, including object recognition, facial recognition, and human recognition. The basic idea is first to detect the human and remove the background, which speeds up the process and avoids potential interference. The cropped image is then subjected to facial and object recognition. The code is written in Python and uses libraries such as OpenCV, face_recognition, and CVZone. The algorithms chosen include YOLOv4 and Histogram of Oriented Gradients. The results were measured at distances of three and five meters. The system's pipeline, algorithms, and software achieved a mean average precision of 99% at three meters and 89% at five meters. At both distances, the model achieved a precision of 100%. Recall ranged from 96% to 100% at 3 m and from 54% to 100% at 5 m. Finally, the frame rate was measured at 1.2 fps on a system without a GPU.
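The combination of 100% precision with recall as low as 54% follows from how the two metrics are defined. The sketch below uses the standard definitions; the detection counts are hypothetical and chosen only to reproduce that combination, since the abstract does not report raw counts.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of reported detections that are correct."""
    total = true_pos + false_pos
    return true_pos / total if total else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Share of actual objects that were detected."""
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


# Hypothetical counts for one equipment class at the longer distance:
# 27 items found, none wrongly reported, 23 missed.
tp, fp, fn = 27, 0, 23
print(f"precision: {precision(tp, fp):.0%}")
print(f"recall: {recall(tp, fn):.0%}")
```

In other words, a detector that never raises a false alarm but misses nearly half of the items at a distance would show exactly this pattern.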
|
8 |
System for People Detection and Localization Using Thermal Imaging Cameras
Charvát, Michal January 2020
In today's world, there is an ever-increasing demand for reliable automated mechanisms for detecting and localizing people for various purposes, ranging from analysing visitor movement in museums and controlling smart homes to guarding dangerous areas such as train station platforms. We present a method for detecting and localizing people using low-cost FLIR Lepton 3.5 thermal cameras and small Raspberry Pi 3B+ computers. This project, which builds on the previous bachelor's project "Detekce lidí v místnosti za použití nízkonákladové termální kamery" (Detection of people in a room using a low-cost thermal camera), now supports modelling of complex scenes with polygonal boundaries and multiple thermal cameras. In this thesis, we present an improved control and capture library for the Lepton 3.5 camera; a new people-detection technique using the state-of-the-art YOLO (You Only Look Once) real-time object detector, which is based on deep neural networks; a new automatically configurable thermal unit, protected by a 3D-printed enclosure for safe handling; and a detailed guide for installing the detection system in a new environment, along with other supporting tools and improvements. We demonstrate the results of the new system with an example analysis of the movement of people in the National Museum in Prague.
|
9 |
Detekce dopravních značek a semaforů / Detection of Traffic Signs and Lights
Oškera, Jan January 2020
The thesis focuses on modern methods for detecting traffic signs and traffic lights, both directly in traffic and using back analysis. The main subject is convolutional neural networks (CNNs), and the solution uses CNNs of the YOLO type. The main goal of the thesis is to optimize the speed and accuracy of the models as far as possible. The thesis also examines suitable datasets: several datasets, composed of real and synthetic data, are used for training and testing. The data were preprocessed for training and testing with the Yolo mark tool. The models were trained at a computing center belonging to the virtual organization MetaCentrum VO. To evaluate detector quality quantitatively, a program was created that shows the detector's success statistically and graphically using the ROC curve and the COCO evaluation protocol. The resulting model achieved an average success rate of up to 81%. The thesis shows the best choice of threshold across versions, sizes, and IoU. Extensions for mobile phones in TensorFlow Lite and Flutter were also created.
|