1 |
Real time Optical Character Recognition in steel bars using YOLOV5
Gattupalli, Monica January 2023
Background. Identifying the quality of products in the manufacturing industry is a challenging task. Manufacturers use needles to print unique numbers on products to differentiate good-quality products from bad ones. However, identifying these needle-printed characters can be difficult. Hence, new technologies such as deep learning and optical character recognition (OCR) are used to identify them. Objective. The primary objective of this thesis is to identify the needle-printed characters on steel bars. This objective is divided into two sub-objectives. The first is to identify the region of interest on the steel bars and extract it from the images. The second is to identify the characters on the steel bars from the extracted images. The YOLOV5 and YOLOV5-obb object detection algorithms are used to achieve these objectives. Method. A literature review was first performed to select the algorithms; the dataset, provided by OVAKO, was then collected. It included 1000 old images and 3000 new images of steel bars. To answer RQ2, existing OCR techniques were first applied to the old images, with low accuracy. The YOLOV5 algorithm was therefore used on the old images to detect the region of interest. Different rotation techniques were applied to the cropped images (cropped after the bounding box was detected), but no promising result was observed, so YOLOV5 was then used at the character level to identify the characters; these results were also unsatisfactory. To address this, YOLOV5-obb was used on the new images, which resulted in good accuracy. Results. Accuracy and mAP are used to assess the performance of the OCR techniques and the selected object detection algorithms. Existing OCR was also used in the extraction; however, it had an accuracy of 0%, which implies that it failed to identify the characters. With a mAP of 0.95, YOLOV5 is good at extracting cropped images but fails to identify the characters. When YOLOV5-obb is used to obtain the orientation, it achieves a mAP of 0.93. Due to time constraints, the last part of the thesis was not implemented. Conclusion. The present research employed the YOLOV5 and YOLOV5-obb object detection algorithms to identify needle-printed characters on steel bars. By first selecting the region of interest and then extracting the images, the study objectives were met. Finally, character-level identification was performed on the old images using YOLOV5 and on the new images using the YOLOV5-obb algorithm, with promising results.
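The orientation step mentioned in this abstract can be illustrated with a minimal sketch. It assumes an oriented detector reports each box as centre, width, height and rotation angle; that parameterisation and the function name `obb_corners` are assumptions for illustration, not details taken from the thesis.

```python
import math

def obb_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a w x h box rotated by angle_deg about (cx, cy).

    Hypothetical helper: the (cx, cy, w, h, angle) layout is an assumed
    convention for oriented bounding boxes.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    corners = []
    # Offsets of the unrotated corners relative to the centre.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        # Standard 2D rotation of the offset, then translation to the centre.
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners
```

Rotating a crop by the negative of the predicted angle would bring the printed characters upright before recognition, which is one plausible use of the orientation output.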
|
2 |
POTHOLE DETECTION USING DEEP LEARNING AND AREA ASSESSMENT USING IMAGE MANIPULATION
Kharel, Subash 01 June 2021
Every year, drivers spend over 3 billion repairing vehicle damage caused by potholes. Along with the financial cost, potholes cause frustration for drivers. Moreover, with the emerging development of automated vehicles, road safety designed with automation in mind is becoming a necessity. Deep learning techniques offer intelligent alternatives that reduce these losses by spotting potholes. The world is connected in such a way that information can be shared in no time. Using this connectivity, we can communicate pothole information to other vehicles and to the Department of Transportation for necessary action. A significant amount of research has been done to help detect potholes in pavements. In this thesis, we compared two object detection algorithms belonging to two major classes, i.e. single-shot detectors and two-stage detectors, using our own dataset. Comparing the results of Faster RCNN and YOLOv5, we concluded that potholes occupy a small portion of the image, which makes pothole detection with YOLOv5 less accurate than with Faster RCNN; keeping detection speed in mind, however, we suggest that YOLOv5 is the better solution for this task. Using the YOLOv5 model and an image processing technique, we calculated the approximate area of potholes and visualized their shape. The information thus obtained can be used by the Department of Transportation to plan the necessary construction tasks. It can also be used to warn drivers about the severity of potholes, depending on their shape and area.
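The area assessment described above can be sketched roughly as follows. This is not the thesis's procedure: it assumes a binary pothole mask has already been extracted from the detected region and that a single ground scale (`metres_per_pixel`, a hypothetical parameter) applies across the image, which ignores perspective.

```python
def mask_area_m2(mask, metres_per_pixel):
    """Approximate the real-world area covered by a binary mask.

    mask: list of rows, each a list of 0/1 values (1 = pothole pixel).
    metres_per_pixel: assumed uniform ground scale; a real camera view
    would need perspective correction first.
    """
    pixel_count = sum(sum(1 for value in row if value) for row in mask)
    # Each pixel covers metres_per_pixel x metres_per_pixel of road surface.
    return pixel_count * metres_per_pixel ** 2


# Illustrative mask only; the values are made up.
example_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
area = mask_area_m2(example_mask, 0.5)
```

Counting pixels rather than using the bounding-box area keeps the estimate tied to the pothole's shape, which matters for irregular outlines.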
|
3 |
Performance evaluation of deep learning object detectors for weed detection and real time deployment in cotton fields
Rahman, Abdur 13 August 2024
Effective weed control is crucial, especially for herbicide-resistant species. Machine vision technology, through weed detection and localization, can facilitate precise, species-specific treatments. Despite the challenges posed by unstructured field conditions and weed variability, deep learning (DL) algorithms show promise. This study evaluated thirteen DL-based weed detection models, including YOLOv5, RetinaNet, EfficientDet, Fast RCNN, and Faster RCNN, using pre-trained object detectors. RetinaNet (R101-FPN) achieved the highest accuracy with a mean average precision (mAP@0.50) of 79.98%, though it had longer inference times. YOLOv5n, with the fastest inference (17 ms on Google Colab) and only 1.8 million parameters, achieved a comparable 76.58% mAP@0.50, making it suitable for real-time use in resource-limited devices. A prototype using YOLOv5 was tested on two datasets, showing good real-time accuracy on In-season data and comparable results on Cross-season data, despite some accuracy challenges due to dataset distribution shifts.
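The mAP@0.50 figures quoted above rest on an intersection-over-union (IoU) test: a predicted box counts as a correct detection only if its IoU with a ground-truth box is at least 0.50. A minimal sketch of that overlap measure, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` corner coordinates (an assumed layout):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, truth_box, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count."""
    return iou(pred_box, truth_box) >= threshold
```

Average precision is then computed per class from the ranked list of matched and unmatched predictions, and mAP is the mean over classes; that ranking step is omitted here.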
|
4 |
Identifiering av UNO-kort : En jämförelse av bildigenkänningstekniker
Al-Asadi, Yousif, Streit, Jennifer January 2023
Att spela sällskapsspelet UNO är en typ av umgängesform där målet är att trivas. En UNO-kortlek har 5 olika färger (blå, röd, grön, gul och joker) och olika symboler. Detta kan vara frustrerande för en person med nedsatt färgseende att delta, då en stor andel av spelet är beroende av att identifiera färgen på varje kort. Övergripande syftet med detta arbete är att utveckla en prototyp för objektigenkänning av UNO-kort som stöd för färgnedsatta. Arbetet sker genom jämförelse av objektigenkänningsmetoder som Convolutional Neural Network (CNN) och Template Matching-inspirerade metoder: hue template test samt binary template test. Detta kommer att jämföras i samband med igenkänning av färg och symbol tillsammans och separerat. Utvecklandet av prototypen kommer att utföras genom att träna två olika CNN-modeller, där en modell fokuserar endast på symboler och den andra fokuserar på både färg och symbol. Dessa modeller kommer att tränas med hjälp av YOLOv5-algoritmen som anses vara State Of The Art (SOTA) inom CNN med snabb exekvering. Samtidigt kommer template test att utvecklas med hjälp av OpenCV och genom att skapa mallar för korten. Dessa används för att göra en jämförelse av kortet som ska identifieras med hjälp av mallen. Utöver detta kommer K Nearest Neighbor (KNN), en maskininlärningsalgoritm, att utvecklas med syfte att identifiera endast färg på korten. Slutligen utförs en jämförelse mellan dessa metoder genom mätning av prestanda som består av accuracy, precision, recall och latency. Jämförelsen kommer att ske mellan varje metod genom en confusion matrix för färger och symboler för respektive modell. Resultatet av studien visade på att modellen som kombinerar CNN och KNN presterade bäst vid valideringen av de olika metoderna. Utöver detta visar studien att template test är snabbare att implementera än CNN på grund av tiden för träningen som ett neuralt nätverk kräver. Dessutom visar latency att det finns en skillnad mellan de olika modellerna, där CNN presterade bäst.
/ Engaging in the social game of UNO represents a form of social interaction aimed at promoting enjoyment. Each UNO card deck consists of five different colors (blue, red, green, yellow and joker) and various symbols. However, participating in such a game can be frustrating for individuals with color vision impairment, since a substantial portion of the game relies on accurately identifying the color of each card. The overall purpose of this research is to develop a prototype for object recognition of UNO cards to support individuals with color vision impairment. This thesis involves comparing object recognition methods, namely Convolutional Neural Network (CNN) and Template Matching (TM). Each method will be compared with respect to color and symbol recognition, both separately and combined. The prototype will be developed by creating and training two different CNN models, where the first model focuses solely on symbol recognition while the other incorporates both color and symbol recognition. These models will be trained through an algorithm called YOLOv5, which is considered state-of-the-art (SOTA) with fast execution. At the same time, two TM-inspired methods, hue template test and binary template test, will be developed with the help of OpenCV and by creating templates for the cards. Each template will be compared against the detected card in order to classify it. Additionally, the K Nearest Neighbor (KNN) algorithm, a machine learning algorithm, will be developed specifically to identify the color of the cards. Finally, a comparative analysis of these methods will be conducted by evaluating performance metrics such as accuracy, precision, recall and latency. The comparison will be carried out between each method using a confusion matrix for color and symbol in the respective models. The study's findings revealed that the model combining CNN and KNN demonstrated the best performance during the validation of the different models. Furthermore, the study shows that template tests are faster to implement than CNN due to the training that a neural network requires. Moreover, the execution time shows that there is a difference between the different models, where CNN achieved the highest performance.
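The KNN color step can be sketched as a nearest-neighbour vote on hue. This is a minimal illustration, not the thesis's implementation: the reference hues below are invented, hue is assumed to be measured in degrees on a 0 to 360 circle, and the black joker cards would need a separate brightness check that is not shown.

```python
from collections import Counter

def hue_distance(a, b):
    """Shortest distance between two hues on the 0-360 degree colour circle."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def knn_colour(hue, labelled_samples, k=3):
    """Classify a hue by majority vote among its k nearest labelled samples."""
    nearest = sorted(labelled_samples, key=lambda item: hue_distance(hue, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical reference samples: (mean hue of a card, colour label).
SAMPLES = [
    (0, "red"), (5, "red"), (355, "red"),
    (55, "yellow"), (60, "yellow"), (65, "yellow"),
    (115, "green"), (120, "green"), (125, "green"),
    (235, "blue"), (240, "blue"), (245, "blue"),
]
```

Using a circular distance matters for red, whose hues sit on both sides of the 0/360 wrap-around.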
|
5 |
Hybrid pool based deep active learning for object detection using intermediate network embeddings
Marbinah, Johan January 2021
With the advancements in deep learning, object detection networks have become more robust. Nevertheless, a challenge with training deep networks is finding enough labelled training data for the model to perform well, due to constraints associated with acquiring relevant data. For this reason, active learning is used to minimize the cost by sampling the unlabeled samples that increase the performance the most. In the field of object detection, few works have been done in exploring effective hybrid active learning strategies that exploit the intermediate feature embeddings in neural networks. In this work, hybrid active learning methods are proposed and tested, using various uncertainty sampling techniques and the well-respected core-set method as the representative strategy. In addition, experiments are conducted with network embeddings to find a suitable strategy to model representation of all available samples. Experiments show mixed outcomes as to whether hybrid methods perform better than the core-set method used separately. / Med framstegen inom djupinlärning, har neurala nätverk för objektdetektering blivit mer robusta. En utmaning med att träna djupa neurala nätverk är att hitta en tillräcklig mängd träningsdata för att ett nätverk ska prestera bra, med tanke på de begränsningar som är förknippade med anskaffningen av relevant data. Av denna anledning används aktiv maskininlärning för att minimera kostnaden med att förvärva nya datapunkter, genom att göra kontinuerliga urval av de omärkta bilder som ökar prestandan mest. När det gäller objektsdetektering har få arbeten gjorts för att utforska effektiva hybridstrategier som utnyttjar de mellanliggande lagren som finns i ett neuralt nätverk. I det här arbetet föreslås och testas hybridmetoder i kontext av aktiv maskininlärning med hjälp av olika tekniker för att göra urval av datamängder baserade på osäkerhetsberäkningar men även beräkningar med hänsyn till representation (core-set-metoden). 
Dessutom utförs experiment med mellanliggande nätverksinbäddningar för att hitta en lämplig strategi för att modellera representation av alla tillgängliga bilder i datasetet. Experimenten visar blandade resultat när det gäller huruvida hybridmetoderna presterar bättre i jämförelse med separata strategier för aktiv maskininlärning där core-set-metoden inte används.
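The representative half of such a hybrid strategy is commonly implemented as greedy k-center selection over embeddings. The sketch below shows only that core-set part, under the assumption that each sample is already reduced to a fixed-length embedding vector; the uncertainty term and the way the thesis combines the two are not reproduced, and all names are illustrative.

```python
import math

def k_center_greedy(embeddings, labelled_idx, budget):
    """Select `budget` indices, each time taking the point farthest from
    every already-covered centre (labelled or previously selected).

    embeddings: list of equal-length numeric tuples.
    labelled_idx: non-empty list of indices that are already labelled.
    """
    # Distance from every point to its nearest labelled point.
    min_dist = [
        min(math.dist(e, embeddings[j]) for j in labelled_idx)
        for e in embeddings
    ]
    chosen = []
    for _ in range(budget):
        # The least-covered point becomes the next centre.
        idx = max(range(len(embeddings)), key=lambda i: min_dist[i])
        chosen.append(idx)
        # Update coverage now that idx is a centre too.
        for i, e in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], math.dist(e, embeddings[idx]))
    return chosen
```

One simple way to hybridise this, stated here only as an assumption, would be to pre-filter the pool to the most uncertain samples and run the greedy selection on that subset.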
|
6 |
Detekce pohybujících se objektů ve videu s využitím neuronových sítí / Object detection in video using neural networks
Mikulský, Petr January 2021
This diploma thesis deals with the detection of moving objects in video recordings using neural networks. The aim of the thesis was to detect road users in video recordings. A pre-trained YOLOv5 object detection model was used for the practical part of the thesis. As part of the solution, a custom dataset of road traffic video recordings was created and annotated with the following classes: car, bus, van, motorcycle, truck and trailer truck. The final version of this dataset comprises 5404 frames and 6467 annotated objects in total. After training, the YOLOv5 model achieved 0.995 mAP, 0.995 precision and 0.986 recall on the dataset. All steps leading to the final form of the dataset are described in the conclusion chapter.
|
7 |
How Safe Is Machine Vision? : An Evaluation of the AMLAS Process in a Machine Vision Environment
Hamnert, Josef, Hägglund, Daniel January 2022
This thesis evaluates the AMLAS methodology. To support the evaluation, literature studies are conducted and a machine-learning-dependent system that detects people and helmets is implemented. The practical work is performed according to the AMLAS documentation. Alongside this work, a user interface is developed. The user interface and the machine learning component are merged to create the complete system. The results show that AMLAS contributes safety, structure and reliability to the system. However, the findings also show that AMLAS is missing some aspects. / The degree project was carried out at the Department of Science and Technology (ITN), Faculty of Science and Engineering, Linköping University.
|
8 |
Road damage detection with Yolov8 on Swedish roads
Eriksson, Martin January 2023
This thesis addresses the problem of Road Damage Detection using object detection models, Yolov8 and Yolov5. While Yolov5 has been utilized in prior road damage detection projects, this work introduces the application of the newly released Yolov8 model to this domain. We have prepared a dataset of 3,000 annotated images of road damage in Sweden and applied various Yolov8 and Yolov5 models to this dataset and a larger international one. The potential of deploying a lightweight Yolov8 model in a smartphone application for real-time detection, as well as the effectiveness of an ensemble approach combining several models, were also explored. The results show an F1 score of 0.57 and 0.6 for the best-performing models on the Swedish dataset and an international road damage dataset respectively. Several box clustering methods were tested to combine the predictions of the ensemble, but none outperformed the best individual model. A quantized version of Yolov8 was deployed to a smartphone device with satisfying performance. This work aims to create a model which can ultimately be used to improve road safety and quality.
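For reference, the F1 score reported above is the harmonic mean of precision and recall. A minimal sketch of the calculation from detection counts follows; the counts used in the usage line are invented for illustration and are not results from the thesis.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 = 2 * P * R / (P + R), computed from raw detection counts."""
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 8 correct detections, 2 false alarms, 2 missed damages.
example_f1 = f1_score(8, 2, 2)
```

Because it is a harmonic mean, F1 stays low whenever either precision or recall is low, which makes it a stricter summary than the arithmetic average of the two.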
|
9 |
CNN-BASED AUTONOMY BIN-PICKING PLATFORM WITH MINIMAL HUMAN INTERVENTION
Jinho Park (55645) 22 July 2024
Vision-based robots have been utilized for pick-and-place operations for their repeatability. Various vision-based autonomous pick-and-place approaches using machine learning techniques have been researched for more flexible and lightweight operations, but they require a large dataset for training. There is little research on the human intervention needed to produce that dataset. This research suggests two methods for pick-and-place with minimum human intervention.
|
10 |
Biodiversity Monitoring Using Machine Learning for Animal Detection and Tracking / Övervakning av biologisk mångfald med hjälp av maskininlärning för upptäckt och spårning av djur
Zhou, Qian January 2023
As an important indicator of biodiversity and ecological environment in a region, the number and distribution of animals has been given more and more attention by agencies such as nature reserves, wetland parks, and animal protection supervision departments. To protect biodiversity, we need to be able to detect and track the movement of animals to understand which animals are visiting the space. This thesis uses the improved You Only Look Once Version 5 (YOLOv5) target detection algorithm and Simple online and real-time tracking with a deep association metric (DeepSORT) tracking algorithm to provide technical support for bird monitoring, identification and tracking. Specifically, the thesis tries different improvement methods based on YOLOv5 to solve the problem that small targets in images are difficult to detect. In the backbone network, different attention modules are added to enhance the network feature extraction ability; in the neck network part, the Bi-Directional Feature Pyramid Network (BiFPN) structure is used to replace the Path Aggregation Network (PAN) structure to strengthen the utilization of underlying features; in the detection head part, a high-resolution detection head is added to improve the detection ability of tiny targets. In addition, a better loss function has been used to improve the algorithm’s performance on small birds. The improved algorithms in this paper have been used in multiple comparative experiments on the VisDrone data set and a data set of bird flight images, and the results show that compared with the baseline using YOLOv5, for VisDrone data set, Spatial-to-Depth (SPD)-Convolutional stride-free (Conv) gets the highest training mean Average Precision (mAP) of all methods with an increase from 0.325 to 0.419; for the bird data set, the best result of training mAP that could be achieved is adding a P2 layer, which reaches an improvement from 0.701 to 0.724. 
After combining You Only Look Once (YOLO) with DeepSORT to implement the tracking function, the improved method gives a better final tracking result. / Som en viktig indikator på biologisk mångfald och ekologisk miljö i en region har antal och utbredning av djur uppmärksammats mer och mer av organisationer som naturreservat, våtmarksparker och djurskyddsmyndigheter. För att skydda den biologiska mångfalden måste vi kunna upptäcka och spåra djurs rörelser för att förstå vilka djur som besöker ett område. Uppsatsen använder den förbättrade YOLOv5-måldetektionsalgoritmen och DeepSORT-spårningsalgoritmen för fågelövervakning, identifiering och spårning. Specifikt undersöks olika förbättringsmetoder baserade på YOLOv5 för att lösa problemet med att små mål i bilder är svåra att upptäcka. I den första delen av nätverket läggs olika uppmärksamhetsmoduler till; i nästa används BiFPN-strukturen för att ersätta PAN-strukturen; i detektionsdelen läggs ett högupplöst detektionshuvud till för att förbättra detekteringsförmågan för små föremål. Dessutom har en bättre förlustfunktion använts för att förbättra algoritmens prestanda för små fåglar och andra djur. De förbättrade algoritmerna har testats i flera jämförande experiment på VisDrone-datamängden och en datamängd av bilder av flygande fåglar. Resultaten visar att jämfört med baslinjen med YOLOv5s, för VisDrone-datamängden får SPD-Conv det högsta tränings-mAP med en ökning från 0,325 till 0,419; för fågeldatamängden nås det bästa resultatet genom att lägga till ett P2-lager, vilket ger en förbättring från 0,701 till 0,724 av mAP. Efter att ha kombinerat YOLO med DeepSORT för att implementera spårningsfunktionen, blir den slutliga spårningseffekten bättre.
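The SPD-Conv result can be made more concrete with a sketch of the rearrangement it is built on. Assuming SPD here denotes the usual space-to-depth operation, a feature map is split into interleaved sub-maps that are stacked as channels, so resolution is halved without discarding any values (unlike a strided convolution). The single-channel list-of-lists layout and the channel ordering below are simplifying assumptions.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange an H x W grid into scale*scale grids of size (H/scale) x (W/scale).

    Every input value appears exactly once in the output, so fine detail
    from small objects is moved into channels rather than thrown away.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    channels = []
    for dy in range(scale):
        for dx in range(scale):
            # Take every scale-th row starting at dy and every scale-th column at dx.
            channels.append([row[dx::scale] for row in feature_map[dy::scale]])
    return channels
```

In an SPD-Conv style block, a non-strided convolution would then be applied to the stacked channels; that step needs a tensor library and is left out of this sketch.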
|