1 |
Evaluation of Tree Planting using Computer Vision models YOLO and U-Net. Liszka, Sofie, January 2023 (has links)
Efficient and environmentally responsible tree planting is crucial to sustainable land management. Tree planting processes involve significant machinery and labor, impacting efficiency and ecosystem health. In response, Södra Skogsägarna introduced the BraSatt initiative to develop an autonomous planting vehicle called E-Beaver. This vehicle aims to simultaneously address efficiency and ecological concerns by autonomously planting saplings in clear-felled areas. BIT ADDICT, partnering with Södra Skogsägarna, is responsible for developing the control system for E-Beaver's autonomous navigation and perception. In this thesis work, we examine the possibility of using the computer vision models YOLO and U-Net for detecting and segmenting newly planted saplings in a clear-felled area. We also compare the models' performances with and without augmenting the dataset to see whether that yields better-performing models. RGB and RGB-D images were gathered with the ZED 2i stereo camera. Two different models are presented, one for detecting saplings in RGB images taken from a top-down perspective and the other for segmenting sapling trunks in RGB-D images taken from a side perspective. The purpose of this thesis work is to use the models to evaluate the planting of newly planted saplings so that autonomous tree planting can be carried out. The outcomes of this research show that YOLOv8s has great potential in detecting tree saplings from a top-down perspective, and the YOLOv8s-seg models in segmenting sapling trunks. The YOLOv8s-seg models performed significantly better at segmenting the trunks compared to the U-Net models. The research contributes insights into using computer vision for efficient and ecologically sound tree planting practices, poised to reshape the future of sustainable land management. / BraSatt
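The thesis does not publish its training configuration, but the two model types it compares can be set up with the Ultralytics API roughly as follows. The dataset YAML file names, epoch count, and augmentation values below are illustrative assumptions, not the settings used in the study.

```python
from ultralytics import YOLO

# Detector for top-down RGB images of newly planted saplings.
detector = YOLO("yolov8s.pt")
detector.train(
    data="saplings_topdown.yaml",   # hypothetical dataset config
    epochs=100,
    imgsz=640,
    # Augmentation settings for the augmented variant; set mosaic=0.0,
    # fliplr=0.0, and the hsv_* values to 0.0 for the non-augmented run.
    hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, fliplr=0.5, mosaic=1.0,
)

# Instance-segmentation model for sapling trunks in side-view RGB-D frames
# (only the RGB channels are fed to the network here).
segmenter = YOLO("yolov8s-seg.pt")
segmenter.train(data="sapling_trunks_side.yaml", epochs=100, imgsz=640)

metrics = segmenter.val()           # evaluate on the held-out split
print(metrics.seg.map50)            # mask mAP@0.5
```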
|
2 |
[pt] APERFEIÇOANDO MODELOS DE SLAM VISUAIS PELA COMBINAÇÃO DA ESTIMAÇÃO DE PROFUNDIDADE, SEGMENTAÇÃO SEMÂNTICA E REMOÇÃO DE OBJETOS DINÂMICOS USANDO MODELOS FUNDACIONAIS VISUAIS / [en] IMPROVING VISUAL SLAM BY COMBINING DEPTH ESTIMATION, SEMANTIC SEGMENTATION, AND DYNAMIC OBJECT REMOVAL USING VISUAL FOUNDATION MODELS. Pedro Thiago Cutrim dos Santos, 28 November 2024 (has links)
The goal of a SLAM (Simultaneous Localization and Mapping) system is to estimate the camera's trajectory in space while reconstructing an accurate map of the surrounding environment. Its definition can be explained in two parts: the first, mapping an unknown environment, and the second, localizing the agent in that environment through the available sensors. Among the different types of sensors, cameras have lower operating costs while providing a rich amount of environmental information that allows for more precise recognition and mapping. Because of this, solutions that use only a camera as the main sensor, called Visual SLAM systems, are of great interest. This work proposes an adaptation of a Visual SLAM system that requires only a camera as its main sensor and uses Visual Foundation Models to generate depth images that improve the robustness of mapping and localization in the environment. Additionally, such a system should also be capable of identifying dynamic elements in the environment and removing them from the map through the use of computer vision models. Finally, it should be viable for real-time applications.
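The abstract does not name the specific visual foundation model used for depth. As one hedged illustration of the depth-generation step, the sketch below uses MiDaS from PyTorch Hub to turn a single RGB frame into a dense, relative depth map that an RGB-D style SLAM front end could consume; any monocular depth foundation model with a similar interface could be substituted, and the frame path is hypothetical.

```python
# Sketch of producing a depth image from one RGB frame with a monocular depth
# foundation model. MiDaS is a stand-in; the thesis does not name its model.
import cv2
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").to(device).eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform

frame = cv2.cvtColor(cv2.imread("frame_000123.png"), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    pred = midas(transform(frame).to(device))
    # Resize the relative-depth map back to the original resolution.
    depth = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=frame.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze().cpu().numpy()

# 'depth' is relative (unscaled); a metric SLAM pipeline would still need a
# scale estimate before feeding it to the RGB-D tracking thread.
```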
|
3 |
Exploring the Depth-Performance Trade-Off : Applying Torch Pruning to YOLOv8 Models for Semantic Segmentation Tasks / Utforska kompromissen mellan djup och prestanda : Tillämpning av Torch Pruning på YOLOv8-modeller för uppgifter om semantisk segmentering. Wang, Xinchen, January 2024 (has links)
In order to comprehend environments from different aspects, a large variety of computer vision methods have been developed to detect objects, classify objects, or even segment them semantically. Semantic segmentation is growing in significance due to its broad applications in fields such as robotics, environmental understanding for virtual or augmented reality, and autonomous driving. The development of convolutional neural networks, as a powerful tool, has contributed to solving classification and object detection tasks, with a trend toward larger and deeper models. It is hard to compare models from the perspective of depth since they are of different structures. At the same time, semantic segmentation is computationally demanding because it requires classifying each pixel into certain classes. Running these complicated processes on resource-constrained embedded systems may cause performance degradation in terms of inference time and accuracy. Network pruning, a model compression technique that aims to eliminate redundant parameters in a model based on a certain evaluation rule, is one solution. Most traditional network pruning methods, structural or non-structural, apply zero masks to cover the original parameters rather than literally eliminating the connections. A newer pruning method, Torch-Pruning, provides a general-purpose library for structural pruning. This method is based on the dependency between parameters, and it can remove groups of less important parameters and reconstruct the model. Yolov8, a cutting-edge research work addressing several computer vision tasks, provides several pre-trained models, from nano, small, and medium to large and xlarge, with similar structure but different numbers of parameters for different applications. This thesis applies Torch-Pruning to Yolov8 semantic segmentation models to compare the performance of pruning based on existing models with similar structures, which makes it meaningful to compare model depth as a factor. Several configurations of the pruning have been explored. The results show that greater depth does not always lead to better performance. Besides, pruning can provide greater generalization ability for medium-level Gaussian noise, from 20% to 40%, compared with the original models.
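A minimal sketch of the pruning step described above, assuming the torch-pruning library's magnitude-based group pruner and an Ultralytics YOLOv8s-seg checkpoint. Keyword names such as ch_sparsity differ between torch-pruning releases, and which head layers must be excluded depends on the model version, so this is an illustration rather than the thesis's actual configuration.

```python
# Structural channel pruning of a YOLOv8 segmentation model with
# torch-pruning. Keyword names vary across torch-pruning releases
# (ch_sparsity vs. pruning_ratio); this follows the 1.x-era API.
import torch
import torch_pruning as tp
from ultralytics import YOLO

yolo = YOLO("yolov8s-seg.pt")
model = yolo.model                      # underlying torch.nn.Module
example_inputs = torch.randn(1, 3, 640, 640)

# Keep the segmentation head intact; prune only backbone/neck layers.
ignored_layers = [model.model[-1]]

pruner = tp.pruner.MagnitudePruner(
    model,
    example_inputs,
    importance=tp.importance.MagnitudeImportance(p=2),  # L2 filter importance
    ch_sparsity=0.3,              # remove ~30% of channels, grouped by dependency
    ignored_layers=ignored_layers,
)
pruner.step()                     # physically removes parameter groups

# The pruned model is then fine-tuned with the usual Ultralytics training loop
# to recover accuracy before comparing against the unpruned baselines.
```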
|
4 |
Development of a Real-Time Safety System for Robotic Arms Using Computer Vision and Predictive Modeling : Enhancing Industrial Safety through YOLOv8, Kalman Filtering, and Dead Reckoning. Arabzadeh, Koray Aman, January 2024 (has links)
In industrial environments, ensuring human safety around robotic arms is crucial to prevent severe injuries from accidents. This study aims to develop a real-time hazard detection system using computer vision and predictive modeling techniques to improve safety. By combining the YOLOv8 object detection algorithm with Kalman Filtering (KF) and Dead Reckoning (DR), the system can detect human presence and predict movements to reduce the risk of accidents. The first experiment shows that KF outperforms DR, especially in linear movements, with lower Mean Absolute Error (MAE) and Mean Squared Error (MSE). The second experiment demonstrates that integrating KF with YOLOv8 results in higher precision, accuracy, and balanced accuracy, although recall still needs improvement. These findings indicate that combining computer vision with predictive modeling has significant potential to enhance human safety. However, further research and testing in diverse scenarios are necessary before real-world deployment.
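The filtering step can be illustrated with a plain constant-velocity Kalman filter over detected pedestrian positions, contrasted with simple dead reckoning. The sketch below is self-contained NumPy; the frame rate and noise covariances are assumptions, not the values used in the study.

```python
# Constant-velocity Kalman filter over 2D pixel positions from a detector,
# next to plain dead reckoning. State is [x, y, vx, vy].
import numpy as np

dt = 1.0 / 30.0                       # frame interval (assumed 30 FPS)
F = np.array([[1, 0, dt, 0],          # state transition (constant velocity)
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
H = np.array([[1, 0, 0, 0],           # only position is measured
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 1e-2                  # process noise (assumed)
R = np.eye(2) * 5.0                   # measurement noise in pixels^2 (assumed)

x = np.zeros(4)                       # state estimate
P = np.eye(4) * 100.0                 # state covariance

def kf_step(x, P, z):
    """One predict/update cycle given measurement z = [cx, cy]."""
    x = F @ x                         # predict
    P = F @ P @ F.T + Q
    y = z - H @ x                     # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    x = x + K @ y                     # update
    P = (np.eye(4) - K @ H) @ P
    return x, P

def dead_reckoning(prev, prev2):
    """Extrapolate the next position from the last two observations."""
    return prev + (prev - prev2)

# Example: feed bounding-box centres (e.g. from a detector) frame by frame.
for z in [np.array([100., 200.]), np.array([103., 202.]), np.array([106., 204.])]:
    x, P = kf_step(x, P, z)
print(x[:2], x[2:])                   # filtered position and velocity estimate
```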
|
5 |
Deep Neural Networks for Object Detection in Satellite Imagery. Fritsch, Frederik, January 2023 (has links)
With the development of small satellites, it has become easier and cheaper to deploy satellites for earth observation from space. While optical sensors capture high-resolution data, this data is traditionally sent to earth for analysis, which places a heavy load on the data link and increases the time needed for data-based decisions. This thesis explores the possibilities of deploying an AI model in small satellites for detecting objects in satellite imagery and thereby reducing the amount of data that needs to be transmitted. The neural network model YOLOv8 was trained on the xView and DIOR datasets and evaluated in a hardware-restricted execution environment. The model achieved a mAP50 of 0.66 and could process satellite images at a speed of 309 m²/s.
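One way to picture the on-board processing and the m²/s throughput figure is to tile a large scene into 640-pixel crops, run the detector on each crop, and convert elapsed time into ground area covered. The model variant, scene file, and ground sample distance below are assumptions for illustration only.

```python
# Tile a large satellite scene into 640x640 crops for on-board YOLOv8
# inference and convert throughput into ground area per second.
import time
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")            # a small variant suits constrained hardware
scene = cv2.imread("scene.png")       # hypothetical test scene
tile, gsd = 640, 0.3                  # tile size in px, assumed 0.3 m/px GSD

tiles = [
    scene[y:y + tile, x:x + tile]
    for y in range(0, scene.shape[0] - tile + 1, tile)
    for x in range(0, scene.shape[1] - tile + 1, tile)
]

start = time.perf_counter()
for crop in tiles:
    model.predict(crop, imgsz=tile, verbose=False)
elapsed = time.perf_counter() - start

area_m2 = len(tiles) * (tile * gsd) ** 2
print(f"throughput: {area_m2 / elapsed:.0f} m^2/s over {len(tiles)} tiles")
```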
|
6 |
Investigation regarding the Performance of YOLOv8 in Pedestrian Detection / Undersökning angående YOLOv8s prestanda att detektera fotgängare. Jönsson Hyberg, Jonatan; Sjöberg, Adam, January 2023 (has links)
Autonomous cars have become a trending topic as cars become better and better at driving autonomously. One of the big changes that has allowed autonomous cars to progress is the improvement in machine learning. Machine learning has made autonomous cars able to detect and react to obstacles on the road in real time. As in all machine learning, no single solution works better than all others; each solution has different strengths and weaknesses. That is why this study has tried to find the strengths and weaknesses of the object detector You Only Look Once v8 (YOLOv8) in autonomous cars. YOLOv8 was tested for how fast and accurately it could detect pedestrians in traffic in normal daylight images and in light-augmented images. The trained YOLOv8 model was able to learn to detect pedestrians with high accuracy on daylight images, achieving a mean Average Precision 50 (mAP50) of 0.874 at 67 frames per second (FPS). However, the model struggled in particular as the images became darker, which means that YOLOv8 in its current state might not be suitable as the main detector for autonomous cars, as the detector loses accuracy at night. More tests with other datasets are needed to find all strengths and weaknesses of YOLOv8.
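The light augmentation described above can be approximated by darkening evaluation images with a gamma curve before running the detector, as in the sketch below. The gamma values and image path are illustrative assumptions rather than the augmentation actually used in the study.

```python
# Progressively darken a test image with a gamma curve, then run detection.
import cv2
import numpy as np
from ultralytics import YOLO

def darken(img, gamma):
    """gamma > 1 darkens the image; gamma == 1 leaves it unchanged."""
    lut = (np.linspace(0, 1, 256) ** gamma * 255).astype(np.uint8)
    return cv2.LUT(img, lut)

model = YOLO("yolov8s.pt")
img = cv2.imread("pedestrian_scene.jpg")   # hypothetical daylight test image

for gamma in (1.0, 2.0, 3.0):
    result = model.predict(darken(img, gamma), verbose=False)[0]
    print(f"gamma={gamma}: {len(result.boxes)} detections")
```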
|
7 |
Road damage detection with Yolov8 on Swedish roads. Eriksson, Martin, January 2023 (has links)
This thesis addresses the problem of road damage detection using the object detection models Yolov8 and Yolov5. While Yolov5 has been utilized in prior road damage detection projects, this work introduces the application of the newly released Yolov8 model to this domain. We have prepared a dataset of 3,000 annotated images of road damage in Sweden and applied various Yolov8 and Yolov5 models to this dataset and a larger international one. The potential of deploying a lightweight Yolov8 model in a smartphone application for real-time detection, as well as the effectiveness of an ensemble approach combining several models, were also explored. The results show an F1 score of 0.57 and 0.6 for the best-performing models on the Swedish dataset and an international road damage dataset, respectively. Several box clustering methods were tested to combine the predictions of the ensemble, but none outperformed the best individual model. A quantized version of Yolov8 was deployed to a smartphone device with satisfying performance. This work aims to create a model which can ultimately be used to improve road safety and quality.
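The abstract does not specify which box clustering methods were tried; as one hedged illustration, the sketch below pools boxes from several models and merges them with a plain greedy non-maximum suppression written in NumPy.

```python
# Self-contained sketch of one simple box-clustering scheme for an ensemble:
# pool predictions from several models and apply greedy NMS across them.
import numpy as np

def iou(a, b):
    """IoU between one box a and an array of boxes b, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(a[0], b[:, 0]); y1 = np.maximum(a[1], b[:, 1])
    x2 = np.minimum(a[2], b[:, 2]); y2 = np.minimum(a[3], b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def ensemble_nms(boxes, scores, iou_thr=0.5):
    """boxes: (N, 4) pooled from all models, scores: (N,). Returns kept indices."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) < iou_thr]
    return keep

# Example: predictions from two models on the same image (hypothetical values).
boxes = np.array([[10, 10, 50, 60], [12, 11, 52, 58], [200, 80, 260, 140]], float)
scores = np.array([0.91, 0.85, 0.40])
print(ensemble_nms(boxes, scores))   # -> [0, 2]
```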
|
8 |
Detection of Oral Cancer From Clinical Images using Deep Learning. Solanki, Anusha, 0009-0006-9086-9165, 05 1900 (has links)
Objectives: To detect and distinguish oral malignant and non-malignant lesions in clinical photographs using the YOLO v8 deep learning algorithm.
Methods: This is a diagnostic study conducted using clinical images of oral cavity lesions. A total of 427 clinical images of the oral cavity were extracted from publicly available dataset repositories, specifically the Kaggle and Mendeley data repositories. The images were then categorized into normal, abnormal (non-malignant), and malignant oral lesions by two independent oral pathologists using Roboflow Annotation Software. The images were first set to a resolution of 640 x 640 pixels and then randomly split into three sets, training, validation, and testing, in a 70:20:10 ratio. Finally, the image classification analysis was performed using the YOLO v8 classification algorithm at 20 epochs to classify and distinguish between malignant lesions, non-malignant lesions, and normal tissue. The performance of the algorithm was assessed using the following parameters: accuracy, precision, sensitivity, and specificity.
Results: After training and validation with 20 epochs, our oral cancer image classification algorithm showed maximum performance at 15 epochs. Based on the generated normalized confusion matrix, the sensitivity of our algorithm in classifying normal, non-malignant, and malignant images was 71%, 47%, and 54%, respectively. The specificity in classifying normal, non-malignant, and malignant images was 86%, 65%, and 72%, respectively. The precision in classifying normal, non-malignant, and malignant images was 73%, 62%, and 35%, respectively. The overall accuracy of our oral cancer image classification algorithm was 55%. On a test set, our algorithm gave an overall 96% accuracy in detecting malignant lesions.
Conclusion: Our object classification algorithm showed promising ability to distinguish between malignant, non-malignant, and normal tissue. Further studies and continued research should place increasing emphasis on the use of artificial intelligence to improve early detection of oral cancer and pre-cancerous lesions.
Keywords: Normal, Non-malignant, Malignant lesions, Image classification, Roboflow annotation software, YOLO v8 object/image classification algorithm. / Oral Biology
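The per-class sensitivity, specificity, and precision quoted in the Results follow directly from a 3x3 confusion matrix. The sketch below shows the arithmetic on a hypothetical matrix; the counts are not the study's data.

```python
# Per-class sensitivity, specificity, and precision from a confusion matrix
# (rows = true class, columns = predicted class). Counts are hypothetical.
import numpy as np

classes = ["normal", "non-malignant", "malignant"]
cm = np.array([[40,  6,  4],
               [ 8, 25, 12],
               [ 5,  9, 30]])

for k, name in enumerate(classes):
    tp = cm[k, k]
    fn = cm[k, :].sum() - tp
    fp = cm[:, k].sum() - tp
    tn = cm.sum() - tp - fn - fp
    sensitivity = tp / (tp + fn)        # recall for this class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    print(f"{name}: sens={sensitivity:.2f} spec={specificity:.2f} prec={precision:.2f}")

accuracy = np.trace(cm) / cm.sum()      # overall accuracy
print(f"accuracy={accuracy:.2f}")
```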
|
9 |
Deep-Learning Conveyor Belt Anomaly Detection Using Synthetic Data and Domain Adaptation. Fridesjö, Jakob, January 2024 (has links)
Conveyor belts are essential components used in the mining and mineral processing industry to transport granular material and objects. However, foreign objects or other anomalies transported along the conveyor belts can result in catastrophic and costly consequences. A solution to the problem is to use machine vision systems based on AI algorithms to detect anomalies before any incidents occur. The challenge, however, is to obtain sufficient training data when images containing anomalous objects are, by definition, scarce. This thesis investigates how synthetic data generated by a granular simulator can be used to train a YOLOv8-based model to detect foreign objects in a real-world setting. Furthermore, the domain gap between the synthetic data domain and the real-world data domain is bridged by utilizing style transfer through CycleGAN. Results show that using YOLOv8s-seg for instance segmentation of conveyors is possible even when the model is trained on synthetic data. It is also shown that domain adaptation by style transfer using CycleGAN can improve the performance of the synthetic model, even when the real-world data lacks anomalies.
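A hedged sketch of the domain-adaptation step: pass each synthetic render through a CycleGAN generator (synthetic-to-real direction) before training the segmentation model. The TorchScript generator file, folder layout, and dataset YAML are assumptions; the thesis's actual CycleGAN training is not reproduced here.

```python
# Translate synthetic renders with a pretrained CycleGAN generator, then train
# the segmentation model on the style-transferred set.
from pathlib import Path

import cv2
import torch
from ultralytics import YOLO

device = "cuda" if torch.cuda.is_available() else "cpu"
generator = torch.jit.load("cyclegan_syn2real.pt").to(device).eval()  # assumed export

src, dst = Path("synthetic/images"), Path("synthetic_styled/images")
dst.mkdir(parents=True, exist_ok=True)

for path in src.glob("*.png"):
    img = cv2.cvtColor(cv2.imread(str(path)), cv2.COLOR_BGR2RGB)
    x = torch.from_numpy(img).permute(2, 0, 1).float().div(127.5).sub(1)  # [-1, 1]
    with torch.no_grad():
        y = generator(x.unsqueeze(0).to(device)).squeeze(0).cpu()
    out = ((y.clamp(-1, 1) + 1) * 127.5).round().byte().permute(1, 2, 0).numpy()
    cv2.imwrite(str(dst / path.name), cv2.cvtColor(out, cv2.COLOR_RGB2BGR))

# Train the segmentation model on the style-transferred synthetic set.
YOLO("yolov8s-seg.pt").train(data="conveyor_synthetic_styled.yaml", epochs=100)
```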
|
10 |
Artificiell intelligens inom IT-forensik : Kan AI effektivisera brottsutredningar [Artificial intelligence in digital forensics: can AI make criminal investigations more efficient]. Carlsson, Felix; Rapp, Ted, January 2024 (has links)
Artificial intelligence is a rapidly developing field that makes it possible to automate and streamline work tasks, which may be needed now that we generate more data than ever before. The purpose of this thesis was to investigate the potential of integrating AI into digital forensic criminal investigations. Through a literature review, it was shown how different AI techniques could be applied to support today's digital forensic investigators. An experiment also demonstrated how the AI application "object detection" could facilitate digital forensic work in image analysis.
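As a hedged illustration of the object-detection experiment mentioned above, the sketch below runs a pretrained YOLOv8 model over a folder of case images and logs which files contain object classes of interest. The folder name and class list (drawn from the COCO label set of the pretrained weights) are assumptions.

```python
# Triage a folder of case images: flag files containing classes of interest.
from pathlib import Path
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
classes_of_interest = {"knife", "laptop", "cell phone", "scissors"}  # hypothetical

for path in Path("case_images").glob("*.jpg"):
    result = model.predict(str(path), verbose=False)[0]
    found = {result.names[int(c)] for c in result.boxes.cls}
    hits = found & classes_of_interest
    if hits:
        print(f"{path.name}: {sorted(hits)}")
```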
|