1 |
Drone Detection and Classification using Machine Learning
Shafiq, Khurram 26 September 2023
Unmanned aerial vehicles (UAVs) are a source of entertainment and a pleasurable experience, attracting many young people to pursue flying them as a hobby. As the number of UAVs grows, so does the risk of their being used for malicious purposes. In addition, birds and UAVs maneuver very similarly in flight. UAVs can also carry a significant payload, which can have unintended consequences. Therefore, detecting UAVs near red-zone areas is an important problem. Moreover, small UAVs can record video from large distances without being spotted by the naked eye. An appropriate network of sensors may be needed to detect the arrival of such entities from a safe distance before they pose any danger to the surrounding areas.
Despite the growing interest in UAV detection, limited research has been conducted in this area due to a lack of available data for model training. This thesis proposes a novel approach to address this challenge by leveraging experimental data collected in real time using high-sensitivity sensors instead of relying solely on simulations. This approach allows for improved model accuracy and a better representation of the complex and dynamic environments in which UAVs operate, which are difficult to simulate accurately. The thesis further explores the application of machine learning and sensor fusion algorithms to detect UAVs and distinguish them from other objects, such as birds, in real time. Specifically, the thesis utilizes YOLOv3 with Deep SORT and sensor fusion algorithms to achieve accurate UAV detection.
In this study, we employed YOLOv3, a deep learning model known for its high efficiency and complexity, to facilitate real-time drone versus bird detection. To further enhance the reliability of the system, we incorporated sensor fusion, leading to a more stable and accurate real-time system and reducing the incidence of false detections. Our study indicates that the YOLOv3 model outperformed the state-of-the-art models in terms of both speed and robustness, achieving confidence scores above 95%. Moreover, the YOLOv3 model demonstrated a promising capability in real-time drone versus bird detection, which suggests its potential for practical applications.
|
2 |
A tracking framework for a dynamic non-stationary environment / Ett spårningsramverk för en dynamisk icke-stationär miljö
Ståhl, Sebastian January 2020
As the use of unmanned aerial vehicles (UAVs) increases in popularity across the globe, their fields of application are constantly growing. This thesis researches the possibility of using a UAV to detect, track, and geolocate a target in a dynamic non-stationary environment such as the sea. In this case, the varying projection and apparent size of the target in the captured images can lead to ambiguous assignments of coordinates. In this thesis, a framework based on a UAV, a monocular camera, a GPS receiver, and the UAV's inertial measurement unit (IMU) is developed to perform the task of detecting, tracking, and geolocating targets. An object detection model called YOLOv3 was retrained to detect boats in UAV footage. This model was selected for its ability to detect targets of small apparent size and for its speed. A model called the kernelized correlation filter (KCF) is adopted as the visual tracking algorithm. This tracker was selected for its speed and accuracy. A reinitialization of the tracker combined with a periodic update of the tracked bounding box was implemented, which improved the performance of the tracker. A geolocation method is developed to continuously estimate the GPS coordinates of the target. These estimates will be used by the flight control method already developed by the stakeholder Airpelago to control the UAV. The experimental results are promising for all models. Due to inaccurate data, the true accuracy of the geolocation method cannot be determined. The average error calculated with the inaccurate data is 19.5 meters. However, an in-depth analysis of the results indicates that the method is more accurate than this figure suggests. Hence, it is assumed that the model can estimate the GPS coordinates of a target with an error significantly lower than 19.5 meters. 
Thus, it is concluded that it is possible to detect, track, and geolocate a target in a dynamic non-stationary environment such as the sea. / The use of drones is increasing in popularity around the world, which means their fields of application are growing. This thesis investigates the possibility of using a drone to detect, track, and locate a target in a dynamic non-stationary environment such as the sea. The target's varying position and size in the images lead to ambiguous data. In this thesis, a framework based on a drone, a monocular camera, a GPS receiver, and the drone's IMU sensor is developed to perform detection, tracking, and localization of the target. An object detection model called YOLOv3 was trained to detect boats in images taken from a drone. This model was chosen for its ability to detect small targets and for its speed. A model abbreviated KCF is used as the visual tracking algorithm. This algorithm was chosen for its speed and precision. A reinitialization of the tracking algorithm combined with a periodic update of the tracked bounding box is implemented, which improves the tracker's performance. A localization method is developed to continuously estimate the GPS coordinates of the target. These estimates will be used by a flight control method already developed by Airpelago to steer the drone. The experimental results are promising for all models. Because of unreliable data, the precision of the localization method cannot be determined with certainty. However, an in-depth analysis of the results indicates that the method is more accurate than the average error calculated with the unreliable data, which is 19.5 meters. It is therefore assumed that the model can estimate the GPS coordinates of a target with an error lower than 19.5 meters. 
It is thus concluded that it is possible to detect, track, and geolocate a target in a dynamic non-stationary environment such as the sea.
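The abstract above reports a geolocation error in meters. A minimal sketch of how such an error could be computed from paired GPS coordinates is given below, assuming the great-circle (haversine) distance on a spherical Earth. The function names and the choice of haversine are illustrative assumptions, not the thesis's stated method.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mean_geolocation_error_m(estimates, references):
    """Average distance between paired (lat, lon) estimates and references."""
    errors = [haversine_m(*e, *r) for e, r in zip(estimates, references)]
    return sum(errors) / len(errors)
```

Each estimate would be paired with a reference position logged at the same time. If the reference positions are themselves unreliable, as the abstract notes, the resulting average is only an upper-level indication of the error.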
|
3 |
Detection of Aircraft, Vehicles and Ships in Aerial and Satellite Imagery using Evolutionary Deep Learning
Thoudoju, Akshay Kumar January 2021
Background. The view of the Earth from above can offer a lot of data, and with technological advancements in image sensors and high-resolution satellite images there is a greater quantity and quality of data, which can be useful in research and in applications such as military use and climate monitoring. Deep neural networks have been successful in object detection, and their learning process can be improved by using the right hyperparameters when configuring the networks. This can be done through hyperparameter optimization using genetic algorithms. Objectives. The thesis focuses on obtaining deep learning techniques with optimal hyperparameters, using a genetic algorithm, to detect aircraft, vehicles and ships in satellite and aerial images, and on comparing the optimized models with the original deep learning models. Methods. The study uses a literature review to identify appropriate deep learning techniques for object detection in satellite and aerial images, followed by an experiment that implements a genetic algorithm to find the right hyperparameters, and then by another experiment that compares the performance of the optimized and original deep learning models on the basis of the performance metrics defined in the study. Results. The literature review shows that the deep learning techniques for object detection in satellite and aerial images are Faster R-CNN, SSD and YOLO. The results of the experiments show that the genetic algorithm was successful in finding optimal hyperparameters. The accuracy achieved by the optimized models was higher than that of the original models for aircraft, vehicle and ship detection. The results also show that the training times for the models were reduced with the use of optimal hyperparameters, with a slight decrease in precision when detecting ships. Conclusions. After analyzing all the results carefully, the best deep learning techniques to detect aircraft, vehicles and ships are found and stated. 
The implementation of the genetic algorithm has been successful as it provided a set of hyperparameters which resulted in the improvement of accuracy, precision and recall in all scenarios except for values of precision in ship detection as well as improvement in training times.
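Hyperparameter search with a genetic algorithm, as described in this abstract, can be sketched roughly as follows. This is a toy illustration under stated assumptions: the search space, the `toy_fitness` stand-in (a real run would train a detector and return a validation metric), and all parameter values are hypothetical and not taken from the thesis.

```python
import random

# Hypothetical search space: hyperparameter name -> (low, high)
SEARCH_SPACE = {"learning_rate": (1e-4, 1e-1), "momentum": (0.5, 0.99)}


def random_individual(rng):
    """Sample one candidate hyperparameter set uniformly from the search space."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}


def toy_fitness(ind):
    """Stand-in for validation accuracy; peaks at lr=0.01, momentum=0.9."""
    return -100 * (ind["learning_rate"] - 0.01) ** 2 - (ind["momentum"] - 0.9) ** 2


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {k: rng.choice((a[k], b[k])) for k in SEARCH_SPACE}


def mutate(ind, rng, rate=0.3):
    """Add Gaussian noise to some genes, clamped to the search bounds."""
    out = dict(ind)
    for k, (lo, hi) in SEARCH_SPACE.items():
        if rng.random() < rate:
            out[k] = min(hi, max(lo, out[k] + rng.gauss(0, 0.1 * (hi - lo))))
    return out


def evolve(fitness, generations=30, pop_size=20, elite=4, seed=0):
    """Elitist genetic algorithm: keep the best, breed the rest from them."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    return max(population, key=fitness)
```

Because the top individuals are carried over unchanged each generation, the best fitness never decreases. In practice each fitness evaluation is a full training run, so population size and generation count are usually kept small.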
|
4 |
Analys av inskannade arkiverade dokument med hjälp av objektdetektering uppbyggt på AI
Svedberg, Malin January 2020
Around the world there is a large number of historical documents that exist only in paper form. Digitizing these documents simplifies, among other things, their storage and distribution. When digitizing documents, it is usually not enough to merely scan them and store them as images; there is usually a desire to handle the information the documents contain in various ways, for example to search for particular information or to sort the documents by the information they contain. There are different ways to digitize documents and extract the information on them. This study uses YOLOv3 object detection to find and distinguish different regions on historical documents in the form of old registration cards for old Swedish vehicles. The object detector is trained on a custom-made training dataset, and training is carried out with the Darknet framework. The study reports results in terms of recall, precision, and IoU for several object detection models trained on different training datasets and tested on a number of different test datasets. The results are analyzed with respect to, among other things, the size and color of the training data and the amount of training of the object detector.
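The three metrics named in this abstract (recall, precision and IoU) have standard definitions, sketched below. The box format `(x_min, y_min, x_max, y_max)` and the function names are assumptions made for illustration; the study's own evaluation code is not shown in the record.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A detection is typically counted as a true positive when its IoU with a ground-truth box exceeds a chosen threshold, which is how IoU feeds into the precision and recall counts.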
|
5 |
A Comparative study of YOLO and Haar Cascade algorithm for helmet and license plate detection of motorcycles
Mavilla Vari Palli, Anusha Jayasree, Medimi, Vishnu Sai January 2022
Background: Every country has seen an increase in motorcycle accidents over the years due to social and economic differences as well as regional variations in transportation circumstances. The motorbike is a common mode of transportation for the middle class. Every motorbike rider is legally required to wear a helmet when riding. However, some riders ignore their own safety and violate traffic rules by riding without a helmet. Police have tried to address this issue manually, but this has proved ineffective and quite challenging in practice. Therefore, automating this procedure is essential to enforce road safety effectively. Automated systems have accordingly been created using a variety of techniques, including Convolutional Neural Networks (CNN), the Haar Cascade Classifier, You Only Look Once (YOLO), the Single Shot MultiBox Detector (SSD), etc. In this study, YOLOv3 and the Haar Cascade Classifier are compared for motorcycle helmet and license plate detection. Objectives: This thesis aims to compare machine learning algorithms that detect motorcycles' license plates and helmets. The Haar Cascade Classifier and YOLO algorithms are trained on the US License Plates and Helmet Detection datasets, and their accuracy in detecting the helmets and license plates of motorcycles is measured and analyzed. Methods: An experiment is chosen as the method to answer the research question. It measures the accuracy of the models in detecting the helmets and license plates of motorcycles. The datasets used are from Kaggle and include 764 pictures of two distinct classes, i.e., with and without a helmet, along with 447 unique license plate images. Before training the models, preprocessing techniques are applied to the US License Plates and Helmet Detection datasets. 
The datasets are then divided into training and test sets, with 80% used for training and 20% for testing. The models are trained on the pre-processed training sets using the Haar Cascade Classifier and YOLOv3 algorithms. The trained models are then evaluated on the 20% test data. Finally, the prediction results of the two models are recorded and the accuracy is measured by generating a confusion matrix. Results: The experiment identifies the more efficient algorithm for detecting the helmets and license plates of motorcycles. Based on the results, the YOLOv3 algorithm is more accurate in detecting motorcycles' helmets and license plates. Conclusions: Models are trained using the Haar Cascade and YOLOv3 algorithms on the US License Plates and Helmet Detection training datasets. The accuracy of the models in detecting the helmets and license plates of motorcycles is checked using the test datasets. The model trained using the YOLOv3 algorithm has high accuracy; hence, the neural-network-based YOLOv3 technique is judged to be the best and most efficient.
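The evaluation procedure in this abstract (an 80/20 split followed by accuracy from a confusion matrix) can be sketched as below. The helper names, the fixed seed and the label strings are assumptions for illustration; only the 80/20 ratio and the 764-image helmet dataset size come from the abstract.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle items reproducibly and split them into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

With 764 helmet images, this split would hold out 153 images for testing and keep 611 for training.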
|
6 |
An automated validation of a cleared-out storage unit during move-out : A RoomPlan solution integrated with image classification
Rimhagen, Elsa January 2024
The efficient management of storage units requires a reliable and streamlined move-out process. Manual validation methods are resource intensive. The purpose of this thesis project is therefore to introduce an automated approach that capitalises on modern smartphone capabilities to improve the move-out validation process. The proposed solution is a Proof of Concept (POC) application that utilises the Light Detection and Ranging (LiDAR) sensor and camera of a modern iPhone. This is performed through RoomPlan, a framework developed for real-time, indoor room scanning. It generates a 3D model of the room with its key characteristics. Moreover, to increase the number of detectable object categories, the solution is integrated with the image classifier Tiny YOLOv3. The solution is evaluated quantitatively in a storage unit. The evaluation shows that the application can validate whether the storage unit is empty or not in all the completed scans. However, the object detection needs to be improved for the solution to work in a commercial case. Further work therefore includes investigating the possibilities of expanding the object categories within the image classifier or creating a detection pipeline similar to RoomPlan but adjusted for this specific case. The LiDAR sensor proved to be a stable object detector and a successful tool for the task. In contrast, the image classifier had lower detection accuracy in the storage unit.
|
7 |
Machine vision for automation of earth-moving machines : Transfer learning experiments with YOLOv3
Borngrund, Carl January 2019
This master thesis investigates the possibility of creating a machine vision solution for the automation of earth-moving machines. This research was done because, without some type of vision system, it will not be possible to create a fully autonomous earth-moving machine that can safely be used around humans or other machines. Cameras were used as the primary sensors as they are cheap, provide high resolution and are the type of sensor that most closely mimics the human vision system. The purpose of this master thesis was to use existing real time object detectors together with transfer learning and examine whether they can successfully be used to extract information in environments such as construction, forestry and mining. The amount of data needed to successfully train a real time object detector was also investigated. Furthermore, the thesis examines whether there are particularly difficult situations for the defined object detector, how reliable the object detector is and finally how service-oriented architecture principles can be used to create deep learning systems. To investigate the questions formulated above, three data sets were created where different properties were varied. These properties were light conditions, ground material and dump truck orientation. The data sets were created using a toy dump truck together with a similarly sized wheel loader with a camera mounted on the roof of its cab. The first data set contained only indoor images where the dump truck was placed in different orientations but neither the light nor the ground material changed. The second data set contained images where the light source was kept constant, but the dump truck orientation and ground materials changed. The last data set contained images where all properties were varied. The real time object detector YOLOv3 was used to examine how a real time object detector would perform depending on which one of the three data sets it was trained on. 
No matter the data set, it was possible to train a model to perform real time object detection. Using an Nvidia 980 Ti, the inference time of the model was around 22 ms, which is more than fast enough to process video running at 30 fps. Models trained on all three data sets converged to a training loss of around 0.10. The data set containing the most varied data, in which all properties were changed, performed considerably better, reaching a validation loss of 0.164, whereas the indoor data set, containing the least varied data, only reached a validation loss of 0.257. The size of the data set was also a factor in performance, but it was not as important as having varied data. The results also showed that all three data sets could reach a mAP score of around 0.98 using transfer learning.
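The real-time claim in this abstract rests on simple arithmetic: a 30 fps video leaves about 33.3 ms per frame, and a 22 ms inference time fits inside that budget. A small sketch of the check follows; the helper names are illustrative, and the calculation ignores any pre- or post-processing time.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def max_fps(inference_ms):
    """Highest frame rate sustainable if inference is the only per-frame cost."""
    return 1000.0 / inference_ms


def keeps_up(inference_ms, fps):
    """True if one inference fits inside one frame interval."""
    return inference_ms <= frame_budget_ms(fps)
```

Under these assumptions, 22 ms per frame corresponds to roughly 45 frames per second, which leaves some headroom above 30 fps.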
|
8 |
Multi-site Organ Detection in CT Images using Deep Learning / Regionsoberoende organdetektion i CT-bilder med djupinlärning
Jacobzon, Gustaf January 2020
When optimizing a controlled dose in radiotherapy, high resolution spatial information about healthy organs in close proximity to the malignant cells is necessary in order to mitigate dispersion into these organs-at-risk. This information can be provided by deep volumetric segmentation networks, such as 3D U-Net. However, due to limitations of memory in modern graphical processing units, it is not feasible to train a volumetric segmentation network on full image volumes, and subsampling the volume gives too coarse a segmentation. An alternative is to sample a region of interest from the image volume and train an organ-specific network. This approach requires knowledge of which region of the image volume should be sampled, which can be provided by a 3D object detection network. Typically the detection network will also be region specific, albeit for a larger region such as the thorax, and requires human assistance in choosing the appropriate network for a certain region of the body. Instead, we propose a multi-site object detection network based on YOLOv3 trained on 43 different organs, which may operate on arbitrarily chosen axial patches in the body. Our model identifies the organs present (whole or truncated) in the image volume and may automatically sample a region from the input and feed it to the appropriate volumetric segmentation network. We train our model on four small (as few as 20 images) site-specific datasets in a weakly-supervised manner in order to handle the partially unlabeled nature of site-specific datasets. Our model is able to generate organ-specific regions of interest that enclose 92% of the organs present in the test set. / When optimizing a controlled dose in radiotherapy, information about healthy organs, so-called organs at risk, near the malignant cells is required in order to minimize the radiation to these organs. This information can be provided by deep volumetric segmentation networks, for example 3D U-Net. 
Limitations in the memory size of modern graphics cards mean that it is not possible to train a volumetric segmentation network on the whole image volume without first downsampling the volume. However, this leads to a low-resolution segmentation of the organs that is not precise enough to be used in the optimization. An alternative is to process only a region of interest that encloses one or a few organs from the image volume and to train a region-specific network on this smaller volume. However, this approach requires information about which region of the image volume should be sent to the region-specific segmentation network. This information can be provided by a 3D object detection network. As a rule, this network is also region specific, for example for the thorax region, and requires human assistance in choosing the right network for a given region of the body. We instead propose a multi-region detection network based on YOLOv3 that can detect 43 different organs and works on arbitrarily chosen axial windows in the body. Our model identifies the organs present (whole or truncated) in the image and can automatically provide information about which region should be processed by each region-specific segmentation network. We train our model on four small (as few as 20 images) site-specific datasets with weak supervision in order to handle the partially unannotated nature of the datasets. Our model generates an organ-specific region of interest for 92% of the organs present in the test set.
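The headline figure in this abstract is the share of organs fully enclosed by a predicted region of interest. One way such a rate could be computed is sketched below, assuming axis-aligned 3D boxes stored as `(z0, y0, x0, z1, y1, x1)`. The box layout and function names are hypothetical and not taken from the thesis.

```python
def encloses(roi, organ):
    """True if the 3D box `roi` fully contains `organ`.

    Both boxes are (z0, y0, x0, z1, y1, x1) with z0 <= z1, y0 <= y1, x0 <= x1.
    """
    lower_ok = all(roi[i] <= organ[i] for i in range(3))
    upper_ok = all(roi[i] >= organ[i] for i in range(3, 6))
    return lower_ok and upper_ok


def enclosure_rate(pairs):
    """Fraction of (roi, organ) pairs in which the ROI encloses the organ."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(encloses(roi, organ) for roi, organ in pairs) / len(pairs)
```

Here each organ's ground-truth extent would be paired with the region predicted for it, and an organ with no predicted region would count as not enclosed.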
|
9 |
Detekce dopravních značek a semaforů / Detection of Traffic Signs and Lights
Oškera, Jan January 2020
The thesis focuses on modern methods of traffic sign detection and traffic light detection, both directly in traffic and through back analysis. The main subject is convolutional neural networks (CNN). The solution uses convolutional neural networks of the YOLO type. The main goal of this thesis is to optimize the speed and accuracy of the models as far as possible. The thesis examines suitable datasets. A number of datasets, composed of real and synthetic data, are used for training and testing. For training and testing, the data were preprocessed using the Yolo mark tool. The model was trained at a computer center belonging to the virtual organization MetaCentrum VO. To evaluate detector quality quantitatively, a program was created that shows the detector's success statistically and graphically using the ROC curve and the COCO evaluation protocol. In this thesis I created a model that achieved an average success rate of up to 81 %. The thesis shows the best choice of threshold across versions, sizes and IoU. Extensions for mobile phones in TensorFlow Lite and Flutter have also been created.
|
10 |
You Only Gesture Once (YouGo): American Sign Language Translation using YOLOv3
Mehul Nanda (8786558) 01 May 2020
The study focused on creating and proposing a model that could accurately and precisely predict the occurrence of an American Sign Language (ASL) gesture for a letter of the English alphabet using the You Only Look Once (YOLOv3) algorithm. The training dataset used for this study was custom created and was further divided into clusters based on the uniqueness of the ASL sign. Three diverse clusters were created. Each cluster was trained with the network known as darknet. Testing was conducted using images and videos for fully trained models of each cluster, and the Average Precision for each letter in each cluster and the Mean Average Precision for each cluster were noted. In addition, a Word Builder script was created. This script combined the trained models of all three clusters to create a comprehensive system that would build words when the trained models were supplied with images of letters of the English alphabet as depicted in ASL.
|