  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
431

Optimization of Convolutional Neural Networks for Enhanced Compression Techniques and Computer Vision Applications

Couture Del Valle, Christopher Javier 26 July 2022
No description available.
432

The V-SLAM Hurdler : A Faster V-SLAM System using Online Semantic Dynamic-and-Hardness-aware Approximation / V-SLAM Häcklöparen : Ett Snabbare V-SLAM System med Online semantisk Dynamisk-och-Hårdhetsmedveten Approximation

Mingxuan, Liu January 2022
Visual Simultaneous Localization And Mapping (V-SLAM) and object detection algorithms are two critical prerequisites for modern XR applications. V-SLAM allows XR devices to geometrically map the environment and simultaneously localize themselves within it. Furthermore, object detectors based on Deep Neural Networks (DNNs) can be used to semantically understand what the features in the environment represent. However, both of these algorithms are computationally expensive, which makes it challenging for them to achieve good real-time performance on device. In this thesis, we first present TensorRT Quantized YOLOv4 (TRTQ-YOLOv4), a faster implementation of the YOLOv4 architecture [1] using FP16 reduced precision and INT8 quantization, powered by the NVIDIA TensorRT [2] framework. Second, we propose the V-SLAM Hurdler: A Faster V-SLAM System using Online Dynamic-and-Hardness-aware Approximation. The proposed system integrates the base RGB-D V-SLAM system ORB-SLAM3 [3] with the INT8 TRTQ-YOLOv4 object detector, a novel Entropy-based Degree-of-Difficulty Estimator, an Online Hardness-aware Approximation Controller and a Dynamic Object Eraser, applying online dynamic-and-hardness-aware approximation to the base V-SLAM system at runtime while increasing its robustness in dynamic scenes. We first evaluate the proposed object detector on a public object detection dataset. The proposed FP16-precision TRTQ-YOLOv4 is 2× faster than the full-precision model without loss of accuracy, while the INT8-quantized TRTQ-YOLOv4 is almost 3× faster than the full-precision one with only a 0.024 loss in mAP@50:5:95. Second, we evaluate our proposed V-SLAM system on a public RGB-D SLAM dataset. In static scenes, the proposed system speeds up the base V-SLAM system by 21.2% on average with only a 0.7% loss of accuracy. In dynamic scenes, the proposed system not only accelerates the base system by 23.5% but also improves its accuracy by 89.3%, making it as robust as in static scenes. Lastly, a comparison against state-of-the-art SLAM systems designed for dynamic environments shows that our system outperforms most of the compared methods in highly dynamic scenes.
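The INT8 quantization mentioned in this abstract can be illustrated with a minimal sketch. It assumes simple symmetric per-tensor quantization (one scale derived from the largest absolute value); the function names and the sample weights are hypothetical, and this is not TensorRT's calibration procedure or the thesis's implementation.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: returns (integer codes, scale)."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric INT8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map INT8 codes back to approximate floating-point values."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Under this scheme the rounding error per value is bounded by half the scale, which gives some intuition for why a small accuracy loss can accompany the speed-up from 8-bit arithmetic.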
433

Data Augmentation for Safe 3D Object Detection for Autonomous Volvo Construction Vehicles

Zhao, Xun January 2021
Point cloud data can express the 3D features of objects and is an important data type in the field of 3D object detection. Since point cloud data is more difficult to collect than image data and existing datasets are smaller, point cloud data augmentation is introduced to allow more features to be discovered in existing data. In this thesis, we propose a novel method to enhance point cloud scenes, based on a generative adversarial network (GAN), which augments the objects and then integrates them into existing scenes. Good fidelity and coverage are achieved between the fake samples and the real samples, with a JSD of 0.027, an MMD of 0.00064, and a coverage of 0.376. In addition, we investigated functional data annotation tools and completed the data labeling task. The 3D object detection task is carried out on the point cloud data, and we achieved relatively good detection results with a short processing time of around 22 ms. Quantitative and qualitative analysis is carried out on different models.
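As a rough illustration of the JSD figure reported above, the Jensen-Shannon divergence between two discrete distributions can be sketched as follows. The abstract does not say how the point clouds were turned into distributions or which logarithm base was used; the normalized histograms below are hypothetical and base-2 logarithms are an assumption.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD between two discrete distributions that each sum to 1."""
    # The mixture m is positive wherever p or q is positive, so KL stays finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical occupancy histograms for real and generated samples.
real_hist = [0.25, 0.25, 0.25, 0.25]
fake_hist = [0.30, 0.20, 0.25, 0.25]
jsd = jensen_shannon_divergence(real_hist, fake_hist)
print(jsd)
```

With base-2 logarithms the value lies between 0 (identical distributions) and 1 (disjoint support), so lower values indicate generated samples that are distributed more like the real ones.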
434

Utvärdering av noggrannheten av kastparablar på en iPad / Accuracy evaluation of trajectories on an iPad

Waninger, Mikael, Rothman, Sofia January 2022
Measuring and analyzing player performance in sports helps to improve a player's results in their respective sport. Specialized labs and/or expensive equipment are used for analysis but are difficult for the average player to access. Decreasing the cost of measurement tools would help level the playing field for players regardless of age or economic background. This study evaluates whether a smartphone or tablet can be used for measurement and sports analysis. To explore this, an application was developed with a focus on projectile sports such as soccer, tennis, and golf. The application tests visualization of an object's parabola, measurement of its speed, and visualization of the point where it hits a vertical plane. The application was developed for iOS and was tested on an iPad 12 pro. Tests to validate the application's accuracy were performed with a soccer ball, a tennis ball, and a golf ball. Tests for visualizing a parabola produced results for the soccer ball and the tennis ball, but the application could not handle the golf ball's smaller size. Speed could be measured for all three balls, with an average percentage deviation of 76% for the soccer ball, 21% for the tennis ball, and 43% for the golf ball. Tests for visualizing a hit on a vertical plane produced results for the soccer ball and the tennis ball, but not for the golf ball, with an average percentage deviation of 89% for the soccer ball and 23% for the tennis ball.
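The abstract does not define how the average percentage deviation was computed. One common reading, sketched here as an assumption, is the mean of the absolute differences between measured and reference values, each expressed as a percentage of the reference. The function name and the sample speeds are hypothetical.

```python
def mean_percentage_deviation(measured, reference):
    """Mean of |measured - reference| / |reference|, expressed in percent."""
    if not measured or len(measured) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    deviations = [
        abs(m - r) / abs(r) * 100.0 for m, r in zip(measured, reference)
    ]
    return sum(deviations) / len(deviations)


# Hypothetical speeds in m/s: app estimates versus a reference instrument.
app_speeds = [12.0, 18.0, 27.0]
reference_speeds = [10.0, 20.0, 30.0]
deviation = mean_percentage_deviation(app_speeds, reference_speeds)
print(round(deviation, 2))
```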
435

Near Realtime Object Detection : Optimizing YOLO Models for Efficiency and Accuracy for Computer Vision Applications

Abo Khalaf, Mulham January 2024
The objective of this study is to improve the efficiency and accuracy of YOLO models by optimizing them, particularly when computing resources are limited. The urgent need for near-real-time object recognition in applications such as surveillance systems and autonomous driving underscores the importance of processing speed and exceptional accuracy. The thesis focuses on the difficulties of deploying complex object detection models on low-capacity devices, namely the Jetson Orin Nano, and proposes several optimization methods to overcome these obstacles. We performed several trials and made methodological improvements to decrease processing requirements while maintaining strong object detection performance. Key components of the research include meticulous model training, the use of evaluation criteria, and the investigation of optimization effects on model performance in real-life settings. The study demonstrates the feasibility of achieving optimal performance in YOLO models despite limited resources, bringing substantial advancements in computer vision and machine learning.
436

Advancing Precision Agriculture Through AI and Statistical Modeling: Transforming Crop and Livestock Management

Mann, Sahilpreet Singh 06 January 2025
This thesis explores the application of Artificial Intelligence (AI), machine learning (ML), and statistical analysis to enhance agricultural practices, focusing on both livestock management and plant biology. The first part investigates automated weight prediction of beef cattle using computer vision techniques, including YOLOv9 and InternImage with Cascade R-CNN for precise image segmentation. Advanced feature extraction methods utilizing ResNet, DenseNet, and ResNeXt are employed to develop ML and deep learning (DL) models, providing a non-invasive alternative to traditional weight measurement techniques. The second part examines the regulation of the auxin response in Arabidopsis plants, focusing on epistatic interactions among auxin receptors. Through experimental assays and computational modeling, the study reveals synergistic effects that influence plant growth and development. The third part of the thesis characterizes the transcriptional specificity mediated by plant hormones using comprehensive data analysis, uncovering key insights into the gene regulation mechanisms influenced by auxin. Overall, the research integrates AI, ML, DL, and statistical methods to address critical challenges in agriculture and plant science, demonstrating improved predictive accuracy, enhanced understanding of hormonal signaling, and potential advancements in crop productivity and livestock management. / Master of Science / This research applies advanced deep learning technology and statistical analysis to improve farming practices and plant science. The study first focuses on helping farmers predict the weight of cows using cameras and AI software instead of traditional scales, providing a faster and less stressful method for both animals and farmers. Next, the research investigates how plant hormones, specifically auxin, interact with certain proteins to regulate plant growth.
Understanding these interactions helps scientists predict plant responses and enhance crop yields. Lastly, the study examines how these hormones influence specific genes, using data analysis to reveal how plants control their growth at a molecular level. By combining AI, biology, and statistical methods, this work offers new tools for improving livestock management and understanding plant growth, ultimately contributing to better farming practices and increased agricultural productivity.
437

Roadside Fisheye Vision for Detection, Localization, and Movement Classification of Road Users at Intersections

Adl, Morteza January 2024
This thesis addresses key challenges in intersection traffic monitoring using overhead fisheye cameras, focusing on object detection, localization, vehicle maneuver classification, and traffic violation detection. A data augmentation technique was developed to improve the performance of deep learning-based object detection algorithms for fisheye images. By fine-tuning these models, significant improvements in Average Precision (AP) were achieved for vehicle and pedestrian detection, effectively addressing object orientation and size variability. A novel calibration method was introduced to mitigate the effects of road surface elevation changes on object localization. This method accurately translates image coordinates into geographical coordinates by incorporating 3D road characteristics. The proposed localization algorithm, validated through field tests, demonstrated high accuracy in localizing both cars and pedestrians. Furthermore, Kalman filtering techniques were integrated to enhance object tracking, providing precise localization even in complex environments like sloped streets. In addition, a self-learning vehicle maneuver classification and counting algorithm was developed, capable of recognizing various vehicle movements such as turns and U-turns. The algorithm’s performance was validated in real-world scenarios, where it successfully classified and counted vehicle maneuvers at multiple intersections. Moreover, a traffic violation detection system was designed on top of the maneuver classification algorithm to identify common infractions like box-blocking and illegal turns at intersections. The outcomes of this research contribute to a comprehensive system that enhances traffic monitoring, safety enforcement, and operational efficiency at intersections, offering practical solutions to modern traffic management challenges. 
/ Thesis / Doctor of Philosophy (PhD) / This thesis explores methods for improving intersection traffic monitoring using overhead fisheye cameras. It enhances vehicle and pedestrian detection with a refined object detection model that improves accuracy by effectively handling object orientation and size variations. A new calibration technique has been developed to address road surface elevation changes, improving the precision of object localization. The thesis also presents a novel object localization approach that combines Kalman filtering with camera altitude correction, enabling accurate object localization in complex environments like sloped streets. Additionally, a new vehicle counting algorithm is designed to handle fisheye imagery and traffic management challenges. This system has proven effective in real-world tests, accurately classifying the vehicle maneuvers used to detect traffic violations such as illegal turns and box-blocking with an impressive precision rate. The proposed methods significantly enhance real-time traffic monitoring and enforcement, contributing to safer and more efficient intersections.
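The Kalman filtering mentioned in this abstract can be illustrated with a minimal one-dimensional sketch. It assumes a constant-velocity motion model with position-only measurements; the noise parameters `q` and `r`, the initial covariance, and the sample measurements are hypothetical, and the thesis's actual state model (including its camera altitude correction) is not reproduced here.

```python
def kalman_step(x, P, z, dt=1.0, q=0.01, r=1.0):
    """One predict/update cycle for state (position, velocity) and 2x2 covariance P."""
    pos, vel = x
    (p00, p01), (p10, p11) = P

    # Predict with a constant-velocity model: F = [[1, dt], [0, 1]].
    pos_pred = pos + dt * vel
    vel_pred = vel
    # Predicted covariance: F P F^T + q * I.
    a00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11 + q

    # Update with a position-only measurement: H = [1, 0].
    innovation = z - pos_pred
    s = a00 + r                      # innovation variance
    k0, k1 = a00 / s, a10 / s        # Kalman gain
    new_x = (pos_pred + k0 * innovation, vel_pred + k1 * innovation)
    # Updated covariance: (I - K H) P_pred.
    new_P = (
        ((1 - k0) * a00, (1 - k0) * a01),
        (a10 - k1 * a00, a11 - k1 * a01),
    )
    return new_x, new_P


# Hypothetical noisy positions of a road user moving about 1 unit per frame.
measurements = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.9, 8.1]
state = (0.0, 0.0)
cov = ((10.0, 0.0), (0.0, 10.0))
for z in measurements:
    state, cov = kalman_step(state, cov, z)
print(state)
```

A real tracker for this setting would typically track two ground-plane coordinates with a larger state vector; the scalar form is used here only to keep the predict and update equations visible.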
438

Towards a Comprehensive Bicycle Motion Behavior Model and Naturalistic Cycling Dataset

Alazemi, Fahd 25 May 2022
Most of the existing bicycle flow traffic research is limited to characterizing the longitudinal motion of bicyclists based on the assumption that there are no significant differences between the dynamics of single-file bicycle traffic and the longitudinal motion behavior of cars. This research reparametrizes an existing car-following model to describe bicycle-following and motion behavior. Furthermore, the lack of naturalistic data has limited the validation of this model. This research aims at developing a descriptive model that is capable of capturing the inherent non-lane-based traffic behavior characteristics of bicycle traffic and provides a methodology for extracting naturalistic cycling data from video feeds for use in safety and mobility applications. In this study, the Fadhloun-Rakha (FR) bicycle-following longitudinal motion model was extended through complementing it with a lateral motion strategy; thus allowing for overtaking maneuvers and lateral bicycle movements. For the most part, the following strategy of the FR model remains valid for modeling the longitudinal motion of bicycles except for the activation conditions of the collision avoidance strategy which are modified in order to allow for overtaking when possible. The proposed methodology is innovative in that it makes use of the intersection of certain pre-defined regions around the bicycles to decide on the feasibility of angular motion along with its direction and magnitude. The resulting model is the first point-mass dynamics-based model for the description of the longitudinal and lateral behavior of bicycles in both constrained and unconstrained conditions, and it is the only existing model that is sensitive to the bicyclist physical characteristics and the bicycle and roadway surface conditions given that the used longitudinal logic was previously validated against experimental cycling data.
In relation to the development of the naturalistic cycling dataset, the videos used come from a dataset collected in a previous Virginia Tech Transportation Institute study in collaboration with SPIN in which continuous video data at a non-signalized intersection on the Virginia Tech campus was recorded. The research applied computer vision and machine learning techniques to develop a comprehensive framework for the extraction of naturalistic cycling trajectories. In total, this study resulted in the collection and classification of 619 bicycle trajectories based on their type of interactions with other road users. The results confirm the success of the proposed methodology in relation to extracting the locations, speeds, and accelerations of the bicycles with a high precision level. Furthermore, preliminary insights into the acceleration and speed behavior of bicyclists around motorists are determined. / Master of Science / The behavior of bicycle traffic differs from that of cars. Bicycle traffic flow is unconstrained in lateral motion and overtaking when compared to car traffic flow. Because of this inherent behavior, existing car-following models can only model the longitudinal motion of bicycle traffic flow and do not describe the non-lane-based traffic that characterizes bicycle traffic dynamics. Furthermore, the existing experimental controlled dataset used for validating bicycle traffic flow models does not capture the naturalistic behavior of cyclists. Therefore, this research aims to develop a descriptive model that is capable of capturing the inherent non-lane-based traffic behavior characteristics of bicycle traffic and provides a methodology for extracting naturalistic cycling data from a video dataset for use in safety and mobility applications.
439

KI-basierte Detektion von Meilerplätzen mithilfe der Kombination luftgestützter LiDAR-Datenprodukte und Neuronaler Netze

Rünger, Carolin 20 August 2024
Die historische Holzkohleproduktion spielte eine bedeutende Rolle in der industriellen Entwicklung. Traditionell wurde Holzkohle in sogenannten Meilern, aufrechtstehenden Öfen, hergestellt. Diese Praxis führte zur weitreichenden Abholzung und veränderte die Vegetationszusammensetzung. Um die historische Waldbedeckung und historischen Landnutzungspraktiken besser zu verstehen, ist es notwendig, die räumliche Verteilung der Meiler zu analysieren. Die manuelle Kartierung der Meilerüberreste mittels DGM-Visualisierungstechniken ist sehr zeit- und arbeitsintensiv. Diese Arbeit untersucht daher den Einsatz von Deep Learning zur automatischen Detektion von Meilerplätzen basierend auf LiDAR-Datenprodukten. Hierfür wurden vortrainierte Modelle der Toolbox MMDetection mit DGM-Bildern trainiert, um ein spezifisch auf Meiler abgestimmtes Modell zu entwickeln. Insgesamt wurden vier Experimente durchgeführt, die den Einfluss verschiedener DGM-Visualisierungen, die Größe der Bounding Boxen und Hyperparameter unter Verwendung des FoveaBox-Detektors sowie die Leistung unterschiedlicher Modelle (ATSS, VFNet, RetinaNet) analysierten. Die Ergebnisse zeigen, dass ein 3-Band Bild bestehend aus Hügelschattierung, Sky-View Faktor und Neigung sowie eine Bounding Box Größe von 50 m optimal für die Detektion von Meilern sind. Der FoveaBox-Detektor erzielte die beste Leistung mit dem RAdam-Optimierer und einer Lernrate von 0.0001, wobei das ATSS-Modell mit den gleichen Hyperparametern die schlüssigsten Ergebnisse mit einer Genauigkeit von 93 % erreichte und nur 7 % der Meiler übersah. Das ATSS-Modell zeigte im Gegensatz zu anderen Studien eine um bis zu 10 % bessere Leistung. Ausschlaggebende Faktoren für diese Verbesserungen waren der verwendete Datensatz aus den 3-Band Bildern, die Größe der Bounding Boxen und die umfangreichere Datenaugmentierung, insbesondere die ergänzende Nutzung radiometrischer Techniken. 
Durch die experimentelle Herangehensweise konnte die Erkennungsgenauigkeit um 13 % gesteigert werden. Im Vergleich zur manuellen Kartierung hat das Modell viele zusätzliche Meiler identifiziert, obwohl es gelegentlich zu Verwechslungen mit angehäufter Erde am Hang und Fehldetektionen in unebenem Gelände mit geringen Höhenunterschieden kam. Die Eignung des Algorithmus zur verbesserten Erkennung von Meilerplätzen anstelle der manuellen Kartierung wird als effizienter, aber nicht zwangsläufig als präziser eingeschätzt:Selbständigkeitserklärung II Weitergabe der Arbeit II Kurzfassung IV Abstract V Abbildungsverzeichnis VIII Tabellenverzeichnis X Abkürzungsverzeichnis XI 1 Einleitung 1 1.1 Problemstellung und Zielsetzung 1 1.2 Aufbau der Arbeit 2 2 Grundlagen 3 2.1 Historischer und archäologischer Kontext von Meilerplätzen 3 2.1.1 Holzkohleproduktion und ihre Auswirkungen auf die Umwelt 3 2.1.2 Wichtigkeit der Erforschung von Meilerplätzen 4 2.1.3 Aussehen der Meilerüberreste 5 2.2 Einsatz von LiDAR-Daten für die Detektion von Meilerplätzen 6 2.2.1 Einführung in LiDAR 6 2.2.2 LiDAR in der archäologischen Praxis 8 2.2.3 Visualisierungstechniken von Höhenmodellen 10 2.2.4 Automatisierte Detektion von Meilerplätzen 15 2.3 Objekterkennung mit Deep Learning 16 2.3.1 Einführung in Deep Learning 16 2.3.2 Bildbasierte Objekterkennung von kleinen Objekten 17 2.3.3 Training eines Deep Learning-Modells 18 2.3.4 Datenaugmentierung 19 2.3.5 Hyperparameter 21 2.3.6 Bewertungsmetriken 21 2.3.7 Kategorisierung von Deep Learning-Modellen 23 2.3.8 Verwendete Modelle 25 3 Daten und Methoden 31 3.1 Datengrundlage und Computer-Hardware 31 3.2 Aufbereitung der Daten 32 3.2.1 Bearbeitung der Meilerdaten 32 3.2.2 Vorverarbeitung der DGM-Bilder 33 3.2.3 Aufteilung in Trainings-, Test- und Validierungsdatensatz 34 3.2.4 Datenaugmentierung des Trainingsdatensatzes 35 3.2.5 Verwendete DGM-Visualisierungstechniken 37 3.2.6 COCO-Format und Normalisierung 38 3.3 Experimentelles Vorgehen 39 3.3.1 
Experiment 1: Verschiedene Eingangsdaten 39 3.3.2 Experiment 2: Verschiedene Bounding Box-Größen 40 3.3.3 Experiment 3: Verschiedene Hyperparameter 41 3.3.4 Experiment 4: Verschiedene Modelle 41 3.4 Verwendete Bewertungsmetriken 42 4 Ergebnisse 44 4.1 Experiment 1: Verschiedene Eingangsdaten 44 4.2 Experiment 2: Verschiedene Bounding Box-Größen 48 4.3 Experiment 3: Verschiedene Hyperparameter 52 4.4 Experiment 4: Verschiedene Modelle 56 4.5 Inferenz des besten Modells auf ein unbekanntes Gebiet 61 5 Diskussion 63 5.1 Interpretation der Ergebnisse 63 5.2 Vergleich der Ergebnisse mit anderen Studien 66 5.3 Bewertung der Modelleistung in einem gut und schlecht zu kartierendem Gebiet 68 6 Fazit und Ausblick 71 7 Literaturverzeichnis 73 Anhang 78 / The historical production of charcoal played a significant role in the industrial development. Traditionally, charcoal was produced in so-called kilns, upright ovens. This practice led to extensive deforestation and changed the vegetation composition. In order to better understand historical forest cover and historical land use practices, it is necessary to analyze the spatial distribution of the charcoal kilns. However, manual mapping of the kilns remains using DTM visualization techniques is very time-consuming and labour-intensive. Therefore, this study examines the use of deep learning for the automatic detection of charcoal kiln sites based on LiDAR data products. Pre-trained models from the MMDetection toolbox were trained with DTM images to develop a model specifically adapted to the charcoal kilns. A total of four experiments were conducted to analyze the impact of different DTM visualizations, bounding box sizes, and hyperparameters using the FoveaBox detector as well as the performance of different models (FoveaBox, ATSS, VFNet, RetinaNet). The results show that a 3-band image consisting of hill shading, Sky-View factor, and slope, and a bounding box size of 50 m, is ideal for the detection of kilns. 
The FoveaBox detector performed best with the RAdam optimizer and a learning rate of 0.0001, while the ATSS model delivered the most consistent results, with an accuracy of 93 % and only 7 % of the kilns missed. The ATSS model performs up to 10 % better than the models reported in other studies. Key factors in these improvements were the 3-band image dataset, the size of the bounding boxes, and more extensive data augmentation, particularly the complementary use of radiometric techniques. Through the experimental approach, detection accuracy was improved by 13 %. Compared with manual mapping, the model identified many additional kilns, although it sometimes confused kilns with accumulated soil on slopes and produced false detections in uneven terrain with small height differences. As a replacement for manual mapping, the algorithm is therefore considered more efficient at detecting charcoal kiln sites, but not necessarily more accurate.
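
The data preparation described in this abstract (a 3-band image built from three DTM visualizations, and fixed-size bounding boxes around known kiln locations) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the 8-bit min-max scaling, the north-up raster with its origin at the upper-left corner, and the 0.5 m pixel size in the example are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple

Band = Sequence[Sequence[float]]  # one raster band as rows of pixel values


def scale_to_uint8(band: Band) -> List[List[int]]:
    """Min-max scale one band to the range 0..255 (assumed normalization)."""
    flat = [v for row in band for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # a constant band maps to all zeros
    return [[round(255 * (v - lo) / span) for v in row] for row in band]


def stack_bands(hillshade: Band, sky_view: Band, slope: Band) -> List[List[Tuple[int, ...]]]:
    """Combine three single-band rasters into rows of (b1, b2, b3) pixels."""
    bands = [scale_to_uint8(b) for b in (hillshade, sky_view, slope)]
    return [list(zip(*rows)) for rows in zip(*bands)]


def kiln_bbox(center_xy: Tuple[float, float], box_size_m: float,
              origin_xy: Tuple[float, float], pixel_size_m: float) -> List[float]:
    """Return a COCO-style [x, y, width, height] box in pixel units.

    center_xy is the kiln centre in map coordinates; origin_xy is the map
    coordinate of the raster's upper-left corner (assumed north-up raster).
    """
    cx, cy = center_xy
    ox, oy = origin_xy
    side = box_size_m / pixel_size_m
    col = (cx - ox) / pixel_size_m
    row = (oy - cy) / pixel_size_m  # map y decreases as the row index grows
    return [col - side / 2, row - side / 2, side, side]


if __name__ == "__main__":
    image = stack_bands([[0, 10], [20, 30]],
                        [[0.0, 0.5], [1.0, 0.25]],
                        [[5, 5], [5, 5]])
    print(image)
    # A 50 m box around a kiln, on a hypothetical 0.5 m raster.
    print(kiln_bbox((1025.0, 1975.0), 50.0, (1000.0, 2000.0), 0.5))
```

Expressing the box size in metres and converting it with the pixel size keeps the 50 m setting independent of the raster resolution, which is convenient when comparing several box sizes as in the second experiment.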
440

Soccer Data Analysis Based on Computer Vision : Master Thesis at KTH Royal Institute of Technology / Fotbollsdataanalys baserad på datorseende : Masteruppsats vid Kungliga Tekniska Högskolan

Pan, Rongfei January 2024
Soccer is arguably the world's most popular sport and has a wide influence on society. Soccer tactics have developed continuously since the beginning of the modern game, and analyzing them requires data: not only the match results of each team but also the performance of the players on the pitch. Playmaker.ai, where this degree project was carried out, is a company that provides soccer analysis services. The main purpose of the project is to create a system that generates player positions by analyzing video data without bird's-eye-view information. In addition to player position generation, some progress was made on expected goals (xG) calculation, and several data preprocessing tools were implemented. The goal is accomplished in the following steps:
1. Detect players in camera-view images using a YOLO (You Only Look Once) network, specifically YOLOv5.
2. Track the positions of the players and the ball through a long video using the Strong-Sort method.
3. Assign the detected objects to teams, using methods including K-means.
4. Generate bird's-eye-view positions using a perspective transformation.
The results show that all of the machine learning models converged and perform well in practical use, although limitations and problems remain. With this system, a 2-D map showing the player positions can be generated, and the data preprocessing tools can also be used by the company. Because of several limitations encountered during development, the system still has problems and disadvantages, and it should be regarded as a prototype of a complete method for solving multiple problems in soccer data analysis based on machine learning and computer vision. Future developers can iterate on this project to improve it further.
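
The last step of the pipeline, mapping camera-view detections to bird's-eye-view pitch coordinates, can be illustrated with a small sketch. It assumes that four reference points on the pitch are known in both the image and the pitch plane, and that a player's ground position is the bottom centre of the bounding box. The function names, the 105 m x 68 m pitch, and the pixel coordinates are hypothetical; the thesis does not state how its transformation was implemented.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """Return the 8 homography parameters mapping 4 src points to dst (h33 = 1)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def to_pitch(h: Sequence[float], p: Point) -> Point:
    """Project an image point onto the pitch plane."""
    x, y = p
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


def foot_point(bbox: Sequence[float]) -> Point:
    """Bottom centre of an [x, y, width, height] box (assumed ground contact)."""
    x, y, w, h = bbox
    return (x + w / 2.0, y + h)


if __name__ == "__main__":
    # Pitch corners as seen by the camera (pixels) and on the pitch (metres).
    image_corners = [(200.0, 100.0), (1080.0, 100.0), (1280.0, 700.0), (0.0, 700.0)]
    pitch_corners = [(0.0, 0.0), (105.0, 0.0), (105.0, 68.0), (0.0, 68.0)]
    h = fit_homography(image_corners, pitch_corners)
    print(to_pitch(h, foot_point([600.0, 300.0, 40.0, 100.0])))
```

With more than four correspondences, or with noisy keypoints, a least-squares or RANSAC fit would usually replace the exact four-point solve shown here.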
