361

Shared augmented reality: A general implementation independent of hardware : Implementation of a shared AR system that projects on individuals with the help of machine learning, independent of hardware

Bäckström, Otto, Marawgeh, Maikel January 2023 (has links)
Shared augmented reality (AR) systems are often implemented in a way that tightly couples them to the hardware, or they rely on large frameworks or graphics engines to implement the graphical part of the project. This is a problem because systems tightly coupled to hardware are harder to distribute and reduce the user segment that can use the system. Implementations that rely on large graphics engines often use only a small portion of the engine's functionality, and the unused functionality comes at a cost in performance. To solve this, a hardware-agnostic system with the minimal functionality necessary for shared augmented reality must be implemented.  This report addresses hardware abstraction for projected shared augmented reality using machine learning. The system is divided into two parts. One part captures the individual in front of the camera and estimates their pose with a machine learning model, while the other part handles the projection by receiving the model's output, transmitted via socket communication, and connecting the points into a three-dimensional skeleton. Tests were conducted to determine whether this kind of abstraction introduces significant latency or performance strain. They showed that the program runs with a 50-millisecond delay at 80 frames per second, which suggests that it is beneficial to abstract the application so that its modules can be distributed or replaced. A final projection aligned with the individual could not be produced, but the abstraction needed to provide the system is general enough to be hardware-independent and to demonstrate the functionality that shared AR systems require.
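The split described above — a pose-estimation module streaming skeleton keypoints to a rendering module over socket communication — can be sketched roughly as follows. The joint names, the length-prefixed JSON framing, and the loopback endpoint are illustrative assumptions, not the thesis's actual protocol:

```python
import json
import socket
import threading

def serve_once(result, ready, port_holder, host="127.0.0.1"):
    """Receive one length-prefixed JSON message with pose keypoints."""
    with socket.socket() as srv:
        srv.bind((host, 0))  # ephemeral port for the demo
        srv.listen(1)
        port_holder.append(srv.getsockname()[1])
        ready.set()
        conn, _ = srv.accept()
        with conn:
            size = int.from_bytes(conn.recv(4), "big")
            buf = b""
            while len(buf) < size:
                buf += conn.recv(size - len(buf))
            result.append(json.loads(buf))

def send_keypoints(keypoints, port, host="127.0.0.1"):
    """Send keypoints as a 4-byte length-prefixed JSON message."""
    payload = json.dumps(keypoints).encode()
    with socket.socket() as cli:
        cli.connect((host, port))
        cli.sendall(len(payload).to_bytes(4, "big") + payload)

# Demo: the "projection" side receives what the "pose" side sends.
received, ready, ports = [], threading.Event(), []
t = threading.Thread(target=serve_once, args=(received, ready, ports))
t.start()
ready.wait()
send_keypoints([{"joint": "left_wrist", "x": 0.42, "y": 0.77, "z": 1.3}], ports[0])
t.join()
```

The length prefix keeps keypoint frames separable on a TCP stream, which is what allows the two modules to run on entirely different hardware.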
362

Learning Embeddings for Fashion Images

Hermansson, Simon January 2023 (has links)
Today the process of sorting second-hand clothes and textiles is mostly manual. In this master’s thesis, methods for automating this process, as well as for improving the manual sorting process, have been investigated. The methods explored include automatic prediction of price and intended usage for second-hand clothes, as well as different types of image retrieval to aid manual sorting. Two models were examined: CLIP, a multi-modal model, and MAE, a self-supervised model. Quantitatively, the results favored CLIP, which outperformed MAE in both image retrieval and prediction. However, MAE may still be useful for some image-retrieval applications, as it returns items that look similar even if they do not necessarily share the same attributes. In contrast, CLIP is better at accurately retrieving garments with as many matching attributes as possible. For price prediction, the best model was CLIP: fine-tuned on the dataset used, it achieved an F1-score of 38.08 across the dataset's three price categories. For predicting the intended usage (either reusing the garment or exporting it to another country), the best model achieved an F1-score of 59.04.
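As a rough illustration of the embedding-based retrieval that both CLIP and MAE enable, garments can be ranked by cosine similarity between a query embedding and the gallery embeddings. The two-dimensional vectors and garment names below are toy values, not real model outputs:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def retrieve(query, gallery, k=2):
    """Return the names of the k gallery items most similar to the query."""
    ranked = sorted(gallery, key=lambda name: cosine(query, gallery[name]),
                    reverse=True)
    return ranked[:k]

# Toy gallery: in practice these would be high-dimensional image embeddings.
gallery = {"denim_jacket": (0.9, 0.1),
           "silk_dress": (0.0, 1.0),
           "knit_sweater": (0.7, 0.7)}
top = retrieve((1.0, 0.0), gallery)  # most similar first
```

The same ranking loop works regardless of which encoder produced the vectors, which is why CLIP and MAE can be compared on identical retrieval code.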
363

Rolling shutter in feature-based Visual-SLAM : Robustness through rectification in a wearable and monocular context

Norée Palm, Caspar January 2023 (has links)
This thesis analyzes the impact of, and implements compensation for, rolling shutter distortion in the state-of-the-art feature-based visual SLAM system ORB-SLAM3. The compensation method rectifies the detected features, and the evaluation was conducted on the "Rolling-Shutter Visual-Inertial Odometry Dataset" from TUM, which comprises ten sequences recorded in a single room with side-by-side synchronized global- and rolling-shutter cameras.  Without the implemented rectification algorithms, ORB-SLAM3's performance on rolling-shutter input decreased substantially in both accuracy and robustness. The global-shutter camera achieved centimeter or even sub-centimeter accuracy, while the rolling-shutter camera's error could reach the decimeter range on the more challenging sequences. Moreover, some individual runs with the rolling-shutter camera failed to track the trajectory effectively, indicating degraded robustness. The effects of rolling shutter in inertial ORB-SLAM3 were even more pronounced, with higher trajectory errors and outright tracking failure in some sequences, even though inertial measurements combined with the global-shutter camera gave better accuracy and robustness than the non-inertial case.  The rectification algorithms implemented in this thesis yielded significant accuracy gains, up to a 7x relative improvement in the non-inertial case, bringing trajectory errors on the more challenging sequences back from the decimeter scale to the centimeter scale. For the inertial case, the rectification scheme was even more crucial: it produced better trajectory accuracy, better than the non-inertial case on the less challenging sequences, and made tracking possible on the more challenging ones.
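The core of feature rectification for rolling shutter can be sketched in a few lines: each image row is exposed at a slightly different instant, so a feature is shifted by the image-plane motion accumulated during its row's readout offset. The constant-velocity assumption and the mid-frame reference row below are simplifications for illustration, not the thesis's exact method:

```python
def rectify_feature(u, v, flow, readout_time, image_rows):
    """Move a feature detected at pixel (u, v) to where it would appear if
    the whole frame had been exposed at the mid-frame instant, as with a
    global shutter. `flow` is the estimated image-plane velocity of the
    feature in pixels/second; `readout_time` is the full-frame readout."""
    mid_row = image_rows / 2.0
    dt = (v - mid_row) / image_rows * readout_time  # this row's time offset
    return u - flow[0] * dt, v - flow[1] * dt

# A feature on the bottom row of a 480-row frame with a 30 ms readout,
# moving 100 px/s horizontally, is pulled back by 1.5 px.
u2, v2 = rectify_feature(100.0, 480.0, (100.0, 0.0), 0.03, 480)
```

Rows near the reference row receive almost no correction, which matches the intuition that rolling shutter distortion grows with distance from the anchor row.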
364

Automation of forest road inventory using computer vision and machine learning

de Flon, Jacques January 2023 (has links)
There are around 300,000 kilometers of gravel roads throughout the Swedish countryside, used every day by private individuals and companies. These roads face constant wear from harsh weather and heavy traffic, so regular maintenance is required to keep up the road standard. Cost-effective maintenance requires knowing where support is needed, and such data is obtained through inventorying. Today, road inventory is done primarily by hand, using manual tools and requiring trained personnel. With new tools, this work could be partially automated, which could save costs as well as open up for more complex analyses. This project aims to investigate the possibility of automating road inventory using computer vision and machine learning. Previous work in the field shows promising results using deep convolutional networks to detect and classify road anomalies like potholes and cracks on paved roads. With those results in mind, we try to translate the solutions to also work on unpaved forest roads. During the project, we collected our own dataset containing 3522 labelled images of gravel and forest roads, with 203 instances of potholes, 614 bare roads and 3099 snow-covered roads. These images were used to train an image segmentation model based on the YOLOv8 architecture for 30 epochs; using transfer learning, we took advantage of weights pretrained on the COCO dataset. The predicted road segmentations were also used to estimate the width of the road, using the pinhole camera model and inverse projective geometry. The segmentation model reaches an AP50-95 of 0.746 for bare road and 0.813 for snow-covered road, but shows poor detection of potholes with an AP50-95 of 0.048. Using the road segmentations to estimate road width shows that the model can do so with a mean absolute error of 0.24 m.
The results from this project show that there are already areas where machine learning could assist human operators with inventory work. Even difficult tasks, like estimating the width of partially covered roads, can be solved with computer vision and machine learning.
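The width estimate from the pinhole model and inverse projective geometry can be sketched as follows for a flat road and a forward-looking camera: the image row fixes the ground distance, and the count of road pixels in that row scales to meters. The intrinsics and camera height below are made-up values, not the project's calibration:

```python
def road_width(mask_row, row_v, fx, fy, cy, cam_height):
    """Estimate the road width (in meters) at one image row, assuming a
    flat road and a pinhole camera whose optical axis is parallel to it.
    `mask_row` is one row of the binary road-segmentation mask."""
    z = fy * cam_height / (row_v - cy)  # ground distance to that image row
    road_pixels = sum(mask_row)          # road pixels predicted in the row
    return road_pixels * z / fx          # pixel extent back-projected to meters

# 300 road pixels at row 390, with f = 500 px, cy = 240, camera 1.5 m up:
# the row maps to 5.0 m ahead, giving a 3.0 m wide road.
width = road_width([0] * 100 + [1] * 300 + [0] * 100, 390, 500.0, 500.0, 240.0, 1.5)
```

Averaging this estimate over many rows and frames is what makes the 0.24 m error level plausible despite per-row noise in the mask.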
365

Industrial 3D Anomaly Detection and Localization Using Unsupervised Machine Learning

Bärudde, Kevin, Gandal, Marcus January 2023 (has links)
Detecting defects in industrially manufactured products is crucial to ensure their safety and quality. This process can be both expensive and error-prone if done manually, making automated solutions desirable. There is extensive research on industrial anomaly detection in images, but recent studies have shown that adding 3D information can increase the performance. This thesis aims to extend the 2D anomaly detection framework, PaDiM, to incorporate 3D information. The proposed methods combine RGB with depth maps or point clouds and the effects of using PointNet++ and vision transformers to extract features are investigated. The methods are evaluated on the MVTec 3D-AD public dataset using the metrics image AUROC, pixel AUROC and AUPRO, and on a small dataset collected with a Time-of-Flight sensor. This thesis concludes that the addition of 3D information improves the performance of PaDiM and vision transformers achieve the best results, scoring an average image AUROC of 86.2±0.2 on MVTec 3D-AD.
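PaDiM-style methods score each patch position by its Mahalanobis distance to a Gaussian fitted over features from normal training images; extending the framework to 3D mainly changes which features feed that Gaussian. A two-dimensional toy version of the scoring step (real feature dimensions are far higher) might look like:

```python
def mahalanobis2(x, mean, cov_inv):
    """Mahalanobis distance for a 2-D patch feature, given the inverse
    covariance of the per-position Gaussian fitted on normal samples.
    Large distances flag the patch as anomalous."""
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    q = (d0 * (cov_inv[0][0] * d0 + cov_inv[0][1] * d1)
         + d1 * (cov_inv[1][0] * d0 + cov_inv[1][1] * d1))
    return q ** 0.5

# With identity covariance this reduces to Euclidean distance: sqrt(9 + 16) = 5.
score = mahalanobis2((3.0, 4.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]])
```

In the full method one Gaussian is kept per spatial position of the feature map, so the anomaly map is simply this distance evaluated everywhere.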
366

Quality inspection of multiple product variants using neural network modules

Vuoluterä, Fredrik January 2022 (has links)
Maintaining quality outcomes is an essential task for any manufacturing organization. Visual inspection has long been an avenue for detecting defects in manufactured products, and recent advances in deep learning have led to a surge of research into how technologies like convolutional neural networks can perform these quality inspections automatically. An alternative to these often large and deep network structures is the modular neural network, which instead divides a classification task into several sub-tasks to decrease the overall complexity of the problem. To investigate how these two approaches to image classification compare on a quality inspection task, a case study was performed at AR Packaging, a manufacturer of food containers. The many different colors, prints and geometries in the AR Packaging product family served as a natural source of complexity for the quality classification task. A modular network was designed, consisting of a routing module that classifies the variant type and then delegates the quality classification to an expert module trained for that specific variant. An image dataset was generated manually in the production environment, portraying a range of product variants in both defective and non-defective form. An image processing algorithm was developed to minimize image background and align the products in the pictures. To evaluate the adaptability of the two approaches, the networks were initially trained on the same data from five variants, and then retrained with added data from a sixth variant. The modular networks were found to be overall less accurate and slower in their classification than the conventional single networks. However, the modular networks were more than six times smaller and required less time to train initially, though the retraining times were roughly equivalent for both approaches.
The retraining of the single network also caused some fluctuation in predictive accuracy, something not observed in the modular network.
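The routing idea — one module predicts the variant, then a per-variant expert judges quality — reduces to plain function dispatch once the networks are trained. The variant names and decision rules below are invented stand-ins for trained models, purely for illustration:

```python
def make_modular_classifier(router, experts):
    """Route each sample to the expert trained for its predicted variant.
    `router` maps a sample to a variant name; `experts` maps variant names
    to quality classifiers for that variant."""
    def classify(sample):
        variant = router(sample)
        return variant, experts[variant](sample)
    return classify

# Toy stand-ins for trained networks: route on one feature, then judge
# quality with a variant-specific threshold.
router = lambda s: "tray" if s["has_lid"] else "cup"
experts = {
    "tray": lambda s: "defect" if s["edge_score"] > 0.5 else "ok",
    "cup":  lambda s: "defect" if s["edge_score"] > 0.8 else "ok",
}
classify = make_modular_classifier(router, experts)
label = classify({"has_lid": True, "edge_score": 0.7})
```

Adding a sixth variant then only requires training one new expert and updating the router, which is the adaptability property the study measured.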
367

Increasing Autonomy of Unmanned Aircraft Systems Through the Use of Imaging Sensors

Rudol, Piotr January 2011 (has links)
The range of missions performed by Unmanned Aircraft Systems (UAS) has been growing steadily in recent decades thanks to continued development in several disciplines. The goal of increasing the autonomy of UASs is to widen the range of tasks that can be carried out without, or with minimal, external help. This thesis presents methods for increasing specific aspects of the autonomy of UASs operating in both outdoor and indoor environments, with cameras as the primary sensors. First, a method is presented for fusing color and thermal images for object detection, geolocation and tracking for UASs operating primarily outdoors. Specifically, a method is described for building saliency maps in which human body locations are marked as points of interest. Such maps can be used in emergency situations to increase the situational awareness of first responders or of a robotic system itself. The same method is also applied to the problem of vehicle tracking: a generated stream of geographical locations of tracked vehicles increases situational awareness by allowing qualitative reasoning about, for example, vehicles overtaking or entering and leaving crossings. Second, two approaches to the UAS indoor localization problem in the absence of GPS-based positioning are presented. Both use cameras as the main sensors and enable autonomous indoor flight and navigation. The first approach takes advantage of cooperation with a ground robot to provide the UAS with localization information. The second approach uses marker-based visual pose estimation in which all computations are done onboard a small-scale aircraft, which further increases its autonomy by not relying on external computational power.
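The color–thermal fusion for human detection can be caricatured as gating a color detector's confidence map with a body-temperature band from the registered thermal image. The temperature band and pixel values below are illustrative assumptions, not the thesis's calibrated parameters:

```python
def fuse_saliency(thermal, color_conf, t_lo=30.0, t_hi=40.0):
    """Keep the color detector's confidence only where the thermal pixel
    falls in a human body-temperature band (degrees C); zero elsewhere.
    Both inputs are 2-D lists of the same shape (registered images)."""
    return [[c if t_lo <= t <= t_hi else 0.0
             for t, c in zip(t_row, c_row)]
            for t_row, c_row in zip(thermal, color_conf)]

thermal = [[36.5, 55.0], [20.0, 34.0]]  # 55 C is a hot engine, not a person
color   = [[0.9, 0.8], [0.7, 0.6]]      # color detector confidences
saliency = fuse_saliency(thermal, color)
```

The band rejects both hot non-human objects and cool human-shaped clutter, which is what lets the saliency map mark body locations with fewer false positives than either modality alone.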
368

Detecting small and fast objects using image processing techniques : A project study within sport analysis

Gustafsson, Simon, Persson, Andreas January 2021 (has links)
This study has put three different object detection techniques to the test. The goal was to investigate small, fast-moving objects to see which technique’s performance is most suitable for the sport of padel. The study aims to cover and explain the conditions that can improve, but also worsen, the detection of small and fast objects. The three techniques use different approaches for detecting one or multiple objects and could serve as a guideline for future object detection development. The proposed techniques utilize background histogram calculation, HSV masking with edge detection, and DNN frameworks together with the COCO dataset. All techniques were tested on outdoor video footage to generate data, which indicates that Canny edge detection is a prominent candidate for further research given its high detection rate. However, YOLO shows excellent potential for multiple-object detection at a very high confidence grade, providing reliable and accurate detection of a targeted object. The study concludes that, depending on what the end purpose aims to achieve, both Canny and YOLO have potential for future small and fast object detection.
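The HSV-masking step can be sketched with the standard library's colorsys: isolate pixels whose hue, saturation and value fall in a band tuned to the ball's color. The hue band for a padel ball below is a guess for illustration, not the study's tuned values:

```python
import colorsys

def hsv_mask(rgb_pixels, h_range, s_min=0.4, v_min=0.4):
    """Binary mask over pixels whose hue lies in h_range (0-1 scale) and
    whose saturation/value exceed minimums; candidate ball pixels."""
    mask = []
    for r, g, b in rgb_pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        mask.append(h_range[0] <= h <= h_range[1] and s >= s_min and v >= v_min)
    return mask

# A bright yellow ball pixel passes; a gray background pixel does not.
mask = hsv_mask([(230, 220, 30), (100, 100, 100)], h_range=(0.10, 0.22))
```

In the full pipeline the surviving pixels would then feed the edge detector, so a tight hue band directly reduces the false detections the study observed.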
369

Autonomous Quadcopter Landing with Visual Platform Localization

Blaszczyk, Martin January 2023 (has links)
Multicopters such as quadcopters are a popular tool in industries such as mining, shipping and surveillance, where a high level of autonomy can save time, increase efficiency and, most importantly, provide safety. While Unmanned Aerial Vehicles have been a large research area and are used in these industries, their level of autonomy is still low: simple actions such as loading and offloading payload or swapping batteries are still manual tasks performed by humans. If multicopters are to be used as autonomous tools, solutions in which the machines can perform the simplest tasks, such as swapping batteries, become an important stepping stone toward the autonomy goals. Earlier works propose solutions for landing autonomous vehicles, but insufficient accuracy prevents the vehicles from safely docking with a landing platform. This thesis combines multiple areas such as trajectory generation, visual marker tracking and UAV control, with results shown in both simulation and laboratory experiments. Using a Model Predictive Controller for both trajectory generation and UAV control, a multicopter can safely land on a platform small enough to be mounted on a small mobile robot. Additionally, an algorithm for tuning the trajectory generator is presented, showing how much the weights in the MPC can be increased while the system remains stable.
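The receding-horizon principle behind MPC can be caricatured in one dimension: at every step, pick the control input whose short simulated rollout minimises a quadratic cost on position error and velocity, apply it, and repeat. The discrete action set, horizon, and weights below are toy choices, not the thesis's controller:

```python
def receding_horizon_step(pos, vel, target, horizon=10, dt=0.1,
                          actions=(-1.0, -0.5, 0.0, 0.5, 1.0),
                          w_pos=1.0, w_vel=0.1):
    """One step of a toy receding-horizon controller for a 1-D double
    integrator: pick the constant acceleration whose rollout minimises a
    quadratic cost on position error and velocity over the horizon."""
    def cost(a):
        p, v, c = pos, vel, 0.0
        for _ in range(horizon):
            v += a * dt
            p += v * dt
            c += w_pos * (p - target) ** 2 + w_vel * v ** 2
        return c
    return min(actions, key=cost)

# Hovering 1 m above the platform, the controller commands a descent.
a_cmd = receding_horizon_step(pos=1.0, vel=0.0, target=0.0)
```

A real MPC replaces the brute-force action search with a constrained quadratic program, but the structure — simulate ahead, score, apply only the first input — is the same, and the weight tuning mentioned above corresponds to choosing `w_pos` and `w_vel`.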
370

Thermal Imaging-Based Instance Segmentation for Automated Health Monitoring of Steel Ladle Refractory Lining

Bråkenhielm, Emil, Drinas, Kastrati January 2022 (has links)
Equipment and machines can be exposed to very high temperatures in the steel mill industry. One particularly critical part is the ladle used to hold and pour molten iron into moulds. A refractory lining serves as an insulation layer between the outer steel shell and the molten iron to protect the ladle from the hot iron. Over time, or if the lining is not completely cured, the lining wears out or can potentially fail. Such a scenario can lead to a breakout of molten iron, which can damage equipment and, in the worst case, injure workers. Previous work analyses how critical areas can be identified proactively: using thermal imaging, failing spots in the lining show as high-temperature areas on the outside steel shell, the idea being that the outside temperature corresponds to the thickness of the insulating lining. These spots are detected when temperatures over a given threshold are registered within the thermal camera's field of view. The images must then be analyzed manually over time to follow the progression of a detected spot, and the existing solution is also prone to background noise from other hot objects.  This thesis proposes an initial step toward automating the health monitoring of refractory lining in steel ladles. The report investigates the use of instance segmentation to isolate the ladle from its background, reducing false alarms and background noise in an autonomous monitoring setup. Model training is based on Mask R-CNN on our own thermal images, with weights pretrained on visual images. Detection covers two classes: open or closed ladle. The model proved reasonably successful on a small dataset of 1000 thermal images. Different models were trained with and without augmentation, pretrained weights and multi-phase fine-tuning; the highest mAP, 87.5%, was achieved by a pretrained model with image augmentation and without fine-tuning.
Though it was not tested in production, temperature readings could finally be extracted on the segmented ladle, decreasing the risk of false alarms from background noise.
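The final step — reading temperatures only inside the predicted ladle mask so that hot background objects no longer trigger alarms — reduces to a masked threshold scan. The temperatures and mask below are illustrative values, not data from the thesis:

```python
def hot_spots_in_mask(thermal, mask, threshold):
    """Return (row, col, temperature) for every pixel at or above the
    threshold that lies inside the predicted ladle mask, so hot objects
    in the background are ignored."""
    return [(i, j, t)
            for i, row in enumerate(thermal)
            for j, t in enumerate(row)
            if mask[i][j] and t >= threshold]

thermal = [[400.0, 90.0], [350.0, 500.0]]  # 500 C is a hot background object
mask    = [[1, 0], [1, 0]]                  # segmentation: ladle on the left
hot = hot_spots_in_mask(thermal, mask, 300.0)
```

Because the 500-degree background pixel falls outside the mask, it raises no alarm; tracking the surviving spots over time would give the progression monitoring the thesis describes.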
