121

Simple feature detection in indoor geometry scanned with the Microsoft Hololens

Björk, Nils January 2020 (has links)
The aim of this work was to determine whether line-type features (straight lines found in geometry considered interesting by a user) could be identified in spatial map data of indoor environments produced by the Microsoft Hololens augmented reality headset. Five data sets were used in this work, on which the feature detection was performed; these data sets were provided as sample data representing the spatial maps of five different rooms scanned using the Hololens headset and are available as part of the Hololens emulator. Related work on feature detection in point clouds and 3D meshes was investigated to find a suitable method for line-type feature detection. The chosen detection method used LSQ-plane fitting and relevant cutoff variables, inspired by related work on feature identification and mesh simplification. The method was evaluated using user-placed validation features, and the distance between them and the detected features, defined using the midpoint distance metric, was used as a measure of quality for the detected features. The resulting features were not accurate enough to reliably or consistently match the validation features inserted in the data, and further improvements to the detection method would be necessary to achieve this. A local feature-edge detection using the SOD and ESOD operators was considered and tested but was found not to be suitable for the spatial data provided by the Hololens emulator. The results show that finding these features using the provided data is possible, and the methods to produce them are numerous. The choice of method is, however, dependent on the ultimate application of these features, taking into account requirements for accuracy and performance.
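The abstract does not include implementation details, but the LSQ-plane fitting it refers to can be illustrated with a minimal sketch: fit a least-squares plane to a set of mesh vertices and measure point-to-plane residuals. The SVD formulation and variable names below are assumptions for illustration, not taken from the thesis.

```python
import numpy as np

def fit_plane_lsq(points):
    """Least-squares plane fit to an (N, 3) array of vertex positions.

    Returns the plane's unit normal and a point on the plane (the centroid).
    The normal is the singular vector associated with the smallest singular
    value of the centred point cloud.
    """
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                      # direction of least variance
    return normal, centroid

def point_plane_distances(points, normal, centroid):
    """Orthogonal distances from each point to the fitted plane."""
    return np.abs((points - centroid) @ normal)

# Example: vertices lying roughly on the plane z = 0.1x + 0.2y
rng = np.random.default_rng(0)
xy = rng.uniform(-1, 1, size=(100, 2))
z = 0.1 * xy[:, 0] + 0.2 * xy[:, 1] + rng.normal(0, 0.01, 100)
pts = np.column_stack([xy, z])
n, c = fit_plane_lsq(pts)
print(point_plane_distances(pts, n, c).max())   # small residuals expected
```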
122

Texture Enhancement in 3D Maps using Generative Adversarial Networks

Birgersson, Anna, Hellgren, Klara January 2019 (has links)
In this thesis we investigate the use of GANs for texture enhancement. To achieve this, we have studied whether synthetic satellite images generated by GANs will improve the texture in satellite-based 3D maps. We investigate two GANs: SRGAN and pix2pix. SRGAN increases the pixel resolution of the satellite images by generating upsampled images from low-resolution images. pix2pix, in turn, performs image-to-image translation by translating a source image to a target image without changing the pixel resolution. We trained the GANs with two different approaches, named SAT-to-AER and SAT-to-AER-3D, where SAT, AER and AER-3D are different datasets provided by the company Vricon. In the first approach, aerial images were used as ground truth, and in the second approach, rendered images from an aerial-based 3D map were used as ground truth. The procedure of enhancing the texture in a satellite-based 3D map was divided into two steps: the generation of synthetic satellite images and the re-texturing of the 3D map. Synthetic satellite images generated by two SRGAN models and one pix2pix model were used for the re-texturing. The best results were obtained using SRGAN in the SAT-to-AER approach, where the re-textured 3D map had enhanced structures and an increased perceived quality. SRGAN also gave a good result in the SAT-to-AER-3D approach, where the re-textured 3D map had a changed color distribution and the road markers were easier to distinguish from the ground. The images generated by the pix2pix model gave the worst result. In the SAT-to-AER approach, even though the synthetic satellite images generated by pix2pix were somewhat enhanced and contained less noise, they had no significant impact on the re-texturing. In the SAT-to-AER-3D approach, none of the investigated models based on the pix2pix framework produced any successful results. We conclude that GANs can be used as a texture enhancer using both aerial images and images rendered from an aerial-based 3D map as ground truth. The use of GANs as a texture enhancer has great potential and offers several interesting areas for future work.
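As an illustration of the adversarial setup described above, the following is a minimal sketch of an SRGAN-style training step that combines a content (MSE) loss with an adversarial loss. The tiny networks, loss weighting and patch sizes are placeholders, not the architectures or hyperparameters used in the thesis.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the generator and discriminator; the published SRGAN
# networks are much deeper than this.
generator = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.PReLU(),
    nn.Upsample(scale_factor=2, mode="nearest"),
    nn.Conv2d(64, 3, 3, padding=1),
)
discriminator = nn.Sequential(
    nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
)

bce = nn.BCEWithLogitsLoss()
mse = nn.MSELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

def train_step(lr_img, hr_img):
    """One adversarial training step: content (MSE) loss plus GAN loss."""
    # Discriminator update: real ground-truth patches vs. generated patches.
    d_opt.zero_grad()
    fake = generator(lr_img).detach()
    d_loss = bce(discriminator(hr_img), torch.ones(hr_img.size(0), 1)) + \
             bce(discriminator(fake), torch.zeros(hr_img.size(0), 1))
    d_loss.backward()
    d_opt.step()

    # Generator update: fool the discriminator while staying close to ground truth.
    g_opt.zero_grad()
    sr = generator(lr_img)
    g_loss = mse(sr, hr_img) + 1e-3 * bce(discriminator(sr), torch.ones(lr_img.size(0), 1))
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()

# lr_img: low-resolution satellite patch; hr_img: ground-truth patch at 2x size.
d_l, g_l = train_step(torch.rand(4, 3, 64, 64), torch.rand(4, 3, 128, 128))
```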
123

To drone, or not to drone : A qualitative study in how and when information from UxV should be distributed in rescue missions at sea

Laine, Rickard January 2020 (has links)
Swedish maritime rescue consists of a number of resources from various organizations that must work together to achieve a common goal: saving people in need. Information turns out to be a significant factor in maritime rescue missions. Whether you are a rescuer at the accident scene or are coordinating the rescue mission from the control center, information gives you better situation awareness and knowledge of the situation, which creates better conditions for achieving the goal of the mission. Applying Unmanned Vehicles (UxV) to Swedish maritime rescue adds another resource that can provide additional necessary information. In this study, several methods have been used to find out where in the mission information from UxVs can conceivably contribute. The study identifies three critical situations where there is a need for UxV. This result, in turn, leads to other questions, such as who should be the recipient of the new information and how it affects the information flow as a whole. Information visualization proves to be an important factor here: clear and easily understood visualization can support the recipients of the information in their work without disturbing the flow or coordination of that work.
124

Automatic Gait Recognition : using deep metric learning / Automatisk gångstilsigenkänning

Persson, Martin January 2020 (has links)
Recent improvements in pose estimation have opened up the possibility of new areas of application. One of them is gait recognition, the task of identifying persons based on their unique style of walking, which is increasingly being recognized as an important method of biometric identification. This thesis has explored the possibilities of using a pose estimation system, OpenPose, together with deep Recurrent Neural Networks (RNNs) in order to see if there is sufficient information in sequences of 2D poses for gait recognition. For this to be possible, a new multi-camera dataset consisting of persons walking on a treadmill was gathered, dubbed the FOI dataset. The results show that this approach has some promise. It achieved an overall classification accuracy of 95.5% on classes it had seen during training and 83.8% for classes it had not seen during training. It was, however, unable to recognize sequences from angles it had not seen during training. For that to be possible, more data pre-processing will likely be required.
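The pipeline described above, a recurrent network operating on sequences of 2D poses, can be sketched as follows. This is an illustrative embedding-based variant (layer sizes, embedding dimension and the nearest-neighbour identification step are assumptions); the thesis uses deep metric learning to train such an encoder, which is not shown here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaitEncoder(nn.Module):
    """Encodes a sequence of 2D poses into a fixed-size gait embedding.

    Input:  (batch, frames, 2 * num_keypoints) flattened (x, y) coordinates,
            e.g. 25 keypoints x 2 = 50 values per frame for OpenPose BODY_25.
    Output: (batch, embed_dim) L2-normalised embeddings, suitable for
            metric-learning losses (e.g. triplet loss) or nearest-neighbour ID.
    """
    def __init__(self, in_dim=50, hidden=128, embed_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, embed_dim)

    def forward(self, pose_seq):
        _, (h_n, _) = self.lstm(pose_seq)      # last hidden state of top layer
        return F.normalize(self.head(h_n[-1]), dim=1)

encoder = GaitEncoder()
gallery = encoder(torch.rand(5, 120, 50))      # 5 enrolled identities, 120 frames each
probe = encoder(torch.rand(1, 120, 50))        # unknown walker
scores = probe @ gallery.T                     # cosine similarity (unit-length embeddings)
print("Predicted identity:", scores.argmax(dim=1).item())
```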
125

Evaluation of Face Recognition Accuracy in Surveillance Video

Tuvskog, Johanna January 2020 (has links)
Automatic Face Recognition (AFR) can be useful in the forensic field when identifying people in surveillance footage. AFR systems commonly use deep neural networks, which perform well as long as the quality of the images is maintained at a certain level. This is a problem when applying AFR to surveillance data, since the quality of those images can be very poor. In this thesis the CNN FaceNet has been used to evaluate how different quality parameters influence the accuracy of the face recognition. The goal is to be able to draw conclusions about how to improve the recognition by using and avoiding certain parameters depending on the conditions. Parameters that have been experimented with are angle of the face, image quality, occlusion, colour and lighting. This has been achieved by using datasets with different properties or by altering the images. The parameters are meant to simulate different situations that can occur in surveillance footage and that are difficult for the network to handle. Three different models have been evaluated, with different numbers of embeddings and different training data. The results show that the two models trained on the VGGFace2 dataset perform much better than the one trained on CASIA-WebFace. All models' performance drops on images with low quality compared to images with high quality, because the training data consists mostly of high-quality images. In some cases, the recognition results can be improved by applying some alterations to the images. This could be using one frontal and one profile image when trying to identify a person, or occluding parts of the shape of the face if it gets recognized as other persons with similar face shapes. One main improvement would be to extend the training datasets with more low-quality images. To some extent, this could be achieved by different kinds of data augmentation, such as artificial occlusion and down-sampled images.
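FaceNet maps each detected face to a fixed-length embedding (128- or 512-dimensional depending on the variant), and recognition reduces to comparing embedding distances. Below is a minimal sketch of that comparison step using precomputed embeddings; the 512-dimensional vectors, the 1.1 distance threshold and the gallery names are illustrative assumptions, not values from the thesis.

```python
import numpy as np

def is_same_person(emb_a, emb_b, threshold=1.1):
    """Compare two L2-normalised face embeddings.

    Two crops are declared the same identity when the Euclidean distance
    between their embeddings falls below a threshold; 1.1 is only a typical
    starting point and would be tuned on a validation set.
    """
    dist = np.linalg.norm(emb_a - emb_b)
    return dist < threshold, dist

def identify(probe_emb, gallery):
    """Return the gallery identity whose embedding is closest to the probe."""
    names = list(gallery)
    dists = [np.linalg.norm(probe_emb - gallery[n]) for n in names]
    best = int(np.argmin(dists))
    return names[best], dists[best]

# Toy example with random unit vectors standing in for real embeddings.
rng = np.random.default_rng(1)
def unit(v): return v / np.linalg.norm(v)
gallery = {"person_a": unit(rng.normal(size=512)), "person_b": unit(rng.normal(size=512))}
probe = unit(gallery["person_a"] + 0.1 * rng.normal(size=512))  # noisy re-capture
print(identify(probe, gallery))
```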
126

NAVIGATION AND PLANNED MOVEMENT OF AN UNMANNED BICYCLE

Baaz, Hampus January 2020 (has links)
A conventional bicycle is a stable system given adequate forward velocity. However, the velocity region of stability is limited and depends on the geometric parameters of the bicycle. An autonomous bicycle is not just about maintaining balance but also about controlling where the bicycle is heading. Following paths has been accomplished with bicycles and motorcycles in simulation for a while. Car-like vehicles have followed paths in the real world, but few bicycles or motorcycles have done so. The goal of this work is to follow a planned path using a physical bicycle without exceeding the dynamic limitations of the bicycle. Using an iterative design process, controllers for direction and position are developed and improved. Kinematic models are also compared in their ability to simulate the bicycle movement and in how well controllers tuned in simulation translate to outdoor driving. The results show that the bicycle can follow a turning path on a residential road without human interaction, and that some simulation behaviours do not translate to the real world.
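The kinematic models mentioned above can be illustrated with the standard kinematic bicycle model; the sketch below integrates it with forward Euler. The wheelbase, speed and step size are arbitrary example values, not the parameters of the thesis bicycle.

```python
import math

def step(state, v, steer, wheelbase=1.1, dt=0.05):
    """One forward-Euler step of the kinematic bicycle model.

    state = (x, y, heading); v is forward speed [m/s]; steer is the front
    steering angle [rad]. The model ignores lean dynamics, so it only
    describes where the bicycle goes, not whether it stays upright.
    """
    x, y, heading = state
    x += v * math.cos(heading) * dt
    y += v * math.sin(heading) * dt
    heading += v / wheelbase * math.tan(steer) * dt
    return (x, y, heading)

# Drive forward while holding a constant steering angle: the path is a circle
# of radius wheelbase / tan(steer).
state = (0.0, 0.0, 0.0)
for _ in range(200):                      # 200 steps of 0.05 s = 10 s
    state = step(state, v=3.0, steer=0.1)
print("position after 10 s:", round(state[0], 2), round(state[1], 2))
```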
127

Tracking motion in mineshafts : Using monocular visual odometry

Suikki, Karl January 2022 (has links)
LKAB has a mineshaft trolley used for scanning mineshafts. It is suspended down into a mineshaft by wire, scanning the mineshaft on both descent and ascent using two LiDAR (Light Detection And Ranging) sensors and an IMU (Inertial Measurement Unit) used for tracking the position. With good tracking, one could use the LiDAR scans to create a three-dimensional model of the mineshaft, which could be used for monitoring, planning and visualization in the future. Tracking with an IMU alone is very unstable, since most IMUs are susceptible to disturbances and will drift over time; we strive to track the movement using monocular visual odometry instead. Visual odometry is used to track movement based on video or images. It is the process of retrieving the pose of a camera by analyzing a sequence of images from one or multiple cameras. The mineshaft trolley is also equipped with one camera which films the descent and ascent, and we aim to use this video for tracking. We present a simple algorithm for visual odometry and test its tracking on multiple datasets: KITTI datasets of traffic scenes accompanied by their ground-truth trajectories, mineshaft data intended for the mineshaft trolley operator, and self-captured data accompanied by an approximate ground-truth trajectory. The algorithm is feature based, meaning that it is focused on tracking recognizable keypoints in consecutive images. We compare the performance of our algorithm on the different datasets using two different feature detection and description systems, ORB and SIFT. We find that our algorithm performs well on the KITTI datasets using both ORB and SIFT, whose largest total errors of the estimated trajectories are 3.1 m and 0.7 m respectively over 51.8 m moved, compared to the ground-truth trajectories. The tracking of the self-captured dataset shows by visual inspection that the algorithm can perform well on data which has not been as carefully captured as the KITTI datasets. We find, however, that we cannot track the movement with the current data from the mineshaft. This is due to the algorithm finding too few matching features in consecutive images, breaking the pose estimation of the visual odometry. We compare how ORB and SIFT find features in the mineshaft images and find that SIFT performs better by finding more features. The mineshaft data was never intended for visual odometry and is therefore not suitable for this purpose either. We argue that the tracking could work in the mineshaft if the visual conditions are made better by focusing on more even lighting and camera placement, or if visual odometry can be combined with other sensors, such as an IMU, that assist it when it fails.
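The feature-based pipeline summarized above (keypoint detection, matching between consecutive frames, and pose estimation) can be sketched with OpenCV as follows. This is a minimal illustration rather than the thesis implementation: the intrinsic matrix holds placeholder values, and monocular odometry only recovers translation up to scale.

```python
import cv2
import numpy as np

def relative_pose(img_prev, img_curr, K):
    """Estimate camera rotation and (unit-scale) translation between two frames.

    Detect ORB keypoints, match descriptors, estimate the essential matrix
    with RANSAC and decompose it into R, t. Too few matches breaks the pose
    estimation, which is the failure mode seen in the mineshaft data.
    """
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    if len(matches) < 8:
        raise RuntimeError("too few matches for pose estimation")
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, _ = cv2.findEssentialMat(pts1, pts2, K, cv2.RANSAC, 0.999, 1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    return R, t

# Camera intrinsic matrix with placeholder values (focal length, principal point).
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
# img_prev and img_curr would be consecutive grayscale frames,
# e.g. cv2.imread(path, cv2.IMREAD_GRAYSCALE).
```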
128

Localization of UAVs Using Computer Vision in a GPS-Denied Environment

Aluri, Ram Charan 05 1900 (has links)
The main objective of this thesis is to propose a localization method for a UAV using various computer vision and machine learning techniques. It plays a major role in planning the strategy for the flight and acts as a navigational contingency method in the event of a GPS failure. The implementation of the algorithms employs the high processing capabilities of the graphics processing unit, making it more efficient. The method involves various neural networks working in synergy to perform the localization. This thesis is part of a collaborative project between the University of North Texas, Denton, USA, and the University of Windsor, Ontario, Canada. The localization has been divided into three phases, namely object detection, recognition, and location estimation. Object detection and position estimation are discussed in this thesis, along with a brief overview of the recognition phase. Further, future strategies to aid the UAV in completing its mission in case of an eventuality, such as the introduction of an EDGE server and wireless charging methods, are also briefly introduced.
129

Control of Dynamical Systems subject to Spatio-Temporal Constraints

Charitidou, Maria January 2022 (has links)
Over the last decades, autonomous robots have been considered in a variety of applications such as persistent monitoring, package delivery and cooperative transportation. These applications often require the satisfaction of a set of complex tasks that may need to be performed in a timely manner. For example, in search and rescue missions, UAVs are expected to cover a set of regions within predetermined time intervals in order to increase the probability of identifying the victims of an accident. Spatio-temporal tasks of this form can be easily expressed in Signal Temporal Logic (STL), a predicate language that allows us to formally introduce time-constrained tasks such as "visit area A between 0 and 5 min" or "robot 1 should move in a formation with robot 2 until robot 1 reaches region B between 5 and 20 sec". Existing approaches to control under spatio-temporal tasks encode the STL constraints using mixed-integer expressions. In the majority of these works, receding horizon schemes are designed and long planning horizons are considered that depend on the temporal constraints of the STL tasks. As a result, the complexity of these problems may increase with the number of tasks or the length of the time interval within which an STL task needs to be satisfied. Other approaches consider a limited STL fragment and propose computationally efficient feedback controllers that ensure the satisfaction of the STL task with a minimum, desired robustness. Nevertheless, these approaches do not consider actuation limitations that are always present in real-world systems and thus yield controllers of arbitrarily large magnitude. In this thesis, we consider the control problem under spatio-temporal constraints for systems that are subject to actuation limitations. In the first part, receding horizon control schemes (RHS) are proposed that ensure the satisfaction or minimal violation of a given set of STL tasks. Contrary to existing approaches, the planning horizon of the RHS scheme can be chosen independently of the STL task and hence arbitrarily small, given the initial feasibility of the problem. Combining the advantages of the RHS and feedback strategies, we encode the STL tasks using control barrier functions that are designed either online or offline and design controllers that aim at maximizing the robustness of the STL task. The recursive feasibility property of the framework is established and a lower bound on the violation of the STL formula is derived. In the next part, we consider a multi-agent system that is subject to an STL task whose satisfaction may involve a large number of agents in the team. The goal is then to decompose the global task into local ones, the satisfaction of each of which depends only on a given sub-team of agents. The proposed decomposition method enables the design of decentralized controllers under local STL tasks, avoiding unnecessary communication among agents. In the last part of the thesis, the coordination problem of multiple platoons is considered and related tasks such as splitting, merging and distance maintenance are expressed as Signal Temporal Logic tasks. Then, feedback control techniques are employed ensuring the satisfaction of the STL formula, or alternatively its minimal violation in the presence of actuation limitations. / De senaste årtiondena har autonoma robotar sett en rad nya användningsområden, såsom övervakning, paketleverans och kooperativ transport. Dessa innebär ofta att en samling komplexa uppgifter måste lösas på kort tid. Inom Search and Rescue (SAR), till exempel, krävs att drönare hinner genomsöka vissa geografiska regioner inom givna tidsintervall. Detta för att öka chansen att identifiera drabbade vid en olycka. Den här typen av uppgift i tid och rum (spatio-temporal) kan enkelt uttryckas med hjälp av Signal Temporal Logic (STL). STL är ett språk som tillåter oss att på ett formellt sätt formulera tidsbegränsade uppgifter, såsom "besök område A mellan 0 och 5 minuter" eller "robot 1 ska röra sig i formation tillsammans med robot 2 till dess att robot 1 når område B mellan 5 och 20 sekunder". Nuvarande lösningar till styrproblem av spatio-temporal-typen kodar STL-begränsningar med hjälp av mixed-integer-uttryck. Majoriteten av lösningarna involverar receding-horizon-metoder med långa tidshorisonter som beror av tidsbegränsningarna i STL-uppgifterna. Detta leder till att problemens komplexitet ökar med antalet deluppgifter inom och tiden för STL-uppgifterna. Andra lösningar bygger på restriktiva STL-fragment och beräkningsmässigt effektiva återkopplingsregulatorer som garanterar STL-begränsningarna med minimal önskad robusthet. Dessvärre tar dessa sällan hänsyn till fysiska begränsningar hos regulatorn och ger ofta godtyckligt stora styrsignaler. I den här licentiatuppsatsen behandlar vi styrproblem med begränsningar i rum och tid, samt den ovan nämnda typen av fysiska regulatorbegränsningar. I den första delen presenterar vi receding-horizon-metoder (RHS) som uppfyller kraven i STL-uppgifter, eller minimalt bryter mot dessa. Till skillnad från tidigare lösningar så kan tidshorisonten i våra RHS-metoder väljas oberoende av STL-uppgifterna och därmed göras godtyckligt kort, så länge ursprungsproblemet är lösbart. Genom att formulera STL-uppgifterna som control barrier-funktioner kan vi kombinera fördelarna hos RHS och återkoppling. Vi härleder en rekursiv lösbarhetsegenskap och en undre gräns på överträdelsen av STL-kraven. I den andra delen behandlar vi multi-agent-system med uppgifter i tid och rum som berör många agenter. Målet är att bryta ner den globala uppgiften i fler men enklare lokala uppgifter som var och en bara involverar en given delmängd av agenterna. Vår nedbrytning tillåter oss att konstruera decentraliserade regulatorer som löser lokala STL-uppgifter, och kan i och med det markant minska kommunikationskostnaderna i jämförelse med centraliserad styrning. I den sista delen av uppsatsen behandlar vi samordning av flera grupper. Vi uttrycker uppgifter såsom delning, sammanslagning och avståndshållning med hjälp av STL, och utnyttjar sedan återkoppling för att uppfylla eller minimalt bryta mot kraven.
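As a small illustration of the quantitative (robustness) semantics behind tasks such as "visit area A between 0 and 5 min" in the abstract above, the sketch below evaluates an "eventually" STL predicate over a sampled trajectory. The circular-region predicate, sampling times and threshold interpretation are illustrative assumptions; the thesis works with control barrier functions and receding-horizon formulations rather than this direct offline evaluation.

```python
import numpy as np

def predicate_in_region(x, center, radius):
    """Predicate robustness rho(x) = radius - ||x - center||.
    Positive when x is inside the circular region, negative outside."""
    return radius - np.linalg.norm(x - center)

def eventually_robustness(times, states, a, b, center, radius):
    """Robustness of F_[a,b] (state in region): the maximum predicate
    robustness over all samples whose time stamp lies in [a, b]."""
    vals = [predicate_in_region(x, center, radius)
            for t, x in zip(times, states) if a <= t <= b]
    return max(vals) if vals else -np.inf

# Trajectory sampled at 1 s; task: reach within 1 m of (5, 5) between t=0 and t=5.
times = np.arange(0.0, 8.0, 1.0)
states = np.column_stack([np.linspace(0, 7, 8), np.linspace(0, 7, 8)])
rho = eventually_robustness(times, states, a=0.0, b=5.0,
                            center=np.array([5.0, 5.0]), radius=1.0)
print("robustness:", round(rho, 3))  # > 0 means the task is satisfied with margin rho
```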
130

Multi-Modal Deep Learning with Sentinel-1 and Sentinel-2 Data for Urban Mapping and Change Detection

Hafner, Sebastian January 2022 (has links)
Driven by the rapid growth in population, urbanization is progressing at an unprecedented rate in many places around the world. Earth observation has become an invaluable tool to monitor urbanization on a global scale by either mapping the extent of cities or detecting newly constructed urban areas within and around cities. In particular, the Sentinel-1 (S1) Synthetic Aperture Radar (SAR) and Sentinel-2 (S2) MultiSpectral Instrument (MSI) missions offer new opportunities for urban mapping and urban Change Detection (CD) due to the capability of systematically acquiring wide-swath high-resolution images with frequent revisits globally. Current trends in both urban mapping and urban CD have shifted from employing traditional machine learning methods to Deep Learning (DL) models, specifically Convolutional Neural Networks (CNNs). Recent urban mapping efforts achieved promising results by training CNNs on available built-up data using S2 images. Likewise, DL models have been applied to urban CD problems using S2 data with promising results. However, the quality of current methods strongly depends on the availability of local reference data for supervised training, especially since CNNs applied to unseen areas often produce unsatisfactory results due to their insufficient across-region generalization ability. Since multitemporal reference data are even more difficult to obtain, unsupervised learning was suggested for urban CD. While unsupervised models may perform more consistently across different regions, they often perform considerably worse than their supervised counterparts. To alleviate these shortcomings, it is desirable to leverage Semi-Supervised Learning (SSL) that exploits unlabeled data to improve upon supervised learning, especially because satellite data is plentiful. Furthermore, the integration of SAR data into the current optical frameworks (i.e., data fusion) has the potential to produce models with better generalization ability because the representation of urban areas in SAR images is largely invariant across cities, while spectral signatures vary greatly.  In this thesis, a novel Domain Adaptation (DA) approach using SSL is first presented. The DA approach jointly exploits Multi-Modal (MM) S1 SAR and S2 MSI to improve across-region generalization for built-up area mapping. Specifically, two identical sub-networks are incorporated into the proposed model to perform built-up area segmentation from SAR and optical images separately. Assuming that consistent built-up area segmentation should be obtained across data modalities, an unsupervised loss for unlabeled data that penalizes inconsistent segmentation from the two sub-networks was designed. Therefore, the use of complementary data modalities as real-world perturbations for Consistency Regularization (CR) is proposed. For the final prediction, the model takes both data modalities into account. Experiments conducted on a test set comprised of sixty representative sites across the world showed that the proposed DA approach achieves strong improvements (F1 score 0.694) upon supervised learning from S1 SAR data (F1 score 0.574), S2 MSI data (F1 score 0.580) and their input-level fusion (F1 score 0.651). The comparison with two state-of-the-art global human settlement maps, namely GHS-S2 and WSF2019, showed that our model is capable of producing built-up area maps with comparable or even better quality. For urban CD, a new network architecture for the fusion of SAR and optical data is proposed. 
Specifically, a dual-stream concept was introduced to process the different data modalities separately before combining the extracted features at a later decision stage. The individual streams are based on the U-Net architecture. The proposed strategy outperformed other U-Net-based approaches in combination with uni-modal data and MM data with feature-level fusion. Furthermore, our approach achieved state-of-the-art performance on a popular urban CD dataset (F1 score 0.600). In addition, a new network architecture is proposed to adapt Multi-Modal Consistency Regularization (MMCR) for urban CD. Using bi-temporal S1 SAR and S2 MSI image pairs as input, the MM Siamese Difference (Siam-Diff) Dual-Task (DT) network not only predicts changes using a difference decoder, but also segments buildings for each image with a semantic decoder. The proposed network is trained in a semi-supervised fashion using the underlying idea of MMCR, namely that building segmentation across sensor modalities should be consistent, to learn more robust features. The proposed method was tested on an urban CD task using the 60 sites of the SpaceNet7 dataset. A domain gap was introduced by only using labels for sites located in the Western World, where geospatial data are typically less sparse than in the Global South. MMCR achieved an average F1 score of 0.444 when applied to sites located outside of the source domain, which is a considerable improvement over several supervised models (F1 scores between 0.107 and 0.424). The combined findings of this thesis contribute to the mapping and monitoring of cities on a global scale, which is crucial to support sustainable planning and urban SDG indicator monitoring. / Vår befolkningstillväxt ligger till stor grund för den omfattande urbaniseringstakt som kan observeras runt om i världen idag. Jordobservationer har blivit ett betydelsefullt verktyg för att bevaka urbaniseringen på en global skala genom att antingen kartlägga städernas omfattning eller upptäcka nybyggda stadsområden inom eller runtom städer. Till följd av satellituppdragen Sentinel-1 (S1) Synthetic Aperture Radar (SAR) och Sentinel-2 (S2) MultiSpectral Instrument (MSI) och deras förmåga att systematiskt tillhandahålla breda och högupplösta bilder har vi fått nya möjligheter att kartlägga urbana områden och upptäcka förändringar inom dem, även på frekvent återbesökta platser. Samtida trender inom både urban kartläggning och upptäckter av urbana förändringar har gått från att använda traditionella maskininlärningsmetoder till djupinlärning (DL), särskilt Convolutional Neural Nets (CNNs). De nytillkomna urbana kartläggningsmetoderna har gett lovande resultat genom att träna CNNs med redan tillgänglig urban data och S2-bilder. Likaså har DL-modeller, i kombination med S2-data, tillämpats på de problem som kan uppkomma vid analyser av urbana förändringar. Kvaliteten på de nuvarande metoderna beror dock i stor utsträckning på tillgången av lokal referensdata för övervakad träning. CNNs som tillämpas på nya områden ger ofta otillräckliga resultat på grund av deras oförmåga att generalisera över regioner. Eftersom multitemporala referensdata kan vara svåra att erhålla föreslås oövervakad inlärning för upptäckter av urbana förändringar. Även om oövervakade modeller kan prestera mer konsekvent i olika regioner, presterar de ofta betydligt sämre än sina övervakade motsvarigheter. För att undvika de brister som kan uppkomma är det önskvärt att använda semi-övervakad inlärning (SSL) som nyttjar omärkta data för att förbättra övervakad inlärning, eftersom tillgången på satellitdata är så stor. Dessutom har integrationen av SAR-data i de nuvarande optiska ramverken (så kallad datafusion) potential att producera modeller med bättre generaliseringsförmåga, då representationen av stadsområden i SAR-bilder är i stort sett oföränderlig mellan städer, medan spektrala signaturer varierar mycket. Denna avhandling presenterar först en ny metod för domänanpassning (DA) som använder SSL. Den DA-metod som presenteras kombinerar Multi-Modal (MM) S1 SAR och S2 MSI för att förbättra generaliseringen över regioner vid kartläggning av bebyggda områden. Två identiska undernätverk är inkorporerade i den föreslagna modellen för att få separata urbana kartläggningar från SAR och optiska data. För att erhålla en konsekvent segmentering av bebyggda områden över datamodaliteter utformades en oövervakad komponent för att motverka inkonsekvent segmentering från de två undernätverken. Således föreslås användningen av kompletterande datamodaliteter som verkliga störningar för konsistensregularisering (CR). För det slutgiltiga resultatet tar modellen hänsyn till båda datamodaliteterna. Experiment utförda på en testuppsättning bestående av 60 representativa platser över världen visar att den föreslagna DA-metoden uppnår starka förbättringar (F1 score 0,694) jämfört med övervakad inlärning från S1 SAR-data (F1 score 0,574), S2 MSI-data (F1 score 0,580) och deras sammanslagning på ingångsnivå (F1 score 0,651). I jämförelse med de två främsta globala kartorna över mänskliga bosättningar, GHS-S2 och WSF2019, visade sig vår modell kapabel att producera bebyggelsekartor med jämförbar eller bättre kvalitet. Gällande metoder för upptäckter av urbana förändringar i städer föreslår denna avhandling en ny nätverksarkitektur som sammanslår SAR och optisk data. Mer specifikt presenteras ett dubbelströmskoncept för att bearbeta olika datamodaliteter separat, innan de extraherade funktionerna kombineras i ett senare beslutsstadium. De enskilda strömmarna baseras på U-Net-arkitektur. Strategin överträffade andra U-Net-baserade tillvägagångssätt i kombination med uni-modala data och MM-data med funktionsnivåfusion. Dessutom uppnådde tillvägagångssättet hög prestanda på de problem som orsakas av en frekvent använd datauppsättning för urbana förändringar (F1 score 0,600). Därtill föreslås en ny nätverksarkitektur som anpassar multi-modala konsistensregulariseringar (MMCR) för att upptäcka urbana förändringar. Genom att använda bi-temporala S1 SAR- och S2 MSI-bildpar som indata förutsäger nätverket MM Siamese Difference (Siam-Diff) Dual-Task (DT) inte bara förändringar med hjälp av en skillnadsavkodare, utan kan även segmentera byggnader för varje bild med en semantisk avkodare. Nätverket tränas på ett semi-övervakat sätt med hjälp av MMCR, nämligen att byggnadssegmentering över sensormodaliteter ska vara konsekvent, för att lära sig mer robusta funktioner. Den föreslagna metoden testades på en CD-uppgift med användning av de 60 platserna i SpaceNet7-datauppsättningen. Ett domängap introducerades genom att endast använda etiketter för platser i västvärlden, där geospatiala data vanligtvis är mindre glesa än i Globala Syd. MMCR uppnådde ett genomsnittligt F1 score på 0,444 när det applicerades på platser utanför källdomänen, vilket är en avsevärd förbättring jämfört med flera övervakade modeller (F1 score mellan 0,107 och 0,424). Samtliga resultat från avhandlingen bidrar till kartläggning och övervakning av städer på en global skala, vilket är väsentligt för att kunna bedriva hållbar stadsplanering och övervakning av FN:s globala mål för hållbar utveckling.
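The multi-modal consistency idea described in the English abstract above (penalizing disagreement between the SAR and optical segmentation branches on unlabeled data) can be sketched as a loss term. The toy sub-networks, band counts and loss weighting below are illustrative assumptions, not the U-Net-based architecture used in the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the SAR and optical sub-networks (the thesis uses U-Net-style streams).
sar_net = nn.Conv2d(2, 1, 3, padding=1)    # e.g. S1 VV/VH backscatter -> built-up logit
opt_net = nn.Conv2d(4, 1, 3, padding=1)    # e.g. four S2 bands        -> built-up logit

def supervised_loss(sar_img, opt_img, label):
    """Binary cross-entropy of both streams against the built-up reference label."""
    return (F.binary_cross_entropy_with_logits(sar_net(sar_img), label) +
            F.binary_cross_entropy_with_logits(opt_net(opt_img), label))

def consistency_loss(sar_img, opt_img):
    """Unsupervised term for unlabeled pairs: the two modalities should agree."""
    p_sar = torch.sigmoid(sar_net(sar_img))
    p_opt = torch.sigmoid(opt_net(opt_img))
    return F.mse_loss(p_sar, p_opt)

# One combined objective over a labeled and an unlabeled batch (weight is a guess).
labeled = (torch.rand(2, 2, 64, 64), torch.rand(2, 4, 64, 64),
           torch.randint(0, 2, (2, 1, 64, 64)).float())
unlabeled = (torch.rand(2, 2, 64, 64), torch.rand(2, 4, 64, 64))
loss = supervised_loss(*labeled) + 0.5 * consistency_loss(*unlabeled)
loss.backward()
```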
