Global ETD Search

111	Bone Fragment Segmentation Using Deep Interactive Object Selection Estgren, Martin January 2019 (has links) In recent years semantic segmentation models utilizing Convolutional Neural Networks (CNN) have seen significant success for multiple different segmentation problems. Models such as U-Net have produced promising results within the medical field for both regular 2D and volumetric imaging, rivalling some of the best classical segmentation methods. In this thesis we examined the possibility of using a convolutional neural network-based model to perform segmentation of discrete bone fragments in CT-volumes with segmentation-hints provided by a user. We additionally examined different classical segmentation methods used in a post-processing refinement stage and their effect on the segmentation quality. We compared the performance of our model to similar approaches and provided insight into how the interactive aspect of the model affected the quality of the result. We found that the combined approach of interactive segmentation and deep learning produced results on par with some of the best methods presented, provided there were adequate amount of annotated training data. We additionally found that the number of segmentation hints provided to the model by the user significantly affected the quality of the result, with convergence of the result around 8 provided hints. Machine Learning Convolutional Neural Networks Computer Vision Medical Image Segmentation
112	Defect Detection and OCR on Steel Grönlund, Jakob, Johansson, Angelina January 2019 (has links) In large scale productions of metal sheets, it is important to maintain an effective way to continuously inspect the products passing through the production line. The inspection mainly consists of detection of defects and tracking of ID numbers. This thesis investigates the possibilities to create an automatic inspection system by evaluating different machine learning algorithms for defect detection and optical character recognition (OCR) on metal sheet data. Digit recognition and defect detection are solved separately, where the former compares the object detection algorithm Faster R-CNN and the classical machine learning algorithm NCGF, and the latter is based on unsupervised learning using a convolutional autoencoder (CAE). The advantage of the feature extraction method is that it only needs a couple of samples to be able to classify new digits, which is desirable in this case due to the lack of training data. Faster R-CNN, on the other hand, needs much more training data to solve the same problem. NCGF does however fail to classify noisy images and images of metal sheets containing an alloy, while Faster R-CNN seems to be a more promising solution with a final mean average precision of 98.59%. The CAE approach for defect detection showed promising result. The algorithm learned how to only reconstruct images without defects, resulting in reconstruction errors whenever a defect appears. The errors are initially classified using a basic thresholding approach, resulting in a 98.9% accuracy. However, this classifier requires supervised learning, which is why the clustering algorithm Gaussian mixture model (GMM) is investigated as well. The result shows that it should be possible to use GMM, but that it requires a lot of GPU resources to use it in an end-to-end solution with a CAE. Defect detection OCR Convolutional Autoencoder Faster R-CNN
113	Improving 3D Point Cloud Segmentation Using Multimodal Fusion of Projected 2D Imagery Data : Improving 3D Point Cloud Segmentation Using Multimodal Fusion of Projected 2D Imagery Data He, Linbo January 2019 (has links) Semantic segmentation is a key approach to comprehensive image data analysis. It can be applied to analyze 2D images, videos, and even point clouds that contain 3D data points. On the first two problems, CNNs have achieved remarkable progress, but on point cloud segmentation, the results are less satisfactory due to challenges such as limited memory resource and difficulties in 3D point annotation. One of the research studies carried out by the Computer Vision Lab at Linköping University was aiming to ease the semantic segmentation of 3D point cloud. The idea is that by first projecting 3D data points to 2D space and then focusing only on the analysis of 2D images, we can reduce the overall workload for the segmentation process as well as exploit the existing well-developed 2D semantic segmentation techniques. In order to improve the performance of CNNs for 2D semantic segmentation, the study has used input data derived from different modalities. However, how different modalities can be optimally fused is still an open question. Based on the above-mentioned study, this thesis aims to improve the multistream framework architecture. More concretely, we investigate how different singlestream architectures impact the multistream framework with a given fusion method, and how different fusion methods contribute to the overall performance of a given multistream framework. As a result, our proposed fusion architecture outperformed all the investigated traditional fusion methods. Along with the best singlestream candidate and few additional training techniques, our final proposed multistream framework obtained a relative gain of 7.3\% mIoU compared to the baseline on the semantic3D point cloud test set, increasing the ranking from 12th to 5th position on the benchmark leaderboard. deep learning multimodal fusion multimodality semantic segmentation point cloud segmentation
114	Football Shot Detection using Convolutional Neural Networks Jackman, Simeon January 2019 (has links) In this thesis, three different neural network architectures are investigated to detect the action of a shot within a football game using video data. The first architecture uses con- ventional convolution and pooling layers as feature extraction. It acts as a baseline and gives insight into the challenges faced during shot detection. The second architecture uses a pre-trained feature extractor. The last architecture uses three-dimensional convolution. All these networks are trained using short video clips extracted from football game video streams. Apart from investigating network architectures, different sampling methods are evaluated as well. This thesis shows that amongst the three evaluated methods, the ap- proach using MobileNetV2 as a feature extractor works best. However, when applying the networks to a video stream there are a multitude of challenges, such as false positives and incorrect annotations that inhibit the potential of detecting shots. Convolution Neural Network Football Action Detection MobileNet Shot 3D
115	Multi-camera Computer Vision for Object Tracking: A comparative study Turesson, Eric January 2021 (has links) Background: Video surveillance is a growing area where it can help with deterring crime, support investigation or to help gather statistics. These are just some areas where video surveillance can aid society. However, there is an improvement that could increase the efficiency of video surveillance by introducing tracking. More specifically, tracking between cameras in a network. Automating this process could reduce the need for humans to monitor and review since the tracking can track and inform the relevant people on its own. This has a wide array of usability areas, such as forensic investigation, crime alerting, or tracking down people who have disappeared. Objectives: What we want to investigate is the common setup of real-time multi-target multi-camera tracking (MTMCT) systems. Next up, we want to investigate how the components in an MTMCT system affect each other and the complete system. Lastly, we want to see how image enhancement can affect the MTMCT. Methods: To achieve our objectives, we have conducted a systematic literature review to gather information. Using the information, we implemented an MTMCT system where we evaluated the components to see how they interact in the complete system. Lastly, we implemented two image enhancement techniques to see how they affect the MTMCT. Results: As we have discovered, most often, MTMCT is constructed using a detection for discovering object, tracking to keep track of the objects in a single camera and a re-identification method to ensure that objects across cameras have the same ID. The different components have quite a considerable effect on each other where they can sabotage and improve each other. An example could be that the quality of the bounding boxes affect the data which re-identification can extract. We discovered that the image enhancement we used did not introduce any significant improvement. Conclusions: The most common structure for MTMCT are detection, tracking and re-identification. From our finding, we can see that all the component affect each other, but re-identification is the one that is mostly affected by the other components and the image enhancement. The two tested image enhancement techniques could not introduce enough improvement, but other image enhancement could be used to make the MTMCT perform better. The MTMCT system we constructed did not manage to reach real-time. Multi-target multi-camera tracking tracking re-identification and image enhancement
116	Raising Awareness of Computer Vision : How can a single purpose focused CV solution be improved? Zukas, Paulius January 2018 (has links) The concept of Computer Vision is not new or fresh. On contrary ideas have been shared and worked on for almost 60 years. Many use cases have been found throughout the years and various systems developed, but there is always a place for improvement. An observation was made, that methods used today are generally focused on a single purpose and implement expensive technology, which could be improved. In this report, we are going to go through an extensive research to find out if a professionally sold, expensive software, can be replaced by an off the shelf, low-cost solution entirely designed and developed in-house. To do that we are going to look at the history of Computer Vision, examples of applications, algorithms, and find general scenarios or computer vision problems which can be solved. We are then going take a step further and define solid use cases for each of the scenarios found. Finally, a prototype solution is going to be designed and presented. After analysing the results gathered we are going to reach out to the reader convincing him/her that such application can be developed and work efficiently in various areas saving investments to businesses. software design low-cost computer vision OpenCV C#
117	REDUNDANT FIRMWARE TEST SETUP IN SIMULATION AND HARDWARE: A FEASIBILITY STUDY Ekström, Per, Eriksson, Elisabeth January 2018 (has links) A reliable embedded real-time system has many requirements to fulfil. It must meet target deadlines in a number of situations, most of them in a situation that puts heavy stress on the system. To meet these demands, numerous tests have been created which test the hardware for any possible errors the developers might think of, in order to maximise system reliability and stability. These tests will take a lot of time to execute, and as system complexity grows, more tests are introduced leading to even longer testing times. In this thesis, a method to reduce the testing time of the software and, to a lesser extent, the hardware is examined. By using the full system simulator Simics, an existing industry system from ABB was integrated and tests were performed. A proof of concept test suite for automatic redundancy tests was also implemented. By looking at the test results, it was concluded that the method shows promise. However, problems with the average latency and performance troubles with Simics shows that more work must be put into this research before the system can be run at full speed. Hardware-In-the-Loop HIL Simulation Automated testing Simics Wind River Redundancy Latency
118	Three dimensional object recognition for robot conveyor picking Wikander, Gustav January 2009 (has links) Shape-based matching (SBM) is a method for matching objects in greyscale images. It extracts edges from search images and matches them to a model using a similarity measure. In this thesis we extend SBM to find the tilt and height position of the object in addition to the z-plane rotation and x-y-position. The search is conducted using a scale pyramid to improve the search speed. A 3D matching can be done for small tilt angles by using SBM on height data and extending it with additional steps to calculate the tilt of the object. The full pose is useful for picking objects with an industrial robot. The tilt of the object is calculated using a RANSAC plane estimator. After the 2D search the differences in height between all corresponding points of the model and the live image are calculated. By estimating a plane to this difference the tilt of the object can be calculated. Using the tilt the model edges are tilted in order to improve the matching at the next scale level. The problems that arise with occlusion and missing data have been studied. Missing data and erroneous data have been thresholded manually after conducting tests where automatic filling of missing data did not noticeably improve the matching. The automatic filling could introduce new false edges and remove true ones, thus lowering the score. Experiments have been conducted where objects have been placed at increasing tilt angles. The results show that the matching algorithm is object dependent and correct matches are almost always found for tilt angles less than 10 degrees. This is very similar to the original 2D SBM because the model edges does not change much for such small angels. For tilt angles up to about 25 degrees most objects can be matched and for nice objects correct matches can be done at large tilt angles of up to 40 degrees. object recognition laser triangulation occlusion edge scale-space rotation tilt robot
119	Calibration of Multispectral Sensors Isoz, Wilhelm January 2005 (has links) This thesis describes and evaluates a number of approaches and algorithms for nonuniform correction (NUC) and suppression of fixed pattern noise in a image sequence. The main task for this thesis work was to create a general NUC for infrared focal plane arrays. To create a radiometrically correct NUC, reference based methods using polynomial approximation are used instead of the more common scene based methods which creates a cosmetic NUC. The pixels that can not be adjusted to give a correct value for the incomming radiation are defined as dead. Four separate methods of identifying dead pixels are used to find these pixels. Both the scene sequence and calibration data are used in these identifying methods. The algorithms and methods have all been tested by using real image sequences. A graphical user interface using the presented algorithms has been created in Matlab to simplify the correction of image sequences. An implementation to convert the corrected values from the images to radiance and temperature is also performed. nonuniform correction infrared sensor reference based radiation graphical userinterface
120	Signal- och bildbehandling på moderna grafikprocessorer Pettersson, Erik January 2005 (has links) En modern grafikprocessor är oerhört kraftfull och har en prestanda som potentiellt sett är många gånger högre än för en modern mikroprocessor. I takt med att grafikprocessorn blivit alltmer programmerbar har det blivit möjligt att använda den för beräkningstunga tillämpningar utanför dess normala användningsområde. Inom det här arbetet utreds vilka möjligheter och begränsningar som uppstår vid användandet av grafikprocessorer för generell programmering. Arbetet inriktas främst mot signal- och bildbehandlingstillämpningar men mycket av principerna är tillämpliga även inom andra områden. Ett ramverk för bildbehandling implementeras och några algoritmer inom bildanalys realiseras och utvärderas, bland annat stereoseende och beräkning av optiskt flöde. Resultaten visar på att vissa tillämpningar kan uppvisa en avsevärd prestandaökning i en grafikprocessor jämfört med i en mikroprocessor men att andra tillämpningar kan vara ineffektiva eller mycket svåra att implementera. / The modern graphical processing unit, GPU, is an extremely powerful unit, potentially many times more powerful than a modern microprocessor. Due to its increasing programmability it has recently become possible to use it in computation intensive applications outside its normal usage. This work investigates the possibilities and limitations of general purpose programming on GPUs. The work mainly concentrates on signal and image processing although much of the principles are applicable to other areas as well. A framework for image processing on GPUs is implemented and a few computer vision algorithms are implemented and evaluated, among them stereo vision and optical flow. The results show that some applications can gain a substantial speedup when implemented correctly in the GPU but others can be inefficent or extremly hard to implement. GPU GPGPU image processing computer vision stereo vision optical flow

Search results