31

Designing a Lightweight Convolutional Neural Network for Onion and Weed Classification

Bäckström, Nils January 2018 (has links)
The data set for this project consists of images containing onion and weed samples. It is of interest to investigate whether Convolutional Neural Networks can learn to classify the crops correctly as a step towards automating weed removal in farming. The aim of this project is to solve a classification task involving few classes with relatively few training samples (a few hundred per class). Small data sets are usually prone to overfitting, meaning that the networks generalize badly to unseen data. It is also of interest to solve the problem using small networks with low computational complexity, since inference speed is important and memory is often limited on deployable systems. This work shows how transfer learning, network pruning and quantization can be used to create lightweight networks whose classification accuracy exceeds that of the same architecture trained from scratch. Using these techniques, a SqueezeNet v1.1 architecture (already a relatively small network) can be reduced to 1/10th of the original model size and fewer than half the MAC operations during inference, while still maintaining a higher classification accuracy than a SqueezeNet v1.1 trained from scratch (96.9±1.35% vs 92.0±3.11% on 5-fold cross-validation).
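As a rough illustration of the pipeline this abstract describes, the sketch below shows the transfer-learning and magnitude-pruning steps on SqueezeNet v1.1 in PyTorch. It is a minimal sketch under assumptions: the two-class output and the 50% pruning amount are illustrative choices, not the thesis's actual settings.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision import models

NUM_CLASSES = 2  # assumed: onion vs. weed

# Transfer learning: start from ImageNet weights, not random init.
model = models.squeezenet1_1(weights=models.SqueezeNet1_1_Weights.DEFAULT)

# SqueezeNet's classifier is a final 1x1 convolution; replace it so the
# output matches the crop classes, then fine-tune on the onion/weed data.
model.classifier[1] = nn.Conv2d(512, NUM_CLASSES, kernel_size=1)
model.num_classes = NUM_CLASSES

# After fine-tuning, prune the smallest-magnitude conv weights to shrink
# the model (50% per layer here is an assumed, illustrative amount).
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # make the sparsity permanent
```

Quantization, the third technique named, would follow as a separate post-training step; PyTorch's static quantization workflow is one way to realize it, though the thesis does not specify its toolchain.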
32

Prototyping an automated robotic shopping cart with visual perception

Norell, Jakob January 2018 (has links)
Intelligent autonomous robots are expected to become more common in the future and are a topic of interest for science and industry. Instead of letting the customer pull a heavy cart by hand, an intelligent robotic shopping cart can aid a customer with their shopping by automatically following them. For this purpose, a prototype of an automated robotic shopping cart was implemented on the Robotino 3 system, using tools from the programming environment Robotino View created by FESTO. Some tools were used for computer vision to identify a customer bearing a colored symbol. The symbol could be uniquely designed for an individual customer, and the identification was not sensitive to external light disturbances, thanks to two lamps attached to the symbol. Collision avoidance was implemented with IR sensors using scripts written in Lua, based on a version of the Bug2 algorithm. Distances to obstacles and to the customer were accurately determined by combining information from these two implementations. The robot successfully followed a human while avoiding obstacles in its way. After moving towards the customer, it safely stopped close by, making it possible for the customer to place an object in the shopping cart. The Robotino used a comprehensible routine such that the customer and the Robotino understood each other's intentions.
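As a sketch of the colour-symbol identification step, the snippet below shows one common OpenCV approach: threshold the frame in HSV space and take the centroid of the largest matching blob. The HSV range is a hypothetical stand-in; the thesis implements this with Robotino View tools rather than OpenCV, so this only illustrates the principle.

```python
import cv2
import numpy as np

# Hypothetical HSV range for the lamp-lit symbol; the two lamps keep its
# appearance stable under ambient light, so a fixed range is plausible.
LOWER = np.array([100, 120, 80])
UPPER = np.array([130, 255, 255])

def find_symbol(frame):
    """Return the (x, y) image centroid of the coloured symbol, or None."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER, UPPER)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    m = cv2.moments(largest)
    if m["m00"] == 0:
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])
```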
33

Online Learning for Robot Vision

Öfjäll, Kristoffer January 2014 (has links)
In tele-operated robotics applications, the primary information channel from the robot to its human operator is a video stream. For autonomous robotic systems however, a much larger selection of sensors is employed, although the most relevant information for the operation of the robot is still available in a single video stream. The issue lies in autonomously interpreting the visual data and extracting the relevant information, something humans and animals perform strikingly well. On the other hand, humans have great difficulty expressing what they are actually looking for on a low level, suitable for direct implementation on a machine. For instance, objects tend to be already detected when the visual information reaches the conscious mind, with almost no clues remaining regarding how the object was identified in the first place. This became apparent already when Seymour Papert gathered a group of summer workers to solve the computer vision problem 48 years ago [35]. Artificial learning systems can overcome this gap between the level of human visual reasoning and low-level machine vision processing. If a human teacher can provide examples of what is to be extracted, and if the learning system is able to extract the gist of these examples, the gap is bridged. There are however some special demands on a learning system for it to perform successfully in a visual context. First, low-level visual input is often of high dimensionality, such that the learning system needs to handle large inputs. Second, visual information is often ambiguous, such that the learning system needs to be able to handle multimodal outputs, i.e. multiple hypotheses. Typically, the relations to be learned are non-linear, and there is an advantage if data can be processed at video rate, even after presenting many examples to the learning system. In general, there seems to be a lack of such methods. This thesis presents systems for learning perception-action mappings for robotic systems with visual input. A range of problems are discussed, such as vision-based autonomous driving, inverse kinematics of a robotic manipulator and controlling a dynamical system. Operational systems demonstrating solutions to these problems are presented. Two different approaches for providing training data are explored: learning from demonstration (supervised learning) and explorative learning (self-supervised learning). A novel learning method fulfilling the stated demands is presented. The method, qHebb, is based on associative Hebbian learning on data in channel representation. Properties of the method are demonstrated on a vision-based autonomously driving vehicle, where the system learns to directly map low-level image features to control signals. After an initial training period, the system seamlessly continues autonomously. In a quantitative evaluation, the proposed online learning method performed comparably with state-of-the-art batch learning methods.
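To make the channel-representation idea concrete, here is a minimal sketch of encoding a scalar into overlapping smooth channels and a basic Hebbian outer-product update. The cos² kernel and its spacing follow common practice in the channel-representation literature; the exact kernel, normalization and update rule in qHebb may differ.

```python
import numpy as np

def channel_encode(x, n_channels=8, lo=0.0, hi=1.0):
    """Encode a scalar as overlapping cos^2 channel activations (a common
    basis choice; qHebb's exact kernel and spacing may differ)."""
    centers = np.linspace(lo, hi, n_channels)
    width = 1.5 * (hi - lo) / (n_channels - 1)   # overlapping support
    d = np.abs(x - centers)
    return np.where(d < width, np.cos(np.pi * d / (2 * width)) ** 2, 0.0)

def hebbian_update(C, a, b, eta=0.1):
    """One associative Hebbian step: strengthen links between the encoded
    input a and encoded output b in the linkage matrix C."""
    return C + eta * np.outer(b, a)

# Decoding C @ channel_encode(x) yields a channel vector over outputs that
# can represent several hypotheses at once, the multimodality noted above.
```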
34

Global Pose Estimation from Aerial Images : Registration with Elevation Models

Grelsson, Bertil January 2014 (has links)
Over the last decade, the use of unmanned aerial vehicles (UAVs) has increased drastically. Originally, the use of these aircraft was mainly military, but today many civil applications have emerged. UAVs are frequently the preferred choice for surveillance missions in disaster areas, after earthquakes or hurricanes, and in hazardous environments, e.g. for detection of nuclear radiation. The UAVs employed in these missions are often relatively small in size, which implies payload restrictions. For navigation of the UAVs, continuous global pose (position and attitude) estimation is mandatory. Cameras can be fabricated both small in size and light in weight. This makes vision-based methods well suited for pose estimation onboard these vehicles. It is obvious that no single method can be used for pose estimation in all the different phases throughout a flight. The image content will be very different on the runway, during ascent, during flight at low or high altitude, above urban or rural areas, etc. In total, a multitude of pose estimation methods is required to handle all these situations. Over the years, a large number of vision-based pose estimation methods for aerial images have been developed. But there are still open research areas within this field, e.g. the use of omnidirectional images for pose estimation is relatively unexplored. The contributions of this thesis are three vision-based methods for global ego-positioning and/or attitude estimation from aerial images. The first method, for full 6DoF (degrees of freedom) pose estimation, is based on registration of local height information with a geo-referenced 3D model. A dense local height map is computed using motion stereo. A pose estimate from navigation sensors is used as an initialization. The global pose is inferred from the 3D similarity transform between the local height map and the 3D model. Aligning height information is assumed to be more robust to season variations than feature matching in a single-view-based approach. The second contribution is a method for attitude (pitch and roll angle) estimation via horizon detection. It is one of only a few methods in the literature that use an omnidirectional (fisheye) camera for horizon detection in aerial images. The method is based on edge detection and a probabilistic Hough voting scheme. In a flight scenario, there is often some knowledge of the probability density for the altitude and the attitude angles. The proposed method allows this prior information to be used to make the attitude estimation more robust. The third contribution is a further development of the second method. It is the very first method presented where the attitude estimates from the detected horizon in omnidirectional images are refined through registration with the geometrically expected horizon from a digital elevation model. It is one of few methods where the ray refraction in the atmosphere is taken into account, which contributes to the highly accurate pose estimates. The attitude errors obtained are about one order of magnitude smaller than for any previous vision-based method for attitude estimation from horizon detection in aerial images.
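The edge-plus-voting idea behind the second contribution can be sketched for the simpler pinhole case, where the horizon projects to a straight line. This is only an illustration: the thesis uses a fisheye camera, where the horizon is a conic, and a probabilistic voting scheme that incorporates altitude and attitude priors.

```python
import cv2
import numpy as np

def rough_horizon_roll(gray):
    """Pinhole approximation: take the strongest Hough line as the horizon
    and read a roll-angle proxy off its tilt. Thresholds are illustrative."""
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, 150)
    if lines is None:
        return None
    rho, theta = lines[0][0]            # line with the most edge votes
    return np.degrees(theta) - 90.0     # tilt relative to horizontal
```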
35

Development and Evaluation of a Kinect based Bin-Picking System

Mishra, Chintan, Khan, Zeeshan January 2015 (has links)
No description available.
36

Single and multiple stereo view navigation for planetary rovers

Bartolome, D R 08 October 2013 (has links)
This thesis deals with the challenge of autonomous navigation of the ExoMars rover. The absence of global positioning systems (GPS) in space, added to the limitations of wheel odometry, makes autonomous navigation based on these two techniques - as done in the literature - an inviable solution and necessitates the use of other approaches. That, among other reasons, motivates this work to use solely visual data to solve the robot's egomotion problem. The homogeneity of Mars' terrain makes the robustness of the low-level image processing technique a critical requirement. In the first part of the thesis, novel solutions are presented to tackle this specific problem. Detection of features that are robust against illumination changes, together with unique matching and association of features, is a sought-after capability. A solution for robustness of features against illumination variation is proposed, combining Harris corner detection with moment image representation. Whereas the first provides a technique for efficient feature detection, the moment images add the necessary brightness invariance. Moreover, a bucketing strategy is used to guarantee that features are homogeneously distributed within the images. Then, the addition of local feature descriptors guarantees the unique identification of image cues. In the second part, reliable and precise motion estimation for the Mars robot is studied. A number of successful approaches are thoroughly analysed. Visual Simultaneous Localisation And Mapping (VSLAM) is investigated, proposing enhancements and integrating it with the robust feature methodology. Then, linear and nonlinear optimisation techniques are explored. Alternative photogrammetry reprojection concepts are tested. Lastly, data fusion techniques are proposed to deal with the integration of multiple stereo view data. Our robust visual scheme allows good feature repeatability. Because of this, dimensionality reduction of the feature data can be used without compromising the overall performance of the proposed solutions for motion estimation. Also, the developed egomotion techniques have been extensively validated using both simulated and real data collected at ESA-ESTEC facilities. Multiple stereo view solutions for robot motion estimation are introduced, presenting interesting benefits. The obtained results prove the innovative methods presented here to be accurate and reliable approaches capable of solving the egomotion problem in a Mars environment. / © Cranfield University
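The bucketing strategy mentioned here is straightforward to illustrate: divide the image into a grid and keep only the strongest Harris responses in each cell, so features cannot cluster in one textured corner of the frame. The sketch below uses assumed grid and detector parameters and omits the moment-image descriptors the thesis pairs with the corners.

```python
import cv2
import numpy as np

def bucketed_harris(gray, grid=(8, 8), per_cell=5):
    """Return (x, y) corners, keeping at most `per_cell` per grid cell."""
    response = cv2.cornerHarris(np.float32(gray), 2, 3, 0.04)
    h, w = gray.shape
    ch, cw = h // grid[0], w // grid[1]
    keypoints = []
    for gy in range(grid[0]):
        for gx in range(grid[1]):
            cell = response[gy * ch:(gy + 1) * ch, gx * cw:(gx + 1) * cw]
            idx = np.argsort(cell, axis=None)[-per_cell:]  # strongest responses
            ys, xs = np.unravel_index(idx, cell.shape)
            keypoints += [(gx * cw + x, gy * ch + y) for x, y in zip(xs, ys)]
    return keypoints
```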
37

Evaluation of Optical Flow for Estimation of Liquid Glass Flow Velocity

Rudin, Malin January 2021 (has links)
In the glass wool industry, the molten glass flow is monitored for regulation purposes. Given the progress in the computer vision field, the current monitoring solution might be replaced by a camera-based solution. The aim of this thesis is to investigate the possibility of using optical flow techniques for estimation of the molten glass flow displacement. Three glass melt flow datasets were recorded, as well as two additional melt flow datasets, using a NIR camera. The block matching techniques Full Search (FS) and Adaptive Rood Pattern Search (ARPS), as well as the local feature methods ORB and A-KAZE, were considered. These four techniques were compared to RAFT, the state-of-the-art approach for optical flow estimation, using available pre-trained models, as well as an approach using the tracking method ECO for optical flow estimation. The methods have been evaluated using the metrics MAE, MSE, and SSIM to compare the warped flow to the target image. In addition, ground truth for 50 frames from each dataset was manually annotated in order to use the optical flow metric End-Point Error. To investigate the computational complexity, the average computation time per frame was calculated. The investigation found that RAFT does not perform well on the given data, due to the large displacements of the flows. For simulated displacements of up to about 100 pixels at full resolution, the performance is satisfactory, with results comparable to the traditional methods. Using ECO for optical flow estimation encounters similar problems to RAFT, as the large displacements proved challenging for the tracker. Simulating smaller motions of up to 60 pixels resulted in good performance, though the computation time of the implementation used is much too high for real-time operation. The four traditional block matching and local feature approaches examined in this thesis outperform the state-of-the-art approaches. FS, ARPS, A-KAZE, and ORB all perform similarly on the glass flow datasets, whereas the block matching approaches fail on the alternative melt flow data because the template extraction approach is inadequate. The two local feature approaches, though working reasonably well on all datasets at full resolution, struggle to identify features on down-sampled data. This might be mitigated by fine-tuning the settings of the methods. Generally, ORB mostly outperforms A-KAZE with respect to the evaluation metrics, and is considerably faster.
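For reference, the Full Search baseline compared here is exhaustive block matching: slide the template over a search window and keep the displacement with the lowest sum of absolute differences. The sketch below is simplified: it scans only in the positive direction from the frame origin, whereas a real search window would be centred on the template's previous position.

```python
import numpy as np

def full_search(template, frame, search=100):
    """Exhaustive SAD block matching; `search` caps the displacement in
    pixels (the thesis reports flows of roughly this magnitude)."""
    t = template.astype(np.int32)   # avoid uint8 wrap-around in subtraction
    f = frame.astype(np.int32)
    th, tw = t.shape
    best, best_dxy = np.inf, (0, 0)
    for dy in range(min(search, f.shape[0] - th) + 1):
        for dx in range(min(search, f.shape[1] - tw) + 1):
            sad = np.abs(f[dy:dy + th, dx:dx + tw] - t).sum()
            if sad < best:
                best, best_dxy = sad, (dx, dy)
    return best_dxy
```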
38

Image-based fashion recommender systems : Considering Deep learning role in computer vision development

Shirkhani, Shaghayegh January 2021 (has links)
Fashion is perceived as a meaningful way of self-expression that people use for different purposes. It seems to be an integral part of every person in modern societies, from everyday life to exceptional events and occasions. Fashionable products are highly demanded, and consequently, fashion is perceived as a desirable and profitable industry. Although this massive demand for fashion products provides an excellent opportunity for companies to invest in fashion-related sectors, they also face different challenges in answering their customers' needs. Fashion recommender systems have been introduced to address these needs. This thesis aims to provide deeper insight into the fashion recommender system domain by conducting a comprehensive literature review of more than 100 papers in this field, focusing on image-based fashion recommender systems in light of computer vision advancements. The domain-specific characteristics of fashion, the subtle notions of this domain and their relevancy have been conceptualized. Four main tasks in image-based fashion recommender systems have been recognized: cloth-item retrieval, complementary item recommendation, outfit recommendation, and capsule wardrobes. An evolution trajectory of image-based fashion recommender systems with respect to computer vision advancements is illustrated, consisting of three main eras and the most recent developments. Finally, a comparison between traditional computer vision techniques and deep learning-based approaches has been made. Although the main objective of this literature review was to provide a comprehensive, integrated overview of research in this field, there is still a need for further studies considering image-based fashion recommender systems from a more practical perspective.
39

Classification of black plastic granulates using computer vision

Persson, Anton, Dymne, Niklas January 2021 (has links)
Pollution and climate change are some of the biggest challenges facing humanity, and for a sustainable future, recycling is needed. Plastic is a big part of the recycled material today, but there are problems that the recycling world is facing. Modern-day recycling facilities can handle plastics of all colours except black. For this reason, most recycling companies have resorted to methods unaffected by colour, like the method used at Stena Nordic Recycling Central. Because the individual plastics cannot be identified, Stena Nordic Recycling Central has to wait until an entire bag of plastic granulates has been run through the production line and sorted before testing its purity using a chemical method. Finding out whether the electrostatic divider settings are correct using this testing method is costly and causes many re-runs. If the divider setting could be validated at an earlier stage, it would save both time and the number of re-runs needed. This thesis aims to create a system that can classify different types of plastics by using image analysis. Two computer vision techniques are explored: the RGB method (see 3.3.2) and machine learning (see 3.3.4) using transfer learning with an AlexNet. The aim is an accuracy of at least 95% when classifying the plastic granulates. The convolutional neural network used in this thesis is an AlexNet. The choice of method to explore further is decided in the method part of this thesis. The results of the RGB method were inconclusive (see section 4.2): it was not clear whether one plastic was blacker than the other. This uncertainty, and the fact that a convolutional neural network takes more features than just RGB into account (discussed in section 3.3), makes the convolutional neural network the method explored further in this thesis. The convolutional neural network's training yielded 95% accuracy in classifying the plastic granulates. A separate test is also needed to make sure the accuracy is close to the network accuracy. The result from the stand-alone test was 86.6% accuracy, where the plastic type polystyrene had a subpar result of 73.3%, against 100% accuracy when classifying acrylonitrile butadiene styrene. The results from the convolutional neural network show that black plastics can be classified using machine learning, which could be an excellent solution for classifying and recycling black plastics if further research in the field is conducted.
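The transfer-learning setup described, reusing a pretrained AlexNet and retraining only the classifier for the plastic types, looks roughly like the PyTorch sketch below. The thesis does not state its framework or exact layer choices, so treat the frozen backbone and two-class head as assumptions.

```python
import torch.nn as nn
from torchvision import models

NUM_PLASTIC_CLASSES = 2  # assumed: e.g. polystyrene vs. ABS

model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
for p in model.features.parameters():
    p.requires_grad = False          # keep the pretrained conv filters
# Swap the final fully connected layer for the granulate classes.
model.classifier[6] = nn.Linear(4096, NUM_PLASTIC_CLASSES)
```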
40

Point Cloud Data Augmentation for Safe 3D Object Detection using Geometric Techniques

Kapoor, Shrayash January 2021 (has links)
Background: Autonomous navigation has become increasingly popular. This surge in popularity has generated a lot of interest in sensor technologies, driving their cost down, which in turn has spurred developments in deep learning for computer vision. There is, however, not a lot of available, adaptable research on performing data augmentation directly on point cloud data, independent of the training process. This thesis focuses on the impact of point cloud augmentation techniques on 3D object detection quality. Objectives: The objectives of this thesis are to evaluate the efficiency of geometric data augmentation techniques for point cloud data. The identified techniques are then implemented on a 3D object detector, and the results obtained are compared on selected metrics. Methods: This thesis uses two literature reviews: one to find appropriate point cloud techniques for data augmentation, and one to find a 3D object detector on which to implement it. Subsequently, an experiment is performed to quantitatively discern how much improvement augmentation offers in detection quality. Metrics used to compare the algorithms include precision, recall, average precision, mean average precision, memory usage and training time. Results: The literature review results indicate flipping, scaling, translation and rotation to be ideal candidates for geometric data augmentation, and ComplexYOLO to be a capable detector for 3D object detection. Experimental results indicate that, at the expense of some training time, the developed library "Aug3D" can boost the detection quality of the ComplexYOLO algorithm. Conclusions: After analysis of the results, it was found that the implementation of geometric data augmentations (namely flipping, translation, scaling and rotation) yielded an increase of over 50% in mean average precision for the ComplexYOLO 3D detection model on the Car and Pedestrian classes.
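The four geometric augmentations identified here are simple rigid and similarity transforms of the raw points. A minimal NumPy sketch, with illustrative ranges that are not necessarily those used in Aug3D:

```python
import numpy as np

def flip_y(points):
    """Mirror an (N, 3) cloud across the x-z plane."""
    out = points.copy()
    out[:, 1] *= -1
    return out

def scale(points, lo=0.95, hi=1.05):
    return points * np.random.uniform(lo, hi)

def translate(points, sigma=0.2):
    return points + np.random.normal(0.0, sigma, size=(1, 3))

def rotate_z(points, max_deg=45.0):
    a = np.radians(np.random.uniform(-max_deg, max_deg))
    rot = np.array([[np.cos(a), -np.sin(a), 0.0],
                    [np.sin(a),  np.cos(a), 0.0],
                    [0.0,        0.0,       1.0]])
    return points @ rot.T

# For detection training, the same transform must also be applied to the
# ground-truth box centres and yaw angles, which an augmentation library
# would also need to handle.
```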
