RGB-D SLAM : an implementation framework based on the joint evaluation of spatial velocities

In pursuit of a fully automated navigation system capable of operating in dynamic environments, a large amount of research is being devoted to systems that use visual-odometry-assisted methods to estimate the position of a platform with respect to its surrounding environment. This includes systems that know the environment a priori and systems that do not, as both rely on the same methods for localisation. For the combined problem of localisation and mapping, Simultaneous Localisation and Mapping (SLAM) is the de facto choice, and in recent years, with the advent of colour and depth (RGB-D) sensors, RGB-D SLAM has become a hot topic for research.

Most current research focuses on improving the overall system accuracy or, more specifically, the performance with regard to the overall trajectory error. While this approach quantifies the performance of the system as a whole, the individual frame-to-frame performance is often not mentioned or explored properly. Although frame-to-frame performance ties directly into the overall performance, the level of scene cohesion between two successive observations can vary greatly over a single dataset of observations.

The focus of this dissertation will be the relevant levels of translational and rotational velocities experienced by the sensor between two successive observations and the effect on the final accuracy of the SLAM implementation. The frame rate will specifically be used to alter and evaluate the different spatial velocities experienced over multiple datasets of RGB-D data.

Two systems were developed to illustrate and evaluate the potential of various approaches to RGB-D SLAM. The first system is a real-world implementation where SLAM is used to localise and map the environment surrounding a quadcopter platform. A Microsoft Kinect is mounted directly on the quadcopter and provides an RGB-D data stream to a remote processing terminal. This terminal runs a SLAM implementation that can alternate between different visual odometry methods. The remote terminal acts as the position controller for the quadcopter, replacing the need for a direct human operator. A semi-automated system is implemented that allows a human operator to designate waypoints within the environment that the quadcopter moves to.

The second system uses a series of publicly available RGB-D datasets with their accompanying ground-truth readings to simulate a real RGB-D data stream. This is used to evaluate the performance of the various RGB-D SLAM approaches to visual odometry. For each of the datasets, the accompanying translational and angular velocity on a frame-to-frame basis can be calculated. This can, in turn, be used to evaluate the frame-to-frame accuracy of the SLAM implementation, where the spatial velocity can be manually altered by occluding frames within the sequence. Thus, an accurate relationship can be calculated between the frame rate, the spatial velocity and the performance of the SLAM implementation.
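The frame-to-frame calculation described above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: it assumes ground-truth poses given as a timestamp, a position in metres and a unit quaternion in (x, y, z, w) order, and it assumes that altering the frame rate amounts to keeping every n-th pose. The names `frame_to_frame_motion`, `keep_every` and `synthetic_poses` are hypothetical, and the trajectory is synthetic.

```python
import math


def quat_mul(a, b):
    """Hamilton product of two quaternions given as (x, y, z, w)."""
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )


def relative_angle_deg(q_a, q_b):
    """Rotation angle, in degrees, that takes orientation q_a to q_b."""
    q_a_inv = (-q_a[0], -q_a[1], -q_a[2], q_a[3])  # conjugate of a unit quaternion
    x, y, z, w = quat_mul(q_a_inv, q_b)
    return math.degrees(2.0 * math.atan2(math.sqrt(x * x + y * y + z * z), abs(w)))


def frame_to_frame_motion(poses, keep_every=1):
    """Motion between successive kept poses.

    poses: list of (timestamp_s, (x, y, z), (qx, qy, qz, qw)).
    keep_every: keep every n-th pose to emulate a lower frame rate (assumption).
    """
    kept = poses[::keep_every]
    motion = []
    for (t0, p0, q0), (t1, p1, q1) in zip(kept, kept[1:]):
        dt = t1 - t0
        dist = math.dist(p0, p1)
        angle = relative_angle_deg(q0, q1)
        motion.append({
            "translation_m": dist,        # displacement between the two frames
            "rotation_deg": angle,        # rotation between the two frames
            "linear_mps": dist / dt,      # translational velocity
            "angular_dps": angle / dt,    # angular velocity
        })
    return motion


def synthetic_poses(n, fps=30.0, step_m=0.01, step_deg=1.0):
    """Made-up trajectory: constant motion along x and rotation about z."""
    poses = []
    for i in range(n):
        half = math.radians(step_deg * i) / 2.0
        poses.append((i / fps, (step_m * i, 0.0, 0.0),
                      (0.0, 0.0, math.sin(half), math.cos(half))))
    return poses


if __name__ == "__main__":
    poses = synthetic_poses(10)
    for keep in (1, 3):
        first = frame_to_frame_motion(poses, keep_every=keep)[0]
        print(keep, first)
```

In this sketch, keeping every third pose triples the translation and rotation seen between two successive frames, while the velocity per second is unchanged. That per-frame quantity is what the frame rate controls.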

Three image processing techniques were used to implement the visual odometry for RGB-D SLAM. SIFT, SURF and ORB were compared across eight of the TUM database datasets. SIFT had the best performance, with a 30% increase over SURF and double the performance of ORB. By implementing SIFT using CUDA, the feature detection and description process takes only 18 ms, negating the disadvantage that SIFT has compared to SURF and ORB. The RGB-D SLAM implementation was compared to four prominent research papers and showed comparable results. The effect of rotation and translation was evaluated per rotation and translation axis. It was found that the z-axis (scale) and the roll-axis (scene orientation) have a lower effect on the average RPE error on a frame-to-frame basis. It was also found that rotation has a much greater impact on the performance when rotation and translation are evaluated separately. On average, a rotation of 1 deg resulted in a 4 mm translation error and a 20% rotation error, whereas a translation of 10 mm resulted in a rotation error of 0.2 deg and a translation error of 45%. The combined effect of rotation and translation had a multiplicative effect on the error metric.
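A frame-to-frame relative pose error of the kind reported above can be sketched as follows. This is only an illustration under stated assumptions: it uses the relative pose error as commonly defined for the TUM benchmark (the estimated motion between two frames compared against the ground-truth motion), which may differ in detail from the evaluation used in the dissertation. Poses are assumed to be a position in metres with a unit quaternion in (x, y, z, w) order. The function names and the numbers in the example are hypothetical and are not data from the dissertation.

```python
import math


def q_mul(a, b):
    """Hamilton product of two quaternions given as (x, y, z, w)."""
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )


def q_conj(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    return (-q[0], -q[1], -q[2], q[3])


def q_rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q."""
    x, y, z, _ = q_mul(q_mul(q, (v[0], v[1], v[2], 0.0)), q_conj(q))
    return (x, y, z)


def relative_motion(pose_a, pose_b):
    """Motion from pose_a to pose_b, expressed in the frame of pose_a."""
    (p_a, q_a), (p_b, q_b) = pose_a, pose_b
    q_a_inv = q_conj(q_a)
    diff = tuple(b - a for a, b in zip(p_a, p_b))
    return q_rotate(q_a_inv, diff), q_mul(q_a_inv, q_b)


def frame_rpe(gt_a, gt_b, est_a, est_b):
    """Relative pose error for one frame pair: (translation in m, rotation in deg)."""
    t_gt, q_gt = relative_motion(gt_a, gt_b)
    t_est, q_est = relative_motion(est_a, est_b)
    q_gt_inv = q_conj(q_gt)
    # Error transform: inverse(ground-truth motion) composed with estimated motion.
    t_err = q_rotate(q_gt_inv, tuple(e - g for g, e in zip(t_gt, t_est)))
    x, y, z, w = q_mul(q_gt_inv, q_est)
    trans_err = math.sqrt(sum(c * c for c in t_err))
    rot_err = math.degrees(2.0 * math.atan2(math.sqrt(x * x + y * y + z * z), abs(w)))
    return trans_err, rot_err


def quat_about_z(angle_deg):
    """Unit quaternion for a rotation of angle_deg about the z-axis."""
    half = math.radians(angle_deg) / 2.0
    return (0.0, 0.0, math.sin(half), math.cos(half))


if __name__ == "__main__":
    # Illustrative values only: true motion 10 mm and 1 deg, estimate 14 mm and 1.2 deg.
    origin = ((0.0, 0.0, 0.0), quat_about_z(0.0))
    gt_next = ((0.010, 0.0, 0.0), quat_about_z(1.0))
    est_next = ((0.014, 0.0, 0.0), quat_about_z(1.2))
    print(frame_rpe(origin, gt_next, origin, est_next))
```

Dividing each error by the corresponding ground-truth motion between the two frames gives a relative error of the kind quoted as a percentage above.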

The quadcopter platform designed to work with the SLAM implementation did not function ideally, but it was sufficient for the purpose. The quadcopter is able to self-stabilise within the environment, given a spacious area. In smaller, enclosed areas the backdraft generated by the quadcopter motors led to some instability in the system. A frame-to-frame error of 40.34 mm and 1.93 deg was estimated for the quadcopter system.

Dissertation (MEng)--University of Pretoria, 2017. / Electrical, Electronic and Computer Engineering / MEng / Unrestricted

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:up/oai:repository.up.ac.za:2263/64524
Date: January 2017
Creators: Coppejans, Hugo Herman Godelieve
Contributors: Myburgh, Hermanus Carel, hcoppejans@gmail.com
Publisher: University of Pretoria
Source Sets: South African National ETD Portal
Language: English
Detected Language: English
Type: Dissertation
Rights: © 2018 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
