Global ETD Search

1	Teaching robots social autonomy from in situ human supervision Senft, Emmanuel January 2018 (has links) Traditionally the behaviour of social robots has been programmed. However, increasingly there has been a focus on letting robots learn their behaviour to some extent from example or through trial and error. This on the one hand excludes the need for programming, but also allows the robot to adapt to circumstances not foreseen at the time of programming. One such occasion is when the user wants to tailor or fully specify the robot's behaviour. The engineer often has limited knowledge of what the user wants or what the deployment circumstances specifically require. Instead, the user does know what is expected from the robot and consequently, the social robot should be equipped with a mechanism to learn from its user. This work explores how a social robot can learn to interact meaningfully with people in an efficient and safe way by learning from supervision by a human teacher in control of the robot's behaviour. To this end we propose a new machine learning framework called Supervised Progressively Autonomous Robot Competencies (SPARC). SPARC enables non-technical users to control and teach a robot, and we evaluate its effectiveness in Human-Robot Interaction (HRI). The core idea is that the user initially remotely operates the robot, while an algorithm associates actions to states and gradually learns. Over time, the robot takes over the control from the user while still giving the user oversight of the robot's behaviour by ensuring that every action executed by the robot has been actively or passively approved by the user. This is particularly important in HRI, as interacting with people, and especially vulnerable users, is a complex and multidimensional problem, and any errors by the robot may have negative consequences for the people involved in the interaction. 
Through the development and evaluation of SPARC, this work contributes to both HRI and Interactive Machine Learning, especially on how autonomous agents, such as social robots, can learn from people and how this specific teacher-robot interaction impacts the learning process. We showed that a supervised robot learning from its user can reduce that person's workload, and that giving the user the opportunity to control the robot's behaviour substantially improves the teaching process. Finally, this work also demonstrated that a robot supervised by a user could learn rich social behaviours in the real world, in a large multidimensional and multimodal sensitive environment: a robot learned quickly (25 interactions of 4 sessions lasting on average 1.9 minutes) to tutor children in an educational game, achieving behaviours and educational outcomes similar to those of a robot fully controlled by the user, with both providing a 10 to 30% improvement in game metrics compared to a passive robot.
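The supervision loop described in this abstract (the robot proposes an action, the user either lets it pass or overrides it, and the algorithm associates the resulting action with the state) can be sketched as follows. This is a minimal illustration only, not the SPARC implementation: the nearest-neighbour learner, the `teacher` callback, the action names, and the choice to also store passively approved actions are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

State = tuple[float, ...]
# Hypothetical teacher interface: returns None to let the proposal pass
# (passive approval) or an action name to override it (active control).
Teacher = Callable[[State, Optional[str]], Optional[str]]


@dataclass
class NearestNeighbourLearner:
    """Stand-in learner (an assumption): remembers approved (state, action) pairs."""
    memory: list[tuple[State, str]] = field(default_factory=list)

    def suggest(self, state: State) -> Optional[str]:
        """Propose the action stored for the closest remembered state, if any."""
        if not self.memory:
            return None
        nearest = min(
            self.memory,
            key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], state)),
        )
        return nearest[1]

    def learn(self, state: State, action: str) -> None:
        self.memory.append((state, action))


def supervised_step(learner: NearestNeighbourLearner, state: State,
                    teacher: Teacher) -> Optional[str]:
    """One interaction step: propose, let the teacher approve or override, learn."""
    proposal = learner.suggest(state)
    correction = teacher(state, proposal)
    action = correction if correction is not None else proposal
    if action is not None:
        # Only actions the teacher selected or let pass are ever stored.
        learner.learn(state, action)
    return action
```

In this sketch the teacher does all the work at first, because the learner has nothing to propose; as approved pairs accumulate, more proposals pass without intervention, which is the gradual hand-over of control the abstract describes.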
2	Efficient supervision for robot learning via imitation, simulation, and adaptation Wulfmeier, Markus January 2018 (has links) In order to enable more widespread application of robots, we are required to reduce the human effort for the introduction of existing robotic platforms to new environments and tasks. In this thesis, we identify three complementary strategies to address this challenge, via the use of imitation learning, domain adaptation, and transfer learning based on simulations. The overall work strives to reduce the effort of generating training data by employing inexpensively obtainable labels and by transferring information between different domains with deviating underlying properties. Imitation learning enables a straightforward way for untrained personnel to teach robots to perform tasks by providing demonstrations, which represent a comparably inexpensive source of supervision. We develop a scalable approach to identify the preferences underlying demonstration data via the framework of inverse reinforcement learning. The method enables integration of the extracted preferences as cost maps into existing motion planning systems. We further incorporate prior domain knowledge and demonstrate that the approach outperforms the baselines including manually crafted cost functions. In addition to employing low-cost labels from demonstration, we investigate the adaptation of models to domains without available supervisory information. Specifically, the challenge of appearance changes in outdoor robotics such as illumination and weather shifts is addressed using an adversarial domain adaptation approach. A principal advantage of the method over prior work is the straightforwardness of adapting arbitrary, state-of-the-art neural network architectures. Finally, we demonstrate performance benefits of the method for semantic segmentation of drivable terrain. 
Our last contribution focuses on simulation-to-real-world transfer learning, where the characteristic differences concern not only the visual appearance but also the underlying system dynamics. Our work aims at parallel training in both systems and mutual guidance via auxiliary alignment rewards to accelerate training for real-world systems. The approach is shown to outperform various baselines as well as a unilateral alignment variant.
3	A Developmental Grasp Learning Scheme For Humanoid Robots Bozcuoglu, Asil Kaan 01 September 2012 (has links) (PDF) While an infant is learning to grasp, she relies on two key processes for successful development. In the first process, infants use an intuitive approach in which the hand is moved towards the object to create an initial contact regardless of the object's properties. The contact is followed by a tactile grasping phase in which the object is enclosed by the hand. This intuitive grasping behavior leads to a grasping mechanism that utilizes visual input and incorporates it into the grasp plan. The second process is called scaffolding: guidance given by stating how to accomplish the task or by intervening to modify the infant's behavior. Infants pay attention to such guidance and understand the indication of important features of an object from 9 months of age. This supervision mechanism plays an important role in learning how to grasp certain objects properly. To simulate these behavioral findings, a reaching controller and a tactile grasping controller were implemented on the iCub humanoid robot, allowing it to reach an object from different directions and close its fingers around the object. With these, a human-like grasp learning scheme for iCub is proposed. The first step is an unsupervised learning phase in which the robot experiments with how to grasp objects. The second step is a supervised learning phase in which a caregiver modifies the end-effector's position when the robot makes a mistake. Through several experiments with two different grasping styles, we observe that the proposed methodology shows a better learning rate than the scaffolding-only learning mechanism.
4	Robot Planning Based On Learned Affordances Cakmak, Maya 01 August 2007 (has links) (PDF) This thesis studies how an autonomous robot can learn affordances from its interactions with the environment and use these affordances in planning. It is based on a new formalization of the concept which proposes that affordances are relations that pertain to the interactions of an agent with its environment. The robot interacts with environments containing different objects by executing its atomic actions and learns the different effects it can create, as well as the invariants of the environments that afford creating that effect with a certain action. This provides the robot with the ability to predict the consequences of its future interactions and to deliberatively plan action sequences to achieve a goal. The study shows that the concept of affordances provides a common framework for studying reactive control, deliberation and adaptation in autonomous robots. It also provides solutions to the major problems in robot planning, by grounding the planning operators in the low-level interactions of the robot.
5	A Developmental Framework For Learning Affordances Ugur, Emre 01 December 2010 (has links) (PDF) We propose a developmental framework that enables the robot to learn affordances through interaction with the environment in an unsupervised way and to use these affordances at different levels of robot control, ranging from reactive response to planning. Inspired by developmental psychology, the robot's discovery of action possibilities is realized in two sequential phases. In the first phase, the robot that initially possesses a limited number of basic actions and reflexes discovers new behavior primitives by exercising these actions and by monitoring the changes created in its initially crude perception system. In the second phase, the robot explores a more complicated environment by executing the discovered behavior primitives and using more advanced perception to learn further action possibilities. For this purpose, first, the robot discovers commonalities in action-effect experiences by finding effect categories, and then builds predictors for each behavior to map object features and behavior parameters into effect categories. After learning affordances through self-interaction and self-observation, the robot can make plans to achieve desired goals, emulate end states of demonstrated actions, monitor the plan execution and take corrective actions using the perceptual structures employed or discovered during learning. Mobile and manipulator robots were used to realize the proposed framework. Similar to infants, these robots were able to form behavior repertoires, learn affordances, and gain prediction capabilities. The learned affordances were shown to be relative to the robots, provide perceptual economy and encode general relations. Additionally, the affordance-based planning ability was verified in various tasks such as table cleaning and object transportation. QA Computer Software 76.75-76.765
6	ROBOT NAVIGATION IN CROWDED DYNAMIC SCENES Xie, Zhanteng, 0000-0002-5442-1252 08 1900 (has links) Autonomous mobile robots are beginning to provide delivery services in people's lives, such as delivering medicines in hospitals, goods in warehouses, and food in restaurants. To realize this vision, robots need to navigate autonomously and efficiently through complex, crowded, and dynamic environments filled with static obstacles, such as tables and chairs, as well as people and/or other robots, and to achieve this using the computational resources available onboard a mobile robot. This dissertation improves the state of the art in autonomous navigation by developing learning-based algorithms to model the environment around the robot, predict changes in the environment, and control the robot, all of which can run onboard a mobile robot in real time. Specifically, this dissertation first proposes a set of specialized preprocessed data representations to extract and encode useful high-level information about crowded dynamic environments from raw sensor data (i.e., a short history of lidar data, kinematic data about nearby pedestrians, and a sub-goal that leads the robot towards its final destination). Then, using these combined preprocessed data representations, this dissertation proposes a novel crowd-aware navigation control policy that balances collision avoidance and speed in crowded dynamic scenes; the policy is trained with deep reinforcement learning using a velocity obstacle-based reward function. This dissertation then proposes a series of hardware-friendly prediction algorithms, based on variational autoencoder networks, to predict a distribution of possible future states in dynamic scenes by exploiting the kinematics and dynamics of the robot and its surrounding objects. 
Furthermore, this dissertation proposes a novel predictive uncertainty-aware navigation framework to improve the safety performance of existing control policies by incorporating the output of the proposed stochastic environment prediction algorithms into general navigation frameworks. A range of collected real-world datasets, together with a series of 3D simulation experiments and hardware experiments, are used to demonstrate the effectiveness of the proposed learning-based prediction and control algorithms. The new algorithms outperform other state-of-the-art algorithms in terms of collision avoidance, robot speed, and prediction accuracy across a range of environments, crowd densities, and robot models. It is believed that the work included in this dissertation will promote the development of autonomous navigation for modern mobile robots, provide innovative solutions to the open problem of autonomous navigation in crowded dynamic scenes, and make our daily lives more convenient and efficient. / Mechanical Engineering Robotics Collision avoidance Deep learning in robotics and automation Environment prediction Reactive and sensor-based planning Robot navigation
7	Using Learned Affordances For Robotic Behavior Development Dogar, Mehmet Remzi 01 September 2007 (has links) (PDF) "Developmental robotics" proposes that, instead of trying to build a robot that shows intelligence once and for all, what one must do is to build robots that can develop. A robot should go through cognitive development just like an animal baby does. These robots should be equipped with behaviors that are simple but enough to bootstrap the system. Then, as the robot interacts with its environment, it should display increasingly complex behaviors. Studies in developmental psychology and neurophysiology provide support for the view that animals start with innate simple behaviors, and develop more complex behaviors through the differentiation, sequencing, and combination of these primitive behaviors. In this thesis, we propose such a development scheme for a mobile robot. J.J. Gibson's concept of "affordances" provides the basis of this development scheme, and we use a formalization of affordances to make the robot learn about the dynamics of its interactions with its environment. We show that an autonomous robot can start with pre-coded primitive behaviors, and as it executes its behaviors randomly in an environment, it can learn the affordance relations between the environment and its behaviors. We then present two ways of using these learned structures to achieve more complex, voluntary behaviors. In the first case, the robot still uses only its pre-coded primitive behaviors, but they are sequenced such that new, more complex behaviors emerge. In the second case, the robot uses its pre-coded primitive behaviors to create new behaviors.
8	Gradient-based reinforcement learning techniques for underwater robotics behavior learning El-Fakdi Sencianes, Andrés 03 March 2011 (has links) Recently, interest in developing applications with autonomous underwater vehicles (AUVs) has grown considerably. AUVs are attractive because of their size and the fact that they do not need a human operator to pilot them. Even so, in terms of efficiency and flexibility, the skill of a human pilot cannot be compared with the limited operational capabilities that current AUVs offer. Using AUVs to cover large areas involves solving complex problems, especially if the robot is expected to react in real time to sudden changes in working conditions. For these reasons, the development of autonomous control systems aimed at improving these capabilities has become a priority. This thesis addresses the problem of decision-making with AUVs. The work presented focuses on the study, design, and application of behaviors for AUVs using reinforcement learning (RL) techniques. The main contribution of this thesis is the application of several RL techniques to improve the autonomy of underwater robots, with the ultimate goal of demonstrating the feasibility of these algorithms for learning autonomous underwater tasks in real time. In RL, the robot tries to maximize a scalar reward obtained as a consequence of its interaction with the environment. The goal is to find an optimal policy that maps every possible state to the action to execute in that state so as to maximize the total sum of rewards. Accordingly, this thesis mainly investigates two types of RL algorithms: value function (VF) methods and policy gradient (PG) methods. The final experimental results show the underwater robot Ictineu performing a real autonomous underwater cable tracking task. 
To carry it out, an algorithm called the Actor-Critic (AC) method was designed, combining VF methods with PG techniques. / A considerable interest has arisen around Autonomous Underwater Vehicle (AUV) applications. AUVs are very useful because of their size and their independence from human operators. However, comparison with humans in terms of efficiency and flexibility is often unequal. The development of autonomous control systems able to deal with such issues becomes a priority. The use of AUVs for covering large unknown dynamic underwater areas is a very complex problem, mainly when the AUV is required to react in real time to unpredictable changes in the environment. This thesis is concerned with the field of AUVs and the problem of action-decision. The methodology chosen to solve this problem is Reinforcement Learning (RL). The work presented here focuses on the study and development of RL-based behaviors and their application to AUVs in real robotic tasks. The principal contribution of this thesis is the application of different RL techniques for autonomy improvement of an AUV, with the final purpose of demonstrating the feasibility of learning algorithms to help AUVs perform autonomous tasks. In RL, the robot tries to maximize a scalar evaluation obtained as a result of its interaction with the environment, with the aim of finding an optimal policy that maps the state of the environment to an action which in turn will maximize the accumulated future rewards. Thus, this dissertation is based on the principles of RL theory, surveying the two main classes of RL algorithms: Value Function (VF)-based methods and Policy Gradient (PG)-based techniques. A particular class of algorithms, Actor-Critic methods, born of the combination of PG algorithms with VF methods, is used for the final experimental results of this thesis: a real underwater task in which the underwater robot Ictineu AUV learns to perform an autonomous cable tracking task. 
Reinforcement learning Underwater robotics Learning in robotics 621.3 68
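The objective this abstract describes, a policy chosen to maximize accumulated future rewards, and its link to policy gradient and actor-critic methods can be written out as below. The notation is an assumption introduced for illustration and is not taken from the thesis: a parameterized policy \pi_\theta, reward r_t, discount factor \gamma, horizon T, and a critic's value estimate V.

```latex
% Assumed notation: \pi_\theta (policy), r_t (reward), \gamma \in [0,1] (discount),
% T (horizon), V (critic's value estimate). None of these symbols appear in the abstract.

% Accumulated (discounted) reward that the policy should maximize
J(\theta) = \mathbb{E}_{\pi_\theta}\!\left[ \sum_{t=0}^{T} \gamma^{t} r_t \right]

% Policy gradient, with return-to-go G_t = \sum_{k=t}^{T} \gamma^{k-t} r_k
\nabla_\theta J(\theta)
  = \mathbb{E}_{\pi_\theta}\!\left[ \sum_{t=0}^{T} \gamma^{t}\,
      \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, G_t \right]

% Actor-critic variant: the critic's temporal-difference error replaces G_t
\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)
```

Read this way, the value function method supplies V (the critic) and the policy gradient method updates \theta (the actor), which is the combination the abstract attributes to Actor-Critic methods.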
9	Living in a dynamic world : semantic segmentation of large scale 3D environments Miksik, Ondrej January 2017 (has links) As we navigate the world, for example when driving a car from our home to the workplace, we continuously perceive the 3D structure of our surroundings and intuitively recognise the objects we see. Such capabilities help us in our everyday lives and enable free and accurate movement even in completely unfamiliar places. We largely take these abilities for granted, but for robots, the task of understanding large outdoor scenes remains extremely challenging. In this thesis, I develop novel algorithms for (near) real-time dense 3D reconstruction and semantic segmentation of large-scale outdoor scenes from passive cameras. Motivated by "smart glasses" for partially sighted users, I show how such modeling can be integrated into an interactive augmented reality system which puts the user in the loop and allows her to physically interact with the world to learn personalized semantically segmented dense 3D models. In the next part, I show how sparse but very accurate 3D measurements can be incorporated directly into the dense depth estimation process and propose a probabilistic model for incremental dense scene reconstruction. To relax the assumption of a stereo camera, I address dense 3D reconstruction in its monocular form and show how the local model can be improved by joint optimization over depth and pose. The world around us is not stationary. However, reconstructing dynamically moving and potentially non-rigidly deforming texture-less objects typically requires "contour correspondences" for shape-from-silhouettes. Hence, I propose a video segmentation model which encodes a single object instance as a closed curve, maintains correspondences across time and provides very accurate segmentation close to object boundaries. 
Finally, instead of evaluating the performance in an isolated setup (IoU scores) which does not measure the impact on decision-making, I show how semantic 3D reconstruction can be incorporated into standard Deep Q-learning to improve decision-making of agents navigating complex 3D environments.
10	Teaching mobile robots to use spatial words Dobnik, Simon January 2009 (has links) The meaning of spatial words can only be evaluated by establishing a reference to the properties of the environment in which the word is used. For example, in order to evaluate what is to the left of something or how fast is fast in a given context, we need to evaluate properties such as the position of objects in the scene, their typical function and behaviour, the size of the scene and the perspective from which the scene is viewed. Rather than encoding the semantic rules that define spatial expressions by hand, we developed a system where such rules are learned from descriptions produced by human commentators and information that a mobile robot has about itself and its environment. We concentrate on two scenarios and words that are used in them. In the first scenario, the robot is moving in an enclosed space and the descriptions refer to its motion ('You're going forward slowly' and 'Now you're turning right'). In the second scenario, the robot is static in an enclosed space which contains real-size objects such as desks, chairs and walls. Here we are primarily interested in prepositional phrases that describe relationships between objects ('The chair is to the left of you' and 'The table is further away than the chair'). The perspective can be varied by changing the location of the robot. Following the learning stage, which is performed offline, the system is able to use this domain specific knowledge to generate new descriptions in new environments or to 'understand' these expressions by providing feedback to the user, either linguistically or by performing motion actions. If a robot can be taught to 'understand' and use such expressions in a manner that would seem natural to a human observer, then we can be reasonably sure that we have captured at least something important about their semantics. Two kinds of evaluation were performed. 
First, the performance of machine learning classifiers was evaluated on independent test sets using 10-fold cross-validation. A comparison of classifier performance (in regard to their accuracy, the Kappa coefficient (κ), ROC and Precision-Recall graphs) is made between (a) the machine learning algorithms used to build them, (b) conditions under which the learning datasets were created and (c) the method by which data was structured into examples or instances for learning. Second, with some additional knowledge required to build a simple dialogue interface, the classifiers were tested live against human evaluators in a new environment. The results show that the system is able to learn semantics of spatial expressions from low level robotic data. For example, a group of human evaluators judged that the live system generated a correct description of motion in 93.47% of cases (the figure is averaged over four categories) and that it generated the correct description of object relation in 59.28% of cases. 006.3
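The Kappa coefficient (κ) named in this abstract corrects raw accuracy for the agreement expected by chance. A minimal sketch of Cohen's kappa for two label sequences follows; the function name and the example labels in the usage note are hypothetical, and the thesis may have computed the statistic with other tooling.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of items on which the two labelings agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each labeling's class frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Degenerate case: both labelings use one identical class throughout.
        # Kappa is undefined here; returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

For instance, with reference labels `["left", "left", "right", "right"]` and predictions `["left", "right", "right", "right"]`, accuracy is 0.75 but chance agreement is 0.5, so kappa is 0.5.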
