1 |
Spatial and Temporal Learning in Robotic Pick-and-Place Domains via Demonstrations and Observations
Toris, Russell C (20 April 2016)
Traditional methods for Learning from Demonstration require users to train the robot through the entire process, or to provide feedback throughout a given task. These previous methods have proved to be successful in a selection of robotic domains; however, many are limited by the ability of the user to effectively demonstrate the task. In many cases, noisy demonstrations or a failure to understand the underlying model prevent these methods from working with a wider range of non-expert users. My insight is that in many mobile pick-and-place domains, teaching is done at too fine-grained a level. In many such tasks, users are solely concerned with the end goal. This implies that the complexity and time associated with training and teaching robots through the entirety of the task are unnecessary. The robotic agent needs to know (1) a probable search location to retrieve the task's objects and (2) how to arrange the items to complete the task. This thesis work develops new techniques for obtaining such data from high-level spatial and temporal observations and demonstrations which can later be applied in new, unseen environments. This thesis makes the following contributions: (1) This work is built on a crowd robotics platform and, as such, we contribute the development of efficient data streaming techniques to further these capabilities. By doing so, users can more easily interact with robots on a number of platforms. (2) The presentation of new algorithms that can learn pick-and-place tasks from a large corpus of goal templates. My work contributes algorithms that produce a metric which ranks the appropriate frame of reference for each item based solely on spatial demonstrations. (3) An algorithm which can enhance the above templates with ordering constraints using coarse and noisy temporal information. Such a method eliminates the need for a user to explicitly specify such constraints and searches for an optimal ordering and placement of items.
(4) A novel algorithm which is able to learn probable search locations of objects based solely on sparsely made temporal observations. For this, we introduce persistence models of objects customized to a user's environment.
|
2 |
Semantically Grounded Learning from Unstructured Demonstrations
Niekum, Scott D. (01 September 2013)
Robots exhibit flexible behavior largely in proportion to their degree of semantic knowledge about the world. Such knowledge is often meticulously hand-coded for a narrow class of tasks, limiting the scope of possible robot competencies. Thus, the primary limiting factor of robot capabilities is often not the physical attributes of the robot, but the limited time and skill of expert programmers. One way to deal with the vast number of situations and environments that robots face outside the laboratory is to provide users with simple methods for programming robots that do not require the skill of an expert.
For this reason, learning from demonstration (LfD) has become a popular alternative to traditional robot programming methods, aiming to provide a natural mechanism for quickly teaching robots. By simply showing a robot how to perform a task, users can easily demonstrate new tasks as needed, without any special knowledge about the robot. Unfortunately, LfD often yields little semantic knowledge about the world, and thus lacks robust generalization capabilities, especially for complex, multi-step tasks.
To address this shortcoming of LfD, we present a series of algorithms that draw from recent advances in Bayesian nonparametric statistics and control theory to automatically detect and leverage repeated structure at multiple levels of abstraction in demonstration data. The discovery of repeated structure provides critical insights into task invariants, features of importance, high-level task structure, and appropriate skills for the task. This culminates in the discovery of semantically meaningful skills that are flexible and reusable, providing robust generalization and transfer in complex, multi-step robotic tasks. These algorithms are tested and evaluated using a PR2 mobile manipulator, showing success on several complex real-world tasks, such as furniture assembly.
|
3 |
Robots learning actions and goals from everyday people
Akgun, Baris (07 January 2016)
Robots are destined to move beyond the caged factory floors towards domains where they will be interacting closely with humans. They will encounter highly varied environments, scenarios and user demands. As a result, programming robots after deployment will be an important requirement. To address this challenge, the field of Learning from Demonstration (LfD) emerged with the vision of programming robots through demonstrations of the desired behavior instead of explicit programming. The field of LfD within robotics has been around for more than 30 years and is still an actively researched field. However, very little research is done on the implications of having a non-robotics expert as a teacher. This thesis aims to bridge this gap by developing learning from demonstration algorithms and interaction paradigms that allow non-expert people to teach robots new skills.
The first step of the thesis was to evaluate how non-expert teachers provide demonstrations to robots. Keyframe demonstrations are introduced to the field of LfD to help people teach skills to robots and compared with the traditional trajectory demonstrations. The utility of keyframes is validated by a series of experiments with more than 80 participants. Based on the experiments, a hybrid of trajectory and keyframe demonstrations is proposed to take advantage of both, and a method was developed to learn from trajectories, keyframes and hybrid demonstrations in a unified way.
A key insight from these user experiments was that teachers are goal oriented. They concentrated on achieving the goal of the demonstrated skills rather than providing good quality demonstrations. Based on this observation, this thesis introduces a method that can learn actions and goals from the same set of demonstrations. The action models are used to execute the skill and the goal models to monitor this execution. A user study with eight participants and two skills showed that successful goal models can be learned from non-expert teacher data even if the resulting action models are not as successful. Following these results, this thesis further develops a self-improvement algorithm that uses the goal monitoring output to improve the action models, without further user input. This approach is validated with an expert user and two skills. Finally, this thesis builds an interactive LfD system that incorporates both goal learning and self-improvement and evaluates it with 12 naive users and three skills. The results suggest that teacher feedback during experiments increases skill execution and monitoring success. Moreover, non-expert data can be used as a seed for self-improvement to fix unsuccessful action models.
|
4 |
Learning From Demonstrations in Changing Environments: Learning Cost Functions and Constraints for Motion Planning
Gritsenko, Artem (08 September 2015)
We address the problem of performing complex tasks for a robot operating in changing environments. We propose two approaches to this problem: 1) define task-specific cost functions for motion planning that represent path quality by learning from an expert's preferences and 2) use a constraint-based representation of the task inside the learning from demonstration paradigm. In the first approach, we generate a set of paths for a given task using a motion planner and collect data about their features (path length, distance from obstacles, etc.). We provide these paths to an expert as a set of pairwise comparisons. We then form a ranking of the paths from the expert's comparisons. This ranking is used as training data for learning algorithms, which attempt to produce a cost function that maps path feature values to a cost that is consistent with the expert's ranking. We test our method on two simulated car-maintenance tasks with the PR2 robot: removing a tire and extracting an oil filter. We found that learning methods which produce non-linear combinations of the features are better able to capture expert preferences for the tasks than methods which produce linear combinations. This result suggests that the linear combinations used in previous work on this topic may be too simple to capture the preferences of experts for complex tasks. In the second approach, we propose to introduce a constraint-based description of the task that can be used together with the motion planner to produce the trajectories. The description is automatically created from the demonstration by performing segmentation and extracting constraints from the motion. The constraints are represented with Task Space Regions (TSRs) that are extracted from the demonstration and used to produce a desired motion. To account for the parts of the motion where constraints are different, a segmentation of the demonstrated motion is performed using TSRs.
The proposed approach allows a robot to perform tasks from human demonstration in changing environments, where the obstacle distribution or the poses of the objects could change between demonstration and execution. An experimental evaluation on two example motions was performed to estimate the ability of our approach to produce the desired motion and recover a demonstrated trajectory.
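The first approach in this abstract (learning a cost function from an expert's pairwise path comparisons) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the function names, the two path features, the toy preference data, and the choice of a linear cost fitted with a logistic pairwise loss. The abstract reports that non-linear combinations of features captured expert preferences better, so this linear version is only the simplest baseline, not the thesis's method.

```python
import math

def learn_cost_weights(features, comparisons, lr=0.1, epochs=2000):
    """Fit linear cost weights so that preferred paths receive lower cost.

    features:    one feature vector per path (hypothetical features, e.g.
                 path length and a clearance penalty).
    comparisons: (better, worse) index pairs given by the expert.
    """
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        for better, worse in comparisons:
            # We want cost(worse) - cost(better) > 0 for every comparison.
            diff = [fw - fb for fw, fb in zip(features[worse], features[better])]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Derivative of the logistic loss log(1 + exp(-margin)) w.r.t. margin.
            g = -1.0 / (1.0 + math.exp(margin))
            w = [wi - lr * g * di for wi, di in zip(w, diff)]
    return w

def cost(w, f):
    """Linear cost: weighted sum of path features."""
    return sum(wi * fi for wi, fi in zip(w, f))

# Toy data (made up): four paths, each described by two features.
paths = [[1.0, 0.2], [2.0, 0.1], [1.5, 0.9], [3.0, 0.8]]
# Expert comparisons as (preferred path, less preferred path).
prefs = [(0, 1), (1, 2), (2, 3), (0, 3)]

weights = learn_cost_weights(paths, prefs)
ranking = sorted(range(len(paths)), key=lambda i: cost(weights, paths[i]))
```

A motion planner could then score candidate paths with `cost(weights, features_of_path)`; replacing the linear form with a non-linear model would follow the direction the abstract found more effective.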
|
5 |
Collaborative Learning of Hierarchical Task Networks from Demonstration and Instruction
Mohseni-Kabir, Anahita (10 September 2015)
This thesis presents learning and interaction algorithms to support a human teaching hierarchical task models to a robot using one or more examples in the context of a mixed-initiative interaction with bi-directional communication. Our first contribution is an approach for learning a high-level task from a single example using the bottom-up style. In particular, we have identified and implemented two important heuristics for suggesting task groupings and repetitions based on the data flow between tasks and on the physical structure of the manipulated artifact. We have evaluated our heuristics with users in a simulated environment and shown that the suggestions significantly improve the learning and interaction. For our second contribution, we extended this interaction by enabling users to teach tasks using the top-down teaching style in addition to the bottom-up teaching style. Results obtained in a pilot study show that users utilize both the bottom-up and the top-down teaching styles to teach tasks. Our third contribution is an algorithm that merges multiple examples when there are alternative ways of doing a task. The merging algorithm is still under evaluation.
|
6 |
Cognition Rehearsed: Recognition and Reproduction of Demonstrated Behavior / Robotövningar: Igenkänning och återgivande av demonstrerat beteende
Billing, Erik (January 2012)
The work presented in this dissertation investigates techniques for robot Learning from Demonstration (LFD). LFD is a well established approach where the robot is to learn from a set of demonstrations. The dissertation focuses on LFD where a human teacher demonstrates a behavior by controlling the robot via teleoperation. After demonstration, the robot should be able to reproduce the demonstrated behavior under varying conditions. In particular, the dissertation investigates techniques where previous behavioral knowledge is used as bias for generalization of demonstrations. The primary contribution of this work is the development and evaluation of a semi-reactive approach to LFD called Predictive Sequence Learning (PSL). PSL has many interesting properties applied as a learning algorithm for robots. Few assumptions are introduced and little task-specific configuration is needed. PSL can be seen as a variable-order Markov model that progressively builds up the ability to predict or simulate future sensory-motor events, given a history of past events. The knowledge base generated during learning can be used to control the robot, such that the demonstrated behavior is reproduced. The same knowledge base can also be used to recognize an on-going behavior by comparing predicted sensor states with actual observations. Behavior recognition is an important part of LFD, both as a way to communicate with the human user and as a technique that allows the robot to use previous knowledge as parts of new, more complex, controllers. In addition to the work on PSL, this dissertation provides a broad discussion on representation, recognition, and learning of robot behavior. LFD-related concepts such as demonstration, repetition, goal, and behavior are defined and analyzed, with focus on how bias is introduced by the use of behavior primitives. This analysis results in a formalism where LFD is described as transitions between information spaces. 
Assuming that the behavior recognition problem is partly solved, ways to deal with remaining ambiguities in the interpretation of a demonstration are proposed. The evaluation of PSL shows that the algorithm can efficiently learn and reproduce simple behaviors. The algorithm is able to generalize to previously unseen situations while maintaining the reactive properties of the system. As the complexity of the demonstrated behavior increases, knowledge of one part of the behavior sometimes interferes with knowledge of other parts. As a result, different situations with similar sensory-motor interactions are sometimes confused and the robot fails to reproduce the behavior. One way to handle these issues is to introduce a context layer that can support PSL by providing bias for predictions. Parts of the knowledge base that appear to fit the present context are highlighted, while other parts are inhibited. Which context should be active is continually re-evaluated using behavior recognition. This technique takes inspiration from several neurocomputational models that describe parts of the human brain as a hierarchical prediction system. With behavior recognition active, continually selecting the most suitable context for the present situation, the problem of knowledge interference is significantly reduced and the robot can also successfully reproduce more complex behaviors.
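The description of PSL as a variable-order Markov model that predicts the next sensory-motor event from a history of past events can be illustrated with a small sketch. This is not the PSL algorithm itself: the class name, the fixed maximum order, the count-based storage, and the toy event symbols are all assumptions chosen to show the general idea of preferring the longest previously seen context.

```python
from collections import Counter, defaultdict

class VariableOrderPredictor:
    """Toy variable-order Markov predictor (illustrative, not PSL itself)."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        # Maps a context (tuple of recent events) to counts of the next event.
        self.counts = defaultdict(Counter)

    def train(self, sequence):
        """Record, for every position, which event followed each context."""
        for i in range(1, len(sequence)):
            for order in range(1, self.max_order + 1):
                if i - order < 0:
                    break
                context = tuple(sequence[i - order:i])
                self.counts[context][sequence[i]] += 1

    def predict(self, history):
        """Predict the next event using the longest context seen in training."""
        for order in range(min(self.max_order, len(history)), 0, -1):
            context = tuple(history[-order:])
            if context in self.counts:
                return self.counts[context].most_common(1)[0][0]
        return None  # No matching context at any order.

# Made-up demonstration: after "B" alone the next event is ambiguous,
# but the longer contexts ("A", "B") and ("X", "B") disambiguate it.
model = VariableOrderPredictor(max_order=3)
model.train(["A", "B", "C", "X", "B", "D"])

after_ab = model.predict(["A", "B"])
after_xb = model.predict(["X", "B"])
unknown = model.predict(["Z"])
```

In this toy setting, comparing `predict(history)` against the event actually observed next gives a simple match signal, which is one way to picture how the same knowledge base could serve both control and behavior recognition.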
|
7 |
A Mixed-Response Intelligent Tutoring System Based on Learning from Demonstration
Alvarez Xochihua, Omar (May 2012)
Intelligent Tutoring Systems (ITS) have a significant educational impact on students' learning. However, researchers report that time-intensive interaction is needed between ITS developers and domain experts to gather and represent domain knowledge. The challenge is augmented when the target domain is ill-defined. The primary problem resides in often using traditional approaches for gathering domain and tutoring experts' knowledge at design time and conventional methods for knowledge representation built for well-defined domains. Similar to evolving knowledge acquisition approaches used in other fields, we replace this restricted view of ITS knowledge learning merely at design time with an incremental approach that continues training the ITS during run time. We investigate a gradual knowledge learning approach through continuous instructor-student demonstrations. We present a Mixed-response Intelligent Tutoring System based on Learning from Demonstration that gathers and represents knowledge at run time. Furthermore, we implement two knowledge representation methods (Weighted Markov Models and Weighted Context Free Grammars) and corresponding algorithms for building domain and tutoring knowledge-bases at run time.
We use students' solutions to cybersecurity exercises as the primary data source for our initial framework testing. Five experiments were conducted using various granularity levels for data representation, multiple datasets differing in content and size, and multiple experts to evaluate framework performance. Using our WCFG-based knowledge representation method in conjunction with a finer data representation granularity level, the implemented framework reached 97% effectiveness in providing correct feedback. The ITS demonstrated consistency when applied to multiple datasets and experts. Furthermore, on average, only 1.4 hours were needed by instructors to build the knowledge-base and required tutorial actions per exercise. Finally, the ITS framework showed suitable and consistent performance when applied to a second domain.
These results imply that ITS domain models for ill-defined domains can be gradually constructed, yet generate successful results with minimal effort from instructors and framework developers. We demonstrate that, in addition to providing an effective tutoring performance, an ITS framework can offer: scalability in data magnitude, efficiency in reducing human effort required for building a confident knowledge-base, metacognition in inferring its current knowledge, robustness in handling different pedagogical and tutoring criteria, and portability for multiple domain use.
|
8 |
Action Recognition Through Action Generation
Akgun, Baris (01 August 2010)
This thesis investigates how a robot can use action generation mechanisms to recognize the action of an observed actor in an on-line manner, i.e., before the completion of the action. Towards this end, Dynamic Movement Primitives (DMP), an action generation method proposed for imitation, are modified to recognize the actions of an actor. Specifically, a human actor performed three different reaching actions to two different objects. Three DMPs, each corresponding to a different reaching action, were trained using this data. The proposed method used an object-centered coordinate system to define the variables for the action, eliminating the difference between the actor and the robot. During testing, the robot simulated action trajectories by its learned DMPs and compared the resulting trajectories against the observed one. The error between the simulated and the observed trajectories was integrated into a recognition signal, over which recognition was done. The proposed method was applied on the iCub humanoid robot platform using an active motion capture device for sensing. The results showed that the system was able to recognize actions with high accuracy as they unfold in time. Moreover, the feasibility of the approach is demonstrated in an interactive game between the robot and a human.
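The recognition step described above (compare simulated trajectories with the observed one and integrate the error into a recognition signal) can be sketched as follows. This is a rough illustration under stated assumptions: fixed one-dimensional lists stand in for trajectories that learned DMPs would generate, and the squared-error accumulation and the exponential normalization are choices made here, not details taken from the thesis. All names and numbers are hypothetical.

```python
import math

def recognition_signals(observed, candidates):
    """Return, for each time step, a normalized score per candidate action.

    observed:   partial observed trajectory (list of floats).
    candidates: dict mapping action name to a simulated trajectory that is
                at least as long as the observation.
    """
    cumulative = {name: 0.0 for name in candidates}
    history = []
    for t, obs in enumerate(observed):
        # Integrate (accumulate) the squared error of each candidate over time.
        for name, trajectory in candidates.items():
            cumulative[name] += (obs - trajectory[t]) ** 2
        # Turn accumulated errors into scores that sum to one.
        scores = {name: math.exp(-err) for name, err in cumulative.items()}
        total = sum(scores.values())
        history.append({name: s / total for name, s in scores.items()})
    return history

# Made-up simulated trajectories, standing in for rollouts of three DMPs.
simulated = {
    "reach_left": [0.0, -0.2, -0.5, -0.9],
    "reach_center": [0.0, 0.0, 0.0, 0.0],
    "reach_right": [0.0, 0.2, 0.5, 0.9],
}
# The observation stops before the action is complete.
partial_observation = [0.0, 0.18, 0.45]

signals = recognition_signals(partial_observation, simulated)
recognized = max(signals[-1], key=signals[-1].get)
```

Because the scores are updated at every time step, a decision can be read off before the observed action finishes, which mirrors the on-line recognition goal stated in the abstract.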
|
9 |
Guided teaching interactions with robots: embodied queries and teaching heuristics
Cakmak, Maya (17 May 2012)
The vision of personal robot assistants continues to become more realistic with technological advances in robotics. The increase in the capabilities of robots presents boundless opportunities for them to perform useful tasks for humans.
However, it is not feasible for engineers to program robots for all possible uses. Instead, we envision general-purpose robots that can be programmed by their end-users.
Learning from Demonstration (LfD) is an approach that allows users to program new capabilities on a robot by demonstrating what is required from the robot. Although LfD has become an established area of Robotics, many challenges remain in making it effective and intuitive for naive users. This thesis contributes to addressing these challenges in several ways. First, the problems that occur in teaching-learning interactions between humans and robots are characterized through human-subject experiments in three different domains. To address these problems, two mechanisms for guiding human teachers in their interactions are developed: embodied queries and teaching heuristics.
Embodied queries, inspired from Active Learning queries, are questions asked by the robot so as to steer the teacher towards providing more informative demonstrations. They leverage the robot's embodiment to physically manipulate the environment and to communicate the question. Two technical contributions are made in developing embodied queries. The first is Active Keyframe-based LfD -- a framework for learning human-segmented skills in continuous action spaces and producing four different types of embodied queries to improve learned skills. The second is Intermittently-Active Learning in which a learner makes queries selectively, so as to create balanced interactions with the benefits of fully-active learning. Empirical findings from five experiments with human subjects are presented. These identify interaction-related issues in generating embodied queries, characterize human question asking, and evaluate implementations of Intermittently-Active Learning and Active Keyframe-based LfD on the humanoid robot Simon.
The second mechanism, teaching heuristics, is a set of instructions given to human teachers in order to elicit more informative demonstrations from them. Such instructions are devised based on an understanding of what constitutes an optimal teacher for a given learner, with techniques grounded in Algorithmic Teaching. The utility of teaching heuristics is empirically demonstrated through six human-subject experiments that involve teaching different concepts or tasks to a virtual agent, or teaching skills to Simon.
With a diverse set of human subject experiments, this thesis demonstrates the necessity for guiding humans in teaching interactions with robots, and verifies the utility of two proposed mechanisms in improving sample efficiency and final performance, while enhancing the user interaction.
|
10 |
Human skill capturing and modelling using wearable devices
Zhao, Yuchen (January 2017)
Industrial robots are delivering more and more manipulation services in manufacturing. However, when the task is complex, it is difficult to programme a robot to fulfil all the requirements because even a relatively simple task such as a peg-in-hole insertion contains many uncertainties, e.g. clearance, initial grasping position and insertion path. Humans, on the other hand, can deal with these variations using their vision and haptic feedback. Although humans can adapt to uncertainties easily, most of the time, the skill-based performances that relate to their tacit knowledge cannot be easily articulated. Even though the automation solution may not fully imitate human motions, since some of them are not necessary, it would be useful if the skill-based performance of a human could first be interpreted and modelled, which would then allow it to be transferred to the robot. This thesis aims to reduce robot programming efforts significantly by developing a methodology to capture, model and transfer the manual manufacturing skills from a human demonstrator to the robot. Recently, Learning from Demonstration (LfD) has been gaining interest as a framework to transfer skills from a human teacher to a robot using probability encoding approaches to model observations and state transition uncertainties. In close or actual contact manipulation tasks, it is difficult to reliably record the state-action examples without interfering with the human senses and activities. Therefore, wearable sensors are investigated as a promising device to record the state-action examples without restricting the human experts during the skilled execution of their tasks. Firstly, to track human motions accurately and reliably in a defined 3-dimensional workspace, a hybrid system of Vicon and IMUs is proposed to compensate for the known limitations of the individual systems.
The data fusion method was able to overcome occlusion and frame flipping problems in the two-camera Vicon setup and the drifting problem associated with the IMUs. The results indicated that occlusion and frame flipping problems associated with Vicon can be mitigated by using the IMU measurements. Furthermore, the proposed method improves the Mean Square Error (MSE) tracking accuracy range from 0.8° to 6.4° compared with the IMU-only method. Secondly, to record haptic feedback from a teacher without physically obstructing their interactions with the workpiece, wearable surface electromyography (sEMG) armbands were used as an indirect method to indicate contact feedback during manual manipulations. A muscle-force model using a Time Delayed Neural Network (TDNN) was built to map the sEMG signals to the known contact force. The results indicated that the model was capable of estimating the force from the sEMG armbands in the applications of interest, namely in peg-in-hole and beater winding tasks, with MSE of 2.75 N and 0.18 N respectively. Finally, given the force estimation and the motion trajectories, a Hidden Markov Model (HMM) based approach was utilised as a state recognition method to encode and generalise the spatial and temporal information of the skilled executions. This method would allow a more representative control policy to be derived. A modified Gaussian Mixture Regression (GMR) method was then applied to enable motion reproduction by using the learned state-action policy. To simplify the validation procedure, instead of using the robot, additional demonstrations from the teacher were used to verify the reproduction performance of the policy, by assuming that the human teacher and the robot learner are physically identical systems. The results confirmed the generalisation capability of the HMM model across a number of demonstrations from different subjects; and the reproduced motions from GMR were acceptable in these additional tests.
The proposed methodology provides a framework for producing a state-action model from skilled demonstrations that can be translated into robot kinematics and joint states for the robot to execute. The implication for industry is reduced effort and time in programming robots for applications where skilled human performance is required to cope robustly with various uncertainties during task execution.
|