1

Unanticipated behavior adaptation: application to the debugging of running programs

Costiou, Steven, 28 November 2018
Some programs must run continuously and cannot be interrupted in the event of a malfunction. This is, for example, the case of drones on mission, satellites, and some internet-of-things applications. For such applications, the challenge is to identify and fix problems while the program is still running. Moreover, in the context of object-oriented systems, it may be necessary to observe and instrument the behavior of specific individual objects. In this thesis, we propose a dynamic behavior adaptation solution for debugging individual objects in a running program. This solution is presented as a pattern applicable to dynamically typed object-oriented languages. The pattern makes it possible to implement, in a minimal and generic way, additional dynamic adaptation capabilities at object granularity. An implementation of this pattern for a particular programming language makes it possible to dynamically instrument a program, collecting specific objects and adapting their behavior at run time. We experiment with this pattern through Pharo and Python implementations, building debuggers dedicated to the fixing of running programs for both languages. These tools are evaluated on concrete debugging case studies: a drone simulation, connected applications deployed on remote cyber-physical systems, an online discussion server, and a production defect in a document-generation software system.
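The object-granularity instrumentation described above can be illustrated in plain Python. This is a minimal sketch, not the thesis's actual pattern: the `Drone` class, `telemetry` method, and `log` list are hypothetical names chosen for illustration. The key idea it demonstrates is that a single instance's method can be replaced at run time, so only that one object is instrumented while its class and every other instance keep their original behavior.

```python
import types

class Drone:
    """A hypothetical long-running object we cannot afford to stop."""
    def __init__(self, battery):
        self.battery = battery

    def telemetry(self):
        return {"battery": self.battery}

fleet = [Drone(80), Drone(15), Drone(60)]

# Suppose the second drone misbehaves: instrument only that object,
# leaving the class and all other instances untouched.
target = fleet[1]
original = target.telemetry  # keep the original bound method
log = []

def traced_telemetry(self):
    result = original()   # delegate to the original behavior
    log.append(result)    # collect data from this one object only
    return result

# Bind the adapted behavior to this single instance at run time.
target.telemetry = types.MethodType(traced_telemetry, target)

fleet[0].telemetry()  # unaffected, nothing logged
target.telemetry()    # instrumented, result is logged
```

Restoring the original behavior afterwards is a single `del target.telemetry`, which uncovers the class-level method again; dynamically typed languages like Python and Pharo make this kind of unanticipated, per-object adaptation possible without restarting the program.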
2

Bi-directional coaching through sparse human-robot interactions

Mythra Varun Balakuntala Srinivasa Mur (16377864), 15 June 2023
Robots have become increasingly common in sectors such as manufacturing, healthcare, and service industries. With the growing demand for automation and the expectation of interactive and assistive capabilities, robots must learn to adapt to unpredictable environments as humans can. This necessitates learning methods that effectively enable robots to collaborate with humans, learn from them, and provide guidance. Human experts commonly teach their collaborators to perform tasks via a few demonstrations, often followed by episodes of coaching that refine the trainee's performance during practice. Adopting a similar interaction-based approach to teaching robots is highly intuitive and enables task experts to teach robots directly. Learning from Demonstration (LfD) is a popular method for robots to learn tasks by observing human demonstrations. However, for contact-rich tasks such as cleaning, cutting, or writing, LfD alone is insufficient to achieve good performance. Further, LfD methods are designed to achieve observed goals while ignoring actions that maximize efficiency. By contrast, we recognize that leveraging the human social learning strategies of practice and coaching in conjunction enables learning tasks with improved performance and efficacy. To address the deficiencies of learning from demonstration, we propose a Coaching by Demonstration (CbD) framework that integrates LfD-based practice with sparse coaching interactions from a human expert.

The LfD-based practice in CbD was implemented as an end-to-end off-policy reinforcement learning (RL) agent with the action space and rewards inferred from the demonstration. By modeling the reward as a similarity network trained on expert demonstrations, we eliminate the need for task-specific engineered rewards. Representation learning was leveraged to create a novel state feature that captures the interaction markers necessary for performing contact-rich skills. This LfD-based practice was combined with coaching, where the human expert can improve or correct the objectives through a series of interactions. The dynamics of interaction in coaching are formalized as a partially observable Markov decision process. The robot aims to learn the true objectives by observing corrective feedback from the human expert. We provide an approximate solution by reducing this to a policy parameter update using the KL divergence between the RL policy and a Gaussian approximation based on coaching. The proposed framework was evaluated on a dataset of 10 contact-rich tasks from the assembly (peg insertion), service (cleaning, writing, peeling), and medical (cricothyroidotomy, sonography) domains. Compared to behavioral cloning and reinforcement learning baselines, CbD demonstrates improved performance and efficiency.

During the learning process, the demonstrations and coaching feedback imbue the robot with expert knowledge of the task. To leverage this expertise, we develop a reverse coaching model in which the robot uses knowledge from demonstrations and coaching corrections to provide guided feedback that improves human trainees' performance. Providing feedback adapted to an individual trainee's "style" is vital to coaching. To this end, we propose representing style as objectives in the task null space. Unsupervised clustering of the null-space trajectories using Gaussian mixture models allows the robot to learn different styles of executing the same skill. Given the coaching corrections and a database of style clusters, a style-conditioned RL agent was developed to provide feedback to human trainees by coaching their execution using virtual fixtures. The reverse coaching model was evaluated on two tasks, a simulated incision and obstacle avoidance, through a haptic teleoperation interface. The model improves human trainees' accuracy and completion time compared to a baseline without corrective feedback. Thus, by taking advantage of different human social learning strategies, human-robot collaboration can be realized in human-centric environments.
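The KL-divergence-based policy parameter update mentioned above can be sketched for the simplest case of diagonal Gaussians. The closed-form KL expression is standard; the function names, the fixed interpolation step, and the 2-dimensional toy numbers are illustrative assumptions, not the thesis's actual update rule:

```python
import math

def kl_diag_gaussians(mu0, sig0, mu1, sig1):
    """KL(N0 || N1) for diagonal Gaussians, summed over dimensions."""
    return sum(
        math.log(s1 / s0) + (s0**2 + (m0 - m1) ** 2) / (2 * s1**2) - 0.5
        for m0, s0, m1, s1 in zip(mu0, sig0, mu1, sig1)
    )

def coached_update(policy_mu, coach_mu, step=0.5):
    """Illustrative update: nudge the policy mean toward the mean of the
    Gaussian approximation built from the expert's coaching corrections."""
    return [p + step * (c - p) for p, c in zip(policy_mu, coach_mu)]

# Toy 2-D example: the RL policy vs. a Gaussian fit to coaching feedback.
policy_mu, policy_sig = [0.0, 0.0], [1.0, 1.0]
coach_mu, coach_sig = [1.0, -1.0], [1.0, 1.0]

before = kl_diag_gaussians(policy_mu, policy_sig, coach_mu, coach_sig)
policy_mu = coached_update(policy_mu, coach_mu)
after = kl_diag_gaussians(policy_mu, policy_sig, coach_mu, coach_sig)
# after < before: each update moves the policy closer, in KL terms,
# to the distribution implied by the expert's corrections
```

Repeating such updates drives the divergence toward zero, which is the sense in which sparse coaching interactions can steer an already-practiced policy toward the expert's true objectives.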
