  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

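Fatores organizacionais que influenciam a aprendizagem a partir dos erros e sua relação com os comportamentos inovadores no trabalho em uma empresa do segmento farmacêutico / Organizational factors that influence learning from errors and their relationship with innovative work behaviors in a company in the pharmaceutical segment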

Barbarini, Antonio César 04 March 2015
The challenges of the business environment in the 2010s are becoming increasingly complex, with fast-paced changes and different kinds of pressure from competition, customers, regulatory agencies, unions, the economy, etc. In this context, organizations need to be more responsive to change and learn quickly in the face of new demands. In the 1980s and 1990s, the quality of products and services was what differentiated companies in the market, which led organizations to seek models and systems to ensure compliance with the high standards of quality required in different markets. In addition, companies raised their standards of operational efficiency, productivity and reliability, but also ended up becoming more similar and only slightly differentiated among themselves. Organizations now face the challenge of adopting more flexible systems, standards and structures in order to facilitate adaptation to the current context, which is extremely dynamic. In this scenario, innovation can help organizations become more competitive, as long as they can continually learn by taking advantage of informal learning in the workplace, which represents more than 80% of the total number of learning opportunities. Considering the current levels of complexity, dynamism and uncertainty present in the work environment, errors or failures are by-products of organizational processes and are not necessarily bad; it is important that organizations can quickly learn from mistakes, through people. Some factors may influence the creation of an environment where people can learn from mistakes, experiment and apply ideas, by adopting innovative work behaviors.
This quantitative study examines the relationship between the factors that influence learning from errors and innovative work behaviors in a multinational organization in the pharmaceutical segment. A survey of employees from different areas yielded 146 valid responses, which were analyzed using confirmatory factor analysis with the SmartPLS 2.0 M3 software. The hypothesis that the Factors that Influence Learning from Mistakes have a positive relationship with Innovative Work Behaviors was confirmed. The structural coefficient obtained between the two dimensions was 0.618 (p < 0.001), which means that the construct Factors that Influence Learning from Mistakes explains 38% of the variation in the indices of the Innovative Work Behaviors construct. Companies' ability to learn quickly from mistakes and experiments, taking advantage of the potential of their existing human capital, particularly through innovative work behaviors, can be a critical aspect of differentiation in the marketplace and a way to obtain competitive advantage. This study aims to contribute to the expansion of knowledge on the subject, highlighting the importance of properly managing the factors that influence learning from mistakes in the work environment. In addition, it contributes to the validation and adaptation of the original scales of the constructs studied to the Brazilian context.
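As a quick check on the figures reported in this abstract: assuming the structural model has a single predictor construct (which matches the two-construct design described, although the abstract does not state it explicitly), the share of variance explained is simply the square of the standardized structural coefficient.

```latex
R^2 = \beta^2 = (0.618)^2 \approx 0.382 \approx 38\%
```

This agrees with the reported 38% of variation explained.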
82

Apprentissage Intelligent des Robots Mobiles dans la Navigation Autonome / Intelligent Mobile Robot Learning in Autonomous Navigation

Xia, Chen 24 November 2015
Modern robots are designed to assist or replace human beings in performing complicated planning and control operations, and the capability of autonomous navigation in a dynamic environment is an essential requirement for mobile robots. In order to alleviate the tedious task of manually programming a robot, this dissertation contributes to the design of intelligent robot control that endows mobile robots with a learning ability in autonomous navigation tasks. First, we consider robot learning from expert demonstrations. A neural network framework is proposed as the inference mechanism to learn a policy offline from a dataset extracted from experts. Next, we turn to the robot's ability to learn by itself, without expert demonstrations. We apply reinforcement learning techniques to acquire and optimize a control strategy during the interaction between the learning robot and the unknown environment. A neural network is also incorporated to allow fast generalization, which helps learning converge in far fewer episodes than traditional methods require. Finally, we study robot learning of the potential rewards underlying the states from optimal or suboptimal expert demonstrations. We propose an algorithm based on inverse reinforcement learning. A nonlinear policy representation is designed and the max-margin method is applied to refine the rewards and generate an optimal control policy. The three proposed methods have been successfully implemented on autonomous navigation tasks for mobile robots in unknown and dynamic environments.
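To make the reinforcement learning step in this abstract concrete, here is a minimal sketch of tabular Q-learning on a toy grid. It is an illustration only: the dissertation pairs reinforcement learning with a neural network, whereas this sketch uses a lookup table, and the grid world, reward values, hyperparameters, and function names (`step`, `train_q_learning`, `greedy_path`) are all assumptions made for the example.

```python
import random

ACTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up


def step(state, action, size):
    """Deterministic grid move; bumping into a wall leaves the state unchanged."""
    dr, dc = ACTIONS[action]
    row = min(max(state[0] + dr, 0), size - 1)
    col = min(max(state[1] + dc, 0), size - 1)
    nxt = (row, col)
    done = nxt == (size - 1, size - 1)  # goal is the bottom-right cell
    reward = 1.0 if done else -0.01     # hypothetical rewards: step cost, goal bonus
    return nxt, reward, done


def train_q_learning(size=4, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table by epsilon-greedy interaction with the toy grid."""
    rng = random.Random(seed)
    q = {((r, c), a): 0.0
         for r in range(size) for c in range(size) for a in range(len(ACTIONS))}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))  # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action, size)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in range(len(ACTIONS)))
            # Q-learning update: move the estimate toward the bootstrapped target.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, size=4, max_steps=50):
    """Roll out the greedy policy from the start cell."""
    state, path = (0, 0), [(0, 0)]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q[(state, a)])
        state, _, done = step(state, action, size)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train_q_learning()))
```

Replacing the table with a function approximator is what lets a learner generalize across states it has not visited, which is the role the abstract assigns to the neural network.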
83

信用卡信用風險預警範例學習系統之研究 / Predicting Credit Card Risks Using Learning From Examples

馬芳資, Ma, Fang-tsz Unknown Date
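In recent years the credit card market has grown rapidly and issuing banks have issued cards in large numbers, yet most domestic issuing banks still manage credit card credit risk by relying on the experience-based judgment of human experts. As the number of cardholders rises rapidly, the volume of credit data grows in proportion; if processing remains manual, the workload will increase sharply and credit-granting quality will be hard to control. This study therefore proposes to introduce information technology to address the credit management problem posed by large volumes of credit card credit data. First, we examine the credit card credit management business and, following its workflow, construct an architecture for automating credit card credit management. The architecture comprises five systems: a credit investigation and verification system, an approval system, an early-warning system, a high-risk customer management system, and a collection system. Its purpose is to support credit management operations, reduce the workload of credit management staff, control credit quality effectively, and lower credit risk. Second, focusing on the early-warning system within this architecture, we use learning from examples to build a credit card credit risk early-warning system, and we use the credit data of an actual issuing bank to build and validate four early-warning models, so that the system can automatically identify customers with poor credit in advance. The four models are: (1) an advance warning model, (2) a group decision warning model, (3) a tracking management warning model, and (4) an exception management warning model. Finally, we propose several topics for future research, with the aim of further developing the automated credit management system and the early-warning models and extending their application to other card issuers.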
84

信用卡信用風險審核範例學習系統之研究 / Assessing Credit Card Risks Using Learning From Examples

許愛惠, Ai-Huey Shu Unknown Date
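As paying by credit card has become commonplace among domestic consumers, the number of applications that card issuers must process each day has surged; at the same time, the growing diversification of credit card services has made approval more complex. Traditional approval methods based mainly on human judgment will, with limited staff, struggle to cope with such a large approval demand; under time pressure and with insufficient accumulated experience, credit quality inevitably suffers, raising the risk of this lending business. In view of this, this study uses learning from examples to build a credit card credit risk approval model, intended to support card-issuing approval effectively, lower credit risk, and improve issuers' business performance. The study takes one card issuer as its subject. Applications approved by the case institution throughout year 81 (1992 in the Republic of China calendar) form the research sample. As of March of year 82 (1993), 2,788 were bad accounts whose cards had been forcibly suspended and 97,001 were normal accounts still in circulation, for a total of 99,789 records used for system learning and testing. We first examine strategies for applying learning from examples given the characteristics of the credit card approval problem. We convert qualitative cues to an ordinal scale using the concept of relative risk, which reduces the leaf nodes (decision rules) of the resulting binary tree by about one half, with no significant difference in predictive ability from the original classification tree. Next, applying a pruning strategy reduces the number of decision rules from 230 to 26, greatly improving execution efficiency; although pruning lowers the discrimination rate, it raises predictive ability (hit rate) from 67.1% to 72.58%. The results thus show that avoiding overly fine classification improves the system's predictive ability. The predictive ability of the learning-from-examples approval model reaches 72.58%, about 6.49% higher than that of a logistic regression approval model. The two are largely consistent in the cues they select as important; the results show that the number of regular cards already held, the number of gold cards, education level, company grade, and job rank are the credit factors with the best discriminating power. The strongest discriminating factor is the number of regular cards already held: the more cards, the higher the credit risk. This factor was overlooked by the original approval model and deserves the attention of credit officers. In addition, we incorporate cost-benefit factors into the classification tree's decision rules, which allows the credit-granting threshold to be adjusted to increase the issuer's profit. Further taking into account the applicant's credit risk and income, we establish a method for setting credit limits. The results show that granting credit limits with this model yields about 12 million yuan (元) more in net income than the institution's original method. This is because the model effectively identifies bad customers (hit rate of 80.58%), thereby greatly reducing bad-debt losses. Finally, drawing together the lessons of this study, we propose topics for future research aimed at generating the optimal classification tree more efficiently and broadening the scope of research, in the hope of extending the credit card learning-from-examples system to other issuers and applying it to all aspects of credit management to improve the profitability of credit card operations.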
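The abstract above describes converting qualitative cues to an ordinal scale by relative risk but does not give the procedure. The following is one plausible reading, offered as a sketch only: each category's bad-account rate is divided by the overall bad-account rate, and categories are ranked by that ratio. The function name, the category labels, and the counts are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def ordinal_codes_by_relative_risk(records):
    """Map each category to (rank, relative_risk); rank 1 is the lowest risk.

    records: iterable of (category, is_bad) pairs.
    Relative risk here means category bad rate / overall bad rate (an assumption).
    """
    totals = defaultdict(int)
    bads = defaultdict(int)
    for category, is_bad in records:
        totals[category] += 1
        bads[category] += int(is_bad)
    n_bad = sum(bads.values())
    if n_bad == 0:
        raise ValueError("relative risk is undefined when there are no bad accounts")
    overall_rate = n_bad / sum(totals.values())
    risk = {c: (bads[c] / totals[c]) / overall_rate for c in totals}
    # Sort by risk, breaking ties by label so the coding is deterministic.
    ordered = sorted(risk, key=lambda c: (risk[c], c))
    return {c: (rank, risk[c]) for rank, c in enumerate(ordered, start=1)}


if __name__ == "__main__":
    # Hypothetical cue with three levels and made-up counts.
    sample = (
        [("level_a", i < 2) for i in range(100)]
        + [("level_b", i < 4) for i in range(100)]
        + [("level_c", i < 6) for i in range(100)]
    )
    for category, (rank, rr) in sorted(ordinal_codes_by_relative_risk(sample).items()):
        print(category, rank, round(rr, 2))
```

Coding categories this way lets a binary tree split a qualitative cue with a single threshold instead of enumerating subsets of categories, which is consistent with the reduction in leaf nodes the abstract reports.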
85

Human-Inspired Robot Task Teaching and Learning

Wu, Xianghai 28 October 2009
Current methods of robot task teaching and learning have several limitations: highly-trained personnel are usually required to teach robots specific tasks; service-robot systems are limited in learning different types of tasks utilizing the same system; and the teacher’s expertise in the task is not well exploited. A human-inspired robot-task teaching and learning method is developed in this research with the aim of allowing general users to teach different object-manipulation tasks to a service robot, which will be able to adapt its learned tasks to new task setups. The proposed method was developed to be interactive and intuitive to the user. In a closed loop with the robot, the user can intuitively teach the tasks, track the learning states of the robot, direct the robot’s attention to perceive task-related key state changes, and give timely feedback when the robot is practicing the task, while the robot can reveal its learning progress and refine its knowledge based on the user’s feedback.
The human-inspired method consists of six teaching and learning stages: 1) checking and teaching the needed background knowledge of the robot; 2) introduction of the overall task to be taught to the robot: the hierarchical task structure, and the involved objects and robot hand actions; 3) teaching the task step by step, and directing the robot to perceive important state changes; 4) demonstration of the task as a whole, and offering vocal subtask-segmentation cues at subtask transitions; 5) robot learning of the taught task using a flexible vote-based algorithm to segment the demonstrated task trajectories, a probabilistic optimization process to assign obtained task trajectory episodes (segments) to the introduced subtasks, and generalization of the taught task trajectories in different reference frames; and 6) robot practicing of the learned task and refinement of its task knowledge according to the teacher’s timely feedback, where the adaptation of the learned task to new task setups is achieved by blending the task trajectories generated from pertinent frames. An agent-based architecture was designed and developed to implement this robot-task teaching and learning method. This system has an interactive human-robot teaching interface subsystem, which is composed of: a) a three-camera stereo vision system to track user hand motion; b) a stereo-camera vision system mounted on the robot end-effector to allow the robot to explore its workspace and identify objects of interest; and c) a speech recognition and text-to-speech system, utilized for the main human-robot interaction. A user study involving ten human subjects was performed using two tasks to evaluate the system based on time spent by the subjects on each teaching stage, efficiency measures of the robot’s understanding of users’ vocal requests, responses, and feedback, and their subjective evaluations.
Another set of experiments was done to analyze the ability of the robot to adapt its previously learned tasks to new task setups using measures such as object, target and robot starting-point poses; alignments of objects on targets; and actual robot grasp and release poses relative to the related objects and targets. The results indicate that the system enabled the subjects to naturally and effectively teach the tasks to the robot and give timely feedback on the robot’s practice performance. The robot was able to learn the tasks as expected and adapt its learned tasks to new task setups. The robot properly refined its task knowledge based on the teacher’s feedback and successfully applied the refined task knowledge in subsequent task practices. The robot was able to adapt its learned tasks to new task setups that were considerably different from those in the demonstration. The alignments of objects on the target were quite close to those taught, and the executed grasping and releasing poses of the robot relative to objects and targets were almost identical to the taught poses. The robot-task learning ability was affected by limitations of the vision-based human-robot teleoperation interface used in hand-to-hand teaching and the robot’s capacity to sense its workspace. Future work will investigate robot learning of a variety of different tasks and the use of more robot in-built primitive skills.
87

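Aplicação de métodos não supervisionados: estudo empírico com os dados de segurança pública do estado do Rio de Janeiro / Application of unsupervised methods: an empirical study with public security data from the state of Rio de Janeiro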

Nascimento, Otto Tavares 20 December 2016
This multidisciplinary work applies methods from applied mathematics, specifically unsupervised learning, to public security data. We seek to identify similarities between police battalions using clustering methods, numerically optimizing the McClain evaluation criterion. Beyond this optimization, we take an intuitive look at hierarchical clustering, then extract an ordering of the clusters by criminal pattern and, finally, apply the OLogit classification model using variables that characterize the clusters. We find evidence of clustering in the data and find that socioeconomic and policing data are significant in ordering the clusters. In summary, the higher the police force per inhabitant and the minimum-income HDI in a given battalion, the greater the probability of being in a cluster of lower criminal incidence.
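The abstract names the McClain criterion without defining it. A minimal sketch follows, assuming it refers to the McClain-Rao index: the mean distance between points in the same cluster divided by the mean distance between points in different clusters, where lower values indicate tighter, better-separated clusters. The function name and the toy coordinates are hypothetical and do not come from the thesis data.

```python
import math
from itertools import combinations


def mcclain_rao_index(points, labels):
    """Mean within-cluster distance divided by mean between-cluster distance."""
    within, between = [], []
    for (p, label_p), (q, label_q) in combinations(zip(points, labels), 2):
        distance = math.dist(p, q)  # Euclidean distance (an assumed metric)
        (within if label_p == label_q else between).append(distance)
    if not within or not between:
        raise ValueError("need at least one within-cluster and one between-cluster pair")
    return (sum(within) / len(within)) / (sum(between) / len(between))


if __name__ == "__main__":
    # Four made-up points forming two obvious groups.
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(mcclain_rao_index(pts, [0, 0, 1, 1]))  # natural grouping
    print(mcclain_rao_index(pts, [0, 1, 0, 1]))  # grouping that cuts across the gap
```

Under this definition, comparing candidate partitions and keeping the one with the smallest index is one way to read "numerically optimizing" the criterion.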
88

From examples to knowledge in model-driven engineering : a holistic and pragmatic approach

Batot, Edouard 11 1900
No description available.
89

Využití powerpointových prezentací v hodinách zeměpisu / Powerpoint presentation usage in geography education

Mrázová, Marcela January 2010
The thesis studies the use of PowerPoint presentations in geography teaching, especially the impact of visualized subject matter on the effectiveness of teaching. The theoretical part is followed by a questionnaire survey monitoring the current situation in Czech schools. Students and teachers of geography were asked whether they are computer users and about their subjective views and experiences of using PowerPoint in teaching. I also examined the technical equipment of Czech schools and whether it is even possible to use audio-visual technology in education. I also asked which teaching method students preferred and whether they preferred lessons with PowerPoint presentations. A total of 62 teachers and 283 students participated in the survey. Most of them are ordinary computer users. PowerPoint presentations are often used in geography. Their opinion of using PowerPoint presentations in lessons is rather positive. The most popular method among students is exposition by the teacher, which corresponds with the most common PowerPoint usage. The second part of the thesis comprises qualitative research on the effectiveness of using PowerPoint in geography lessons. In two parallel classes, subject matter was taught that was visualized by a PowerPoint presentation in one class and by any analog...
90

BI-DIRECTIONAL COACHING THROUGH SPARSE HUMAN-ROBOT INTERACTIONS

Mythra Varun Balakuntala Srinivasa Mur (16377864) 15 June 2023
Robots have become increasingly common in various sectors, such as manufacturing, healthcare, and service industries. With the growing demand for automation and the expectation for interactive and assistive capabilities, robots must learn to adapt to unpredictable environments like humans can. This necessitates the development of learning methods that can effectively enable robots to collaborate with humans, learn from them, and provide guidance. Human experts commonly teach their collaborators to perform tasks via a few demonstrations, often followed by episodes of coaching that refine the trainee’s performance during practice. Adopting a similar approach that facilitates interactions to teaching robots is highly intuitive and enables task experts to teach the robots directly. Learning from Demonstration (LfD) is a popular method for robots to learn tasks by observing human demonstrations. However, for contact-rich tasks such as cleaning, cutting, or writing, LfD alone is insufficient to achieve a good performance. Further, LfD methods are developed to achieve observed goals while ignoring actions to maximize efficiency. By contrast, we recognize that leveraging human social learning strategies of practice and coaching in conjunction enables learning tasks with improved performance and efficacy. To address the deficiencies of learning from demonstration, we propose a Coaching by Demonstration (CbD) framework that integrates LfD-based practice with sparse coaching interactions from a human expert.
The LfD-based practice in CbD was implemented as an end-to-end off-policy reinforcement learning (RL) agent with the action space and rewards inferred from the demonstration. By modeling the reward as a similarity network trained on expert demonstrations, we eliminate the need for designing task-specific engineered rewards. Representation learning was leveraged to create a novel state feature that captures interaction markers necessary for performing contact-rich skills. This LfD-based practice was combined with coaching, where the human expert can improve or correct the objectives through a series of interactions. The dynamics of interaction in coaching are formalized using a partially observable Markov decision process. The robot aims to learn the true objectives by observing the corrective feedback from the human expert. We provide an approximate solution by reducing this to a policy parameter update using KL divergence between the RL policy and a Gaussian approximation based on coaching. The proposed framework was evaluated on a dataset of 10 contact-rich tasks from the assembly (peg-insertion), service (cleaning, writing, peeling), and medical domains (cricothyroidotomy, sonography). Compared to baselines of behavioral cloning and reinforcement learning algorithms, CbD demonstrates improved performance and efficiency.
During the learning process, the demonstrations and coaching feedback imbue the robot with expert knowledge of the task. To leverage this expertise, we develop a reverse coaching model where the robot can leverage knowledge from demonstrations and coaching corrections to provide guided feedback to human trainees to improve their performance. Providing feedback adapted to individual trainees' "style" is vital to coaching. To this end, we have proposed representing style as objectives in the task null space. Unsupervised clustering of the null-space trajectories using Gaussian mixture models allows the robot to learn different styles of executing the same skill. Given the coaching corrections and style clusters database, a style-conditioned RL agent was developed to provide feedback to human trainees by coaching their execution using virtual fixtures. The reverse coaching model was evaluated on two tasks, a simulated incision and obstacle avoidance through a haptic teleoperation interface. The model improves human trainees’ accuracy and completion time compared to a baseline without corrective feedback. Thus, by taking advantage of different human-social learning strategies, human-robot collaboration can be realized in human-centric environments.
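The abstract mentions a policy update driven by the KL divergence between the RL policy and a Gaussian approximation derived from coaching, without giving formulas. As a small illustration of the quantity involved, the sketch below computes the closed-form KL divergence between two univariate Gaussians. Treating both distributions as one-dimensional Gaussians is an assumption made for this example, and the function name and numbers are hypothetical; the thesis's actual update rule is not reproduced here.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2)."""
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


if __name__ == "__main__":
    # Hypothetical action distributions: current policy vs. coached target.
    print(kl_gaussian(0.0, 1.0, 0.0, 1.0))  # identical distributions
    print(kl_gaussian(0.0, 1.0, 1.0, 1.0))  # same spread, shifted mean
    print(kl_gaussian(0.0, 1.0, 0.0, 2.0), kl_gaussian(0.0, 2.0, 0.0, 1.0))  # not symmetric
```

Because the divergence is zero only when the two distributions coincide and grows as they separate, minimizing it pulls one distribution toward the other; the direction of the divergence matters since it is not symmetric.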
