Global ETD Search

1	Risk Mitigation for Human-Robot Collaboration Using Artificial Intelligence / Riskreducering för människa-robot-samarbete baserad på artificiell intelligens
Istar Terra, Ahmad January 2019
In human-robot collaborative (HRC) scenarios, where humans and robots work together in the same workspace, hazardous situations can arise. In this work, an AI-based risk analysis solution has been developed to identify any condition that may harm a robot and its environment. The information from the risk analysis is used in a risk mitigation module to reduce the possibility of being in a hazardous situation. The goal is to provide safety in HRC scenarios using different AI algorithms and to examine whether the efficiency of the system can be improved without compromising safety. This report presents risk mitigation strategies that were built on top of the robot's control system and based on the ISO 15066 standard. Each of them used semantic information (a scene graph) about the robot's environment and changed the robot's movement by scaling its speed. The first implementation of the risk mitigation strategy used a fuzzy logic system. This system analyzed the properties of the riskiest object to adjust the speed of the robot accordingly. The second implementation used reinforcement learning and considered every object's properties. Three networks (a fully connected network, a convolutional neural network, and a hybrid network) were implemented to estimate the Q-value function. Additionally, local and edge computation architectures were implemented to measure the computational performance on the real robot. Each model was evaluated by measuring the safety aspect and the performance of the robot in a simulated warehouse scenario. All risk mitigation modules were able to reduce the risk of potential hazard. The fuzzy logic system increased safety with the smallest reduction in efficiency. The reinforcement learning model operated more safely but compromised efficiency more than the fuzzy logic system. Generally, the fuzzy logic system performed up to 28% faster than reinforcement learning but compromised up to 23% in terms of safety (mean risk speed value). In terms of computational performance, edge computation was faster than local computation. The bottleneck of the process was the scene graph generation, which analyzed an image to produce information for the safety analysis. It took approximately 15 seconds to run the scene graph generation on the robot's CPU and 0.3 seconds on an edge device. The risk mitigation module can be selected depending on the KPIs of the warehouse operation, while the edge architecture must be implemented to achieve realistic performance.
/ In HRC scenarios where humans and robots work together and share the same workspace, there is a risk that hazards may arise. In this work, an AI-based risk analysis solution has been developed to identify all conditions that may harm a robot and its environment. The information from the risk analysis is used in a risk mitigation module to reduce the risk of being in a hazardous situation. The goal is to develop safety for HRC scenarios with different AI algorithms and to examine the possibilities of improving the system's efficiency without compromising safety. This report presents risk mitigation strategies that were built on top of the robot's control system and based on the ISO 15066 standard. Each of them uses semantic information (a scene graph) about the robot's environment and changes the robot's movement by scaling its speed. The first implementation of the risk mitigation strategy uses a fuzzy logic system. This system analyzed the properties of the riskiest objects to adjust the robot's speed accordingly. The second implementation uses reinforcement learning and considered every object's properties. Three networks (a fully connected network, a convolutional neural network, and a hybrid network) are implemented to estimate the Q-value function. In addition, we implemented local and edge architectures to measure computational performance on the real robot. Each model is evaluated by measuring the safety aspect and the robot's performance in a simulated warehouse scenario. All risk mitigation modules were able to reduce the risk of potential hazard. The fuzzy logic system was able to increase the safety aspect with the smallest reduction in efficiency. The reinforcement learning model operates more safely but has more limited efficiency than the fuzzy logic system. In general, the fuzzy logic system works up to 28% faster than reinforcement learning but compromises up to 23% in terms of safety (mean risk speed value). In terms of computational performance, edge computation was faster than local computation. The bottleneck of the process was scene graph generation, which analyzed an image to produce information for the safety analysis. It took about 15 seconds to run scene graph generation on the robot's CPU and 0.3 seconds on an edge device. The risk mitigation module can be chosen depending on the KPIs of the warehouse operation, while the edge architecture must be implemented to achieve realistic performance.
human-robot collaboration, safety analysis, risk mitigation, fuzzy logic system, reinforcement learning, computation architecture
människa-robotinteraktion, säkerhetsanalys, riskreducering, fuzzy logicsystem, förstärkningslärande, beräkningsarkitektur
Computer and Information Sciences / Data- och informationsvetenskap
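As an illustration of the speed-scaling idea described in the first entry's abstract, the following is a minimal sketch of a fuzzy rule base that maps the distance to the riskiest object onto a speed scaling factor. It is not the thesis's implementation: using distance as the only input, the membership breakpoints (0.5 m, 1.5 m, 2.5 m), the rule outputs in `RULE_SPEED`, and all function names are assumptions chosen for the example.

```python
# Hypothetical sketch: fuzzy speed scaling from the distance to the riskiest object.
# All breakpoints and rule outputs are illustrative assumptions, not values from the thesis.


def ramp_down(x, lo, hi):
    """Membership that is 1.0 at or below lo, 0.0 at or above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def ramp_up(x, lo, hi):
    """Mirror image of ramp_down: 0.0 at or below lo, 1.0 at or above hi."""
    return 1.0 - ramp_down(x, lo, hi)


def triangle(x, lo, peak, hi):
    """Triangular membership: 0.0 outside (lo, hi), 1.0 at peak."""
    return min(ramp_up(x, lo, peak), ramp_down(x, peak, hi))


# One rule per fuzzy set: "if the object is <set>, scale speed to <value>".
RULE_SPEED = {"near": 0.1, "medium": 0.5, "far": 1.0}


def speed_scale(distance_m):
    """Return a speed scaling factor in [0.1, 1.0] for a distance in metres."""
    memberships = {
        "near": ramp_down(distance_m, 0.5, 1.5),
        "medium": triangle(distance_m, 0.5, 1.5, 2.5),
        "far": ramp_up(distance_m, 1.5, 2.5),
    }
    # Weighted average of the rule outputs (zero-order Sugeno defuzzification).
    total = sum(memberships.values())
    return sum(memberships[name] * RULE_SPEED[name] for name in memberships) / total


if __name__ == "__main__":
    for d in (0.2, 1.0, 1.5, 3.0):
        print(d, round(speed_scale(d), 3))
```

With these assumed sets the memberships always sum to one, so the scaling factor moves smoothly between the rule outputs as the object approaches instead of switching abruptly at a threshold.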
2	Bayesian Reinforcement Learning Methods for Network Intrusion Prevention
Nesti Lopes, Antonio Frederico January 2021
A growing problem in network security stems from the fact that both attack methods and target systems constantly evolve. This makes it difficult for human operators to keep up and manage the security problem. A promising approach to this challenge is to use reinforcement learning to adapt security policies to a changing environment. A drawback of this approach, however, is that traditional reinforcement learning methods require a large amount of data to learn effective policies, and such data can be both costly and difficult to obtain. To address this problem, this thesis investigates ways to incorporate prior knowledge in learning systems for network security. Our goal is to learn security policies with less data than traditional reinforcement learning algorithms require. To investigate this question, we take a Bayesian approach and consider Bayesian reinforcement learning methods as a complement to current algorithms in reinforcement learning. Specifically, we study the following algorithms: Bayesian Q-learning, Bayesian REINFORCE, and Bayesian Actor-Critic. To evaluate our approach, we have implemented these algorithms and techniques and applied them to different simulation scenarios of intrusion prevention. Our results demonstrate that the Bayesian reinforcement learning algorithms learn more efficiently than their non-Bayesian counterparts but that the Bayesian approach is more computationally demanding. Further, we find that the choice of prior and the kernel function have a large impact on the performance of the algorithms.
/ A growing problem in cybersecurity is that both attack methods and systems are constantly changing and evolving: on the one hand, attack methods are becoming more and more sophisticated, and on the other hand, systems evolve through innovations and upgrades. This makes it difficult for human operators to manage the security problem. A promising method for handling this challenge is reinforcement learning. With reinforcement learning, an autonomous agent can automatically learn to adapt security strategies to a changing environment. A challenge with this approach, however, is that traditional reinforcement learning methods require a large amount of data to learn effective strategies, which can be both costly and difficult to obtain. To solve this problem, this thesis investigates Bayesian methods for incorporating prior knowledge into the learning algorithm, which can enable learning with less data. Specifically, we study the following Bayesian algorithms: Bayesian Q-learning, Bayesian REINFORCE, and Bayesian Actor-Critic. To evaluate our approach, we have implemented these algorithms, evaluated their performance in different simulation scenarios for intrusion prevention, and analyzed their complexity. Our results show that the Bayesian reinforcement learning algorithms can learn strategies with less data than their non-Bayesian counterparts require, but that the Bayesian method is more computationally demanding. Furthermore, we find that the method for incorporating prior knowledge into the learning algorithm, as well as the choice of kernel function, has a large impact on the performance of the algorithms.
Network Security, Reinforcement Learning, Bayesian Q-Learning, Bayesian Policy Gradient, Bayesian Actor-Critic, Markov Security Games
Nätverkssäkerhet, förstärkningslärande, Bayesian Q-Learning, Bayesian Policy Gradient, Bayesian Actor-Critic, Markov Security Games
Computer and Information Sciences / Data- och informationsvetenskap
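To make concrete how a prior can reduce the amount of data needed, as the second entry's abstract argues, here is a deliberately simplified sketch: a conjugate Gaussian update of a single action-value estimate with known observation noise. It is not one of the algorithms studied in the thesis (Bayesian Q-learning, Bayesian REINFORCE, Bayesian Actor-Critic); the function name `update_value_belief`, the known-variance assumption, and the numbers are all assumptions made for illustration.

```python
# Hypothetical sketch: Gaussian prior over one action value, updated with observed returns.
# Assumes a known noise variance; this is a textbook normal-normal update, not the thesis's method.


def update_value_belief(prior_mean, prior_var, returns, noise_var):
    """Return the posterior (mean, variance) of an action value after observing returns."""
    n = len(returns)
    # Precisions (inverse variances) add when combining a Gaussian prior with Gaussian data.
    posterior_precision = 1.0 / prior_var + n / noise_var
    posterior_mean = (prior_mean / prior_var + sum(returns) / noise_var) / posterior_precision
    return posterior_mean, 1.0 / posterior_precision


if __name__ == "__main__":
    observed = [2.0]
    # A vague prior leaves the estimate almost entirely to the single observation.
    print(update_value_belief(0.0, 100.0, observed, 1.0))
    # An informative prior keeps the estimate close to the prior belief and lowers the variance.
    print(update_value_belief(1.5, 0.25, observed, 1.0))
```

The smaller posterior variance under the informative prior is the sense in which prior knowledge substitutes for samples in this toy setting; a poorly chosen prior would bias the estimate instead, which is consistent with the abstract's remark that the choice of prior matters.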
3	Explainable Reinforcement Learning for Risk Mitigation in Human-Robot Collaboration Scenarios / Förklarbar förstärkningsinlärning inom människa-robot sammarbete för riskreducering
Iucci, Alessandro January 2021
Reinforcement Learning (RL) algorithms are highly popular in robotics for solving complex problems, learning from dynamic environments, and generating optimal outcomes. However, one of the main limitations of RL is the lack of model transparency, including the inability to explain why a given output was generated. Explainability becomes even more crucial when RL outputs influence human decisions, such as in Human-Robot Collaboration (HRC) scenarios, where safety requirements should be met. This work focuses on the application of two explainability techniques, "Reward Decomposition" and "Autonomous Policy Explanation", to an RL algorithm that is the core of a risk mitigation module for robots operating in a collaborative automated warehouse scenario. "Reward Decomposition" gives insight into the factors that influenced the robot's choice by decomposing the reward function into sub-functions. It also makes it possible to create Minimal Sufficient Explanations (MSX), sets of relevant reasons for each decision taken during the robot's operation. The second technique, "Autonomous Policy Explanation", provides a global overview of the robot's behavior by answering queries asked by human users. It also provides insight into the decision guidelines embedded in the robot's policy. Since the policy descriptions and the answers to the queries are synthesized in natural language, this tool facilitates algorithm diagnosis even by non-expert users. The results showed an improvement in the RL algorithm, which now chooses more evenly distributed actions, and a full policy for the robot's decisions is produced that is for the most part aligned with expectations. The work analyzes the results of applying both techniques, each of which led to increased transparency of the robot's decision process. These explainability methods not only built trust in the robot's choices, which proved to be among the optimal ones in most cases, but also made it possible to find weaknesses in the robot's policy, making them a helpful tool for debugging.
/ Reinforcement learning algorithms (RL algorithms) are very popular in robotics for solving complex problems, learning from dynamic environments, and generating optimal results. However, one of the most important limitations of RL is the lack of model transparency. This includes the inability to explain the underlying process (algorithm or model) that generated a given return value. Explainability becomes even more important when the result of an RL algorithm influences human decisions, for example in HRC scenarios where safety requirements should be met. This work focuses on the use of two explainability techniques, "Reward Decomposition" and "Autonomous Policy Explanation", applied to an RL algorithm that is the core of a risk mitigation module for the operation of collaborative robots in an automated warehouse. "Reward Decomposition" gives insight into which factors influenced the robot's choice by breaking the reward function down into smaller functions. It also makes it possible to formulate an MSX (minimal sufficient explanation), a set of relevant reasons for each decision taken during the robot's operation. The second technique, "Autonomous Policy Explanation", gives a general perspective on the robot's behavior by letting human users ask the robot questions. This also gives insight into the decision guidelines embedded in the robot's policy. Because the synthesis of the policy descriptions and the answers to the questions is in natural language, this facilitates algorithm diagnosis even for non-expert users. The results showed an improvement in the RL algorithm, which now chooses more evenly distributed actions. In addition, a full policy for the robot's decisions is produced that is for the most part aligned with expectations. The report analyzes the results of applying both techniques, showing that both led to increased transparency in the robot's decision process. The explanation methods not only gave confidence in the robot's choices, which proved to be among the optimal ones in most cases, but also made it possible to find weaknesses in the robot's policy, which made them a useful tool for debugging.
Explainable Reinforcement Learning, Human-Robot Collaboration, Risk Mitigation, Reward Decomposition, Autonomous Policy Explanation, Collaborative Robots
Förklarbar förstärkningslärande, Mänskligt-robot-samarbete, Riskreducering, Reward Decomposition, Autonomous Policy Explanation, Samarbetsrobotar
Computer and Information Sciences / Data- och informationsvetenskap
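The reward decomposition idea in the third entry's abstract can be sketched as follows, assuming per-component Q-values are already available for two actions. The sketch builds a minimal sufficient explanation as the smallest set of reward components favoring the chosen action whose combined advantage outweighs everything favoring the alternative. The component names (`safety`, `progress`, `energy`), the values, and the function name are hypothetical and are not taken from the thesis.

```python
# Hypothetical sketch: minimal sufficient explanation (MSX) from decomposed Q-values.
# Component names and numbers are illustrative assumptions.


def minimal_sufficient_explanation(q_chosen, q_other):
    """Return the fewest reward components that justify the chosen action over the other.

    Both arguments map component name -> Q-value for that component.
    """
    # Per-component advantage of the chosen action over the alternative.
    delta = {name: q_chosen[name] - q_other[name] for name in q_chosen}
    # Total amount by which the alternative is better on the components that favor it.
    disadvantage = sum(-value for value in delta.values() if value < 0)

    explanation, covered = [], 0.0
    # Take the largest positive advantages first so the set stays as small as possible.
    for name, value in sorted(delta.items(), key=lambda item: item[1], reverse=True):
        if value <= 0 or covered > disadvantage:
            break
        explanation.append(name)
        covered += value
    return explanation


if __name__ == "__main__":
    slow_down = {"safety": 5.0, "progress": 1.0, "energy": -0.5}
    keep_speed = {"safety": 1.0, "progress": 3.0, "energy": 0.0}
    print(minimal_sufficient_explanation(slow_down, keep_speed))
```

If the positive advantages never exceed the disadvantage, which happens when the chosen action is not actually preferred overall, this sketch returns every positive component; a fuller version would flag that case explicitly.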
