Global ETD Search

11	Using Reinforcement Learning for Games with Nondeterministic State Transitions / Reinforcement Learning för spel med icke-deterministiska tillståndsövergångar Fischer, Max January 2019 (has links) Given the recent advances within a subfield of machine learning called reinforcement learning, several papers have shown that it is possible to create self-learning digital agents, agents that take actions and pursue strategies in complex environments without any prior knowledge. This thesis investigates the performance of the state-of-the-art reinforcement learning algorithm proximal policy optimization, when trained on a task with nondeterministic state transitions. The agent’s policy was constructed using a convolutional neural network and the game Candy Crush Friends Saga, a single-player match-three tile game, was used as the environment. The purpose of this research was to evaluate if the described agent could achieve a higher win rate than average human performance when playing the game of Candy Crush Friends Saga. The research also analyzed the algorithm's generalization capabilities on this task. The results showed that all trained models perform better than a random policy baseline, thus showing it is possible to use the proximal policy optimization algorithm to learn tasks in an environment with nondeterministic state transitions. It also showed that, given the hyperparameters chosen, it was not able to perform better than average human performance. reinforcement learning proximal policy optimization PPO machine learning artificial intelligence deep learning neural network candy crush mobile game Computer Sciences Datavetenskap (datalogi)
12	Virtual reality therapy for Alzheimer’s disease with speech instruction and real-time neurofeedback system Ai, Yan 05 1900 (has links) La maladie d'Alzheimer (MA) est une maladie cérébrale dégénérative qui entraîne une perte progressive de la mémoire, un déclin cognitif et une détérioration graduelle de la capacité d'une personne à faire face à la complexité et à l'exigence des tâches quotidiennes nécessaires pour vivre en autonomie dans notre société actuelle. Les traitements pharmacologiques actuels peuvent ralentir le processus de dégradation attribué à la maladie, mais ces traitements peuvent également provoquer certains effets secondaires indésirables. L'un des traitements non pharmacologiques qui peut soulager efficacement les symptômes est la thérapie assistée par l'animal (T.A.A.). Mais en raison de certaines limitations telles que le prix des animaux et des problèmes d'hygiène, des animaux virtuels sont utilisés dans ce domaine. Cependant, les animaux virtuels animés, la qualité d'image approximative et le mode d'interaction unidirectionnel des animaux qui attendent passivement les instructions de l’utilisateur, peuvent difficilement stimuler le retour émotionnel entre l'utilisateur et les animaux virtuels, ce qui affaiblit considérablement l'effet thérapeutique. Cette étude vise à explorer l'efficacité de l'utilisation d'animaux virtuels à la place d’animaux vivants et leur impact sur la réduction des émotions négatives chez le patient. Cet objectif a été gardé à l'esprit lors de la conception du projet Zoo Therapy, qui présente un environnement immersif d'animaux virtuels en 3D, où l'impact sur l'émotion du patient est mesuré en temps réel par électroencéphalographie (EEG). Les objets statiques et les animaux virtuels de Zoo Therapy sont tous présentés à l'aide de modèles 3D réels. Les mouvements des animaux, les sons et les systèmes de repérage spécialement développés prennent en charge le comportement interactif simulé des animaux virtuels. De plus, pour que l'expérience d'interaction de l'utilisateur soit plus réelle, Zoo Therapy propose un mécanisme de communication novateur qui met en œuvre une interaction bidirectionnelle homme-machine soutenue par 3 méthodes d'interaction : le menu sur les panneaux, les instructions vocales et le Neurofeedback. La manière la plus directe d'interagir avec l'environnement de réalité virtuelle (RV) est le menu sur les panneaux, c'est-à-dire une interaction en cliquant sur les boutons des panneaux par le contrôleur de RV. Cependant, il était difficile pour certains utilisateurs ayant la MA d'utiliser le contrôleur de RV. Pour accommoder ceux qui ne sont pas bien adaptés ou compatibles avec le contrôleur de RV, un système d'instructions vocales peut être utilisé comme interface. Ce système a été reçu positivement par les 5 participants qui l'ont essayé. Même si l'utilisateur choisit de ne pas interagir activement avec l'animal virtuel dans les deux méthodes ci-dessus, le système de Neurofeedback guidera l'animal pour qu'il interagisse activement avec l'utilisateur en fonction des émotions de ce dernier. Le système de Neurofeedback classique utilise un système de règles pour donner des instructions. Les limites de cette méthode sont la rigidité et l'impossibilité de prendre en compte la relation entre les différentes émotions du participant. Pour résoudre ces problèmes, ce mémoire présente une méthode basée sur l'apprentissage par renforcement (AR) qui donne des instructions à différentes personnes en fonction des différentes émotions. Dans l'expérience de simulation des données émotionnelles synthétiques de la MD, la méthode basée sur l’AR est plus sensible aux changements émotionnels que la méthode basée sur les règles et peut apprendre automatiquement des règles potentielles pour maximiser les émotions positives de l'utilisateur. En raison de l'épidémie de Covid-19, nous n'avons pas été en mesure de mener des expériences à grande échelle. Cependant, un projet de suivi a combiné la thérapie de RV Zoo avec la reconnaissance des gestes et a prouvé son efficacité en évaluant les valeurs d'émotion EEG des participants. / Alzheimer’s disease (AD) is a degenerative brain disease that causes progressive memory loss, cognitive decline, and gradually impairs one’s ability to cope with the complexity and requirement of the daily routine tasks necessary to live in autonomy in our current society. Actual pharmacological treatments can slow down the degradation process attributed to the disease, but such treatments may also cause some undesirable side effects. One of the non-pharmacological treatments that can effectively relieve symptoms is animal-assisted treatment (AAT). But due to some limitations such as animal cost and hygiene issues, virtual animals are used in this field. However, the animated virtual animals, the rough picture quality presentation, and the one-direction interaction mode of animals passively waiting for the user's instructions can hardly stimulate the emotional feedback background between the user and the virtual animals, which greatly weakens the therapeutic effect. This study aims to explore the effectiveness of using virtual animals in place of their living counterpart and their impact on the reduction of negative emotions in the patient. This approach has been implemented in the Zoo Therapy project, which presents an immersive 3D virtual reality animal environment, where the impact on the patient’s emotion is measured in real-time by using electroencephalography (EEG). The static objects and virtual animals in Zoo Therapy are all presented using real 3D models. The specially developed animal movements, sounds, and pathfinding systems support the simulated interactive behavior of virtual animals. In addition, for the user's interaction experience to be more real, the innovation of this approach is also in its communication mechanism as it implements a bidirectional human-computer interaction supported by 3 interaction methods: Menu panel, Speech instruction, and Neurofeedback. The most straightforward way to interact with the VR environment is through Menu panel, i.e., interaction by clicking buttons on panels by the VR controller. However, it was difficult for some AD users to use the VR controller. To accommodate those who are not well suited or compatible with VR controllers, a speech instruction system can be used as an interface, which was received positively by the 5 participants who tried it. Even if the user chooses not to actively interact with the virtual animal in the above two methods, the Neurofeedback system will guide the animal to actively interact with the user according to the user's emotions. The mainstream Neurofeedback system has been using artificial rules to give instructions. The limitation of this method is inflexibility and cannot take into account the relationship between the various emotions of the participant. To solve these problems, this thesis presents a reinforcement learning (RL)-based method that gives instructions to different people based on multiple emotions accordingly. In the synthetic AD emotional data simulation experiment, the RL-based method is more sensitive to emotional changes than the rule-based method and can automatically learn potential rules to maximize the user's positive emotions. Due to the Covid-19 epidemic, we were unable to conduct large-scale experiments. However, a follow-up project combined VR Zoo Therapy with gesture recognition and proved the effectiveness by evaluating participant's EEG emotion values. Alzheimer’s Disease EEG Intelligent Agent Zoo Therapy Emotion Immersive Virtual Reality Reinforcement learning Proximal Policy Optimization Algorithms Auto encoder Neurofeedback Speech Recognition Immersive environment Maladie d’Alzheimer Réalité virtuelle immersive Reconnaissance vocale Zoo thérapie Émotions Apprentissage par renforcement Encodeur automatique Environnement immersif
13	Deep Reinforcement Learning for Multi-Agent Path Planning in 2D Cost Map Environments : using Unity Machine Learning Agents toolkit Persson, Hannes January 2024 (has links) Multi-agent path planning is applied in a wide range of applications in robotics and autonomous vehicles, including aerial vehicles such as drones and other unmanned aerial vehicles (UAVs), to solve tasks in areas like surveillance, search and rescue, and transportation. In today's rapidly evolving technology in the fields of automation and artificial intelligence, multi-agent path planning is growing increasingly more relevant. The main problems encountered in multi-agent path planning are collision avoidance with other agents, obstacle evasion, and pathfinding from a starting point to an endpoint. In this project, the objectives were to create intelligent agents capable of navigating through two-dimensional eight-agent cost map environments to a static target, while avoiding collisions with other agents and simultaneously minimizing the path cost. The method of reinforcement learning was used by utilizing the development platform Unity and the open-source ML-Agents toolkit that enables the development of intelligent agents with reinforcement learning inside Unity. Perlin Noise was used to generate the cost maps. The reinforcement learning algorithm Proximal Policy Optimization was used to train the agents. The training was structured as a curriculum with two lessons, the first lesson was designed to teach the agents to reach the target, without colliding with other agents or moving out of bounds. The second lesson was designed to teach the agents to minimize the path cost. The project successfully achieved its objectives, which could be determined from visual inspection and by comparing the final model with a baseline model. The baseline model was trained only to reach the target while avoiding collisions, without minimizing the path cost. A comparison of the models showed that the final model outperformed the baseline model, reaching an average of $27.6\%$ lower path cost. / Multi-agent-vägsökning används inom en rad olika tillämpningar inom robotik och autonoma fordon, inklusive flygfarkoster såsom drönare och andra obemannade flygfarkoster (UAV), för att lösa uppgifter inom områden som övervakning, sök- och räddningsinsatser samt transport. I dagens snabbt utvecklande teknik inom automation och artificiell intelligens blir multi-agent-vägsökning allt mer relevant. De huvudsakliga problemen som stöts på inom multi-agent-vägsökning är kollisioner med andra agenter, undvikande av hinder och vägsökning från en startpunkt till en slutpunkt. I detta projekt var målen att skapa intelligenta agenter som kan navigera genom tvådimensionella åtta-agents kostnadskartmiljöer till ett statiskt mål, samtidigt som de undviker kollisioner med andra agenter och minimerar vägkostnaden. Metoden förstärkningsinlärning användes genom att utnyttja utvecklingsplattformen Unity och Unitys open-source ML-Agents toolkit, som möjliggör utveckling av intelligenta agenter med förstärkningsinlärning inuti Unity. Perlin Brus användes för att generera kostnadskartorna. Förstärkningsinlärningsalgoritmen Proximal Policy Optimization användes för att träna agenterna. Träningen strukturerades som en läroplan med två lektioner, den första lektionen var utformad för att lära agenterna att nå målet, utan att kollidera med andra agenter eller röra sig utanför gränserna. Den andra lektionen var utformad för att lära agenterna att minimera vägkostnaden. Projektet uppnådde framgångsrikt sina mål, vilket kunde fastställas genom visuell inspektion och genom att jämföra den slutliga modellen med en basmodell. Basmodellen tränades endast för att nå målet och undvika kollisioner, utan att minimera vägen kostnaden. En jämförelse av modellerna visade att den slutliga modellen överträffade baslinjemodellen, och uppnådde en genomsnittlig $27,6\%$ lägre vägkostnad. deep reinforcement learning reinforcement learning machine learning path planning cost map ML-agents unity artificial neural networks collision avoidance PPO multi agent multi-agent multi-agent system förstärkningsinlärning djup förstärkningsinlärning fleragentssystem kostnadkarta kostnadskartor artificiella neurala nätverk maskininlärning proximal policy optimization PPO svärmintelligens

Page generated in 0.1123 seconds