1.
Reinforcement Learning in Keepaway Framework for RoboCup Simulation League. Li, Wei. January 2011.
This thesis applies reinforcement learning to robot soccer and demonstrates its effectiveness in the RoboCup domain. The first part briefly introduces the background of reinforcement learning, reviews previous work, and identifies the difficulties in implementing it. The second part presents the basic concepts of reinforcement learning: the three fundamental elements (state, action, and reward) and three classical approaches (dynamic programming, Monte Carlo methods, and temporal-difference learning). The keepaway framework is then explained in detail and combined with reinforcement learning. The proposed Sarsa algorithm, paired with two function approximators, an artificial neural network and tile coding, is implemented successfully in simulation. The results show that it significantly improves the performance of the soccer robots.
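For readers unfamiliar with the combination this abstract describes, the following is a minimal sketch of Sarsa with a linear tile-coding approximator. The state dimensions, action set, tiling resolution, and step size are illustrative assumptions, not the configuration used in the thesis.

```python
import numpy as np

# Toy Sarsa learner with tile-coding function approximation over a
# 2-D continuous state in [0, 1)^2. All sizes and rates are assumed values.
N_TILINGS, TILES = 8, 8                   # 8 offset tilings of an 8x8 grid
N_ACTIONS = 3                             # hypothetical keeper actions (e.g. hold/pass)
weights = np.zeros((N_ACTIONS, N_TILINGS * TILES * TILES))

def active_tiles(state):
    """Return one active tile index per tiling; tilings are slightly offset."""
    x, y = state
    idx = []
    for t in range(N_TILINGS):
        off = t / (N_TILINGS * TILES)
        xi = min(int((x + off) * TILES), TILES - 1)
        yi = min(int((y + off) * TILES), TILES - 1)
        idx.append(t * TILES * TILES + yi * TILES + xi)
    return idx

def q(state, action):
    """Linear value estimate: sum of the weights on the active tiles."""
    return weights[action, active_tiles(state)].sum()

def sarsa_update(s, a, r, s2, a2, alpha=0.1 / N_TILINGS, gamma=0.99):
    """On-policy TD update: move Q(s, a) toward r + gamma * Q(s', a')."""
    td_error = r + gamma * q(s2, a2) - q(s, a)
    weights[a, active_tiles(s)] += alpha * td_error  # gradient is 1 on active tiles
```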
2.
Machine Learning for Traffic Control of Unmanned Mining Machines: Using the Q-learning and SARSA algorithms. Gustafsson, Robin; Fröjdendahl, Lucas. January 2019.
Manual configuration of rules for unmanned mining machine traffic control can be time-consuming and therefore expensive. This paper presents a machine learning approach to automatic configuration of traffic-control rules in mines with autonomous mining machines, using Q-learning and SARSA. The results show that automation may cut the time taken to configure traffic rules from 1–2 weeks to a maximum of approximately 6 hours, which would decrease the cost of deployment. Tests show that in the worst case the developed solution runs continuously for 24 hours 82% of the time, compared to the 100% accuracy of the manual configuration. The conclusion is that machine learning can plausibly be used for the automatic configuration of traffic rules, though further work on raising the accuracy to 100% is needed before it can replace manual configuration. It also remains to be examined whether this conclusion holds in more complex environments with larger layouts and more machines.
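The comparison in this work hinges on the one-line difference between the two update rules: SARSA bootstraps from the action the policy actually takes next, while Q-learning bootstraps from the greedy maximum. A minimal tabular sketch follows; the state and action encoding for the traffic-control task is an assumption here, not the paper's representation.

```python
import random
from collections import defaultdict

Q = defaultdict(float)   # Q[(state, action)] -> value; encoding is task-specific

def epsilon_greedy(state, actions, eps=0.1):
    """Explore with probability eps, otherwise pick the best-known action."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def sarsa_step(s, a, r, s2, a2, alpha=0.1, gamma=0.95):
    # On-policy: the target uses a2, the action the policy actually chose in s2.
    Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])

def q_learning_step(s, a, r, s2, actions, alpha=0.1, gamma=0.95):
    # Off-policy: the target uses the best action in s2, whatever is taken next.
    best = max(Q[(s2, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])
```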
3.
Game-independent AI agents for playing Atari 2600 console games. Naddaf, Yavar. 06 1900.
This research focuses on developing AI agents that play arbitrary Atari 2600 console games without having any game-specific assumptions or prior knowledge. Two main approaches are considered: reinforcement learning based methods and search based methods. The RL-based methods use feature vectors generated from the game screen as well as the console RAM to learn to play a given game. The search-based methods use the emulator to simulate the consequence of actions into the future, aiming to play as well as possible by only exploring a very small fraction of the state-space.
To ensure the generic nature of our methods, all agents are designed and tuned using four specific games. Once development and parameter selection are complete, the performance of the agents is evaluated on a set of 50 randomly selected games. Significant learning is reported for the RL-based methods on most games. Additionally, some instances of human-level performance are achieved by the search-based methods.
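As a rough illustration of the search-based approach, one can use the emulator as a forward model, roll each action out for a fixed horizon, and pick the action whose simulated futures score best. The emulator interface below (`act`, `legal_actions`) is hypothetical, not the thesis's actual API.

```python
import copy
import random

def rollout_value(emulator, action, horizon=30, n_rollouts=5, gamma=0.995):
    """Estimate an action's value by simulating random futures from a cloned state.
    Hypothetical interface: act(a) returns the score change, legal_actions() the moves."""
    total = 0.0
    for _ in range(n_rollouts):
        sim = copy.deepcopy(emulator)              # snapshot the console state
        ret, discount = float(sim.act(action)), 1.0
        for _ in range(horizon - 1):
            discount *= gamma
            ret += discount * sim.act(random.choice(sim.legal_actions()))
        total += ret
    return total / n_rollouts

def choose_action(emulator):
    """Pick the action whose simulated futures score highest."""
    return max(emulator.legal_actions(), key=lambda a: rollout_value(emulator, a))
```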
4.
Reinforcement Learning for Active Length Control and Hysteresis Characterization of Shape Memory Alloys. Kirkpatrick, Kenton C. 16 January 2010.
Shape Memory Alloy actuators can be used for morphing, or shape change, by controlling their temperature, which is effectively done by applying a voltage difference across their length. Control of these actuators requires determination of the relationship between voltage and strain so that an input-output map can be developed. In this research, a computer simulation uses a hyperbolic tangent curve to simulate the hysteresis behavior of a virtual Shape Memory Alloy wire in temperature-strain space, and uses a Reinforcement Learning algorithm called Sarsa to learn a near-optimal control policy and map the hysteretic region. The algorithm developed in simulation is then applied to an experimental apparatus where a Shape Memory Alloy wire is characterized in temperature-strain space. The algorithm is then modified so that the learning is done in voltage-strain space, which allows for the learning of a control policy that provides a direct input-output mapping of voltage to position for a real wire.

This research was successful in achieving its objectives. In the simulation phase, the Reinforcement Learning algorithm proved capable of controlling a virtual Shape Memory Alloy wire by determining an accurate input-output map of temperature to strain. The virtual model was also shown to be accurate for characterizing Shape Memory Alloy hysteresis, validated by comparison to the commonly used modified Preisach model. The validated algorithm was successfully applied to an experimental apparatus, in which both major and minor hysteresis loops were learned in temperature-strain space. Finally, the modified algorithm was able to learn the control policy in voltage-strain space, achieving all learned goal states within a tolerance of ±0.5% strain, or ±0.65 mm. This policy provides the capability of reaching any learned goal from any initial strain state. This research has validated that Reinforcement Learning is capable of determining a control policy for Shape Memory Alloy crystal phase transformations, and opens the door for research into the development of length-controllable Shape Memory Alloy actuators.
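As a rough sketch of the simulation phase, a virtual wire whose strain follows a hyperbolic tangent curve of temperature, with separate heating and cooling branches to produce a hysteresis loop, might look like the following; all constants are illustrative, not the thesis's calibrated values.

```python
import numpy as np

class VirtualSMAWire:
    """Toy virtual SMA wire: strain follows a tanh curve of temperature, with
    different transition centers on heating and cooling to mimic hysteresis.
    All constants are assumed, not the thesis's calibration."""

    def __init__(self, t_heat=70.0, t_cool=50.0, width=8.0, max_strain=0.04):
        self.t_heat, self.t_cool = t_heat, t_cool   # branch centers (deg C)
        self.width, self.max_strain = width, max_strain
        self.temp, self.heating = 25.0, True

    def strain(self):
        center = self.t_heat if self.heating else self.t_cool
        # The wire contracts (strain falls) as temperature rises through the transition.
        return self.max_strain * 0.5 * (1.0 - np.tanh((self.temp - center) / self.width))

    def step(self, d_temp):
        """Apply a heating (+) or cooling (-) increment and return the new strain."""
        if d_temp != 0:
            self.heating = d_temp > 0
        self.temp += d_temp
        return self.strain()
```

A Sarsa agent would then treat discretized temperature-strain (or voltage-strain) coordinates as states, heating and cooling increments as actions, and a reward tied to the distance from the goal strain.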
5.
Game-independent AI agents for playing Atari 2600 console games. Naddaf, Yavar. Unknown Date.
No description available.
6.
Generating adaptive companion behaviors using reinforcement learning in games. Sharifi, AmirAli. Unknown Date.
No description available.
7.
Generating adaptive companion behaviors using reinforcement learning in games. Sharifi, AmirAli. 11 1900.
Non-Player Character (NPC) behaviors in today's computer games are mostly generated from manually written scripts. The high cost of manually creating complex behaviors for each NPC, so that it responds intelligently to every situation in the game, results in NPCs with repetitive and artificial-looking behaviors. The goal of this research is to enable NPCs in computer games to exhibit natural and human-like behaviors in non-combat situations. The quality of these behaviors affects the game experience, especially in story-based games, which rely heavily on player-NPC interactions. Reinforcement Learning has been used in this research, in BioWare Corp.'s Neverwinter Nights, to learn natural-looking behaviors for companion NPCs. The proposed method enables NPCs to rapidly learn reasonable behaviors and adapt to changes in the game environment. This research also provides a learning architecture that divides NPC behavior into sub-behaviors and sub-tasks called decision domains.
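The abstract leaves the decision-domain architecture unspecified, but its general shape, a separate learner per sub-behavior so that updates in one domain do not interfere with another, can be sketched as follows; the domain names, actions, and update rule are invented for illustration.

```python
import random
from collections import defaultdict

class DecisionDomain:
    """One sub-behavior (e.g. 'follow', 'converse') with its own Q-table,
    so learning in one domain does not disturb another."""

    def __init__(self, actions):
        self.actions, self.q = actions, defaultdict(float)

    def choose(self, state, eps=0.1):
        if random.random() < eps:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s2, a2, alpha=0.2, gamma=0.9):
        # Ordinary on-policy (Sarsa-style) update within this domain only.
        self.q[(s, a)] += alpha * (r + gamma * self.q[(s2, a2)] - self.q[(s, a)])

# A companion NPC dispatches to whichever domain the game context activates.
domains = {"follow": DecisionDomain(["close", "medium", "far"]),
           "converse": DecisionDomain(["greet", "comment", "stay_quiet"])}
```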
8.
Element and Event-Based Test Suite Reduction for Android Test Suites Generated by Reinforcement Learning. Alenzi, Abdullah Sawdi M. 07 1900.
Automated test generation for Android apps with reinforcement learning algorithms often produces test suites with redundant coverage. We looked at minimizing test suites that have already been generated by state–action–reward–state–action (SARSA) algorithms. In this dissertation, we hypothesize that there is room for improvement by introducing novel hybrid approaches that combine SARSA-generated test suites with greedy reduction algorithms, following the principle of the Head-up Guidance System (HGS™) approach. In addition, we report an empirical study on Android test suites that reveals the value of these new hybrid methods. Our approaches focus on post-processing test suites by applying greedy reduction algorithms. To reduce Android test suites, we utilize several coverage criteria: an event-based criterion (EBC), an element-based criterion (ELBC), and combinatorial-based sequence criteria (CBSC) that follow the principle of combinatorial testing to generate sequences of events and elements. The proposed criteria effectively shrank the test suites generated by SARSA while maintaining high code coverage. These findings suggest that test suite reduction using these criteria is particularly well suited for SARSA-generated test suites of Android apps.
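The post-processing step described here is, in essence, a greedy set-cover reduction: repeatedly keep the test that covers the most still-uncovered requirements. Below is a criterion-agnostic sketch, under the assumption that each test is represented by the set of events, elements, or sequences it covers; the test and event names are invented.

```python
def greedy_reduce(suite):
    """suite: {test_name: set of covered requirements}. Returns a reduced list of
    tests preserving the union of coverage. The sets may hold events (EBC),
    elements (ELBC), or combinatorial event/element sequences (CBSC)."""
    uncovered = set().union(*suite.values())
    kept = []
    while uncovered:
        # Keep the test that covers the most remaining requirements.
        best = max(suite, key=lambda t: len(suite[t] & uncovered))
        if not suite[best] & uncovered:
            break
        kept.append(best)
        uncovered -= suite[best]
    return kept

# Example with an event-based criterion (event names are hypothetical):
tests = {"t1": {"tap_login", "swipe_menu"},
         "t2": {"tap_login"},
         "t3": {"swipe_menu", "rotate"}}
print(greedy_reduce(tests))   # ['t1', 't3'] -- t2 adds no new coverage
```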
9.
Applying Agent Modeling to Behaviour Patterns of Characters in Story-Based Games. Zhao, Richard. 11 1900.
Most story-based games today have manually scripted non-player characters (NPCs), and the scripts are usually simple and repetitive, since it is time-consuming for game developers to script each character individually. ScriptEase, a publicly available author-oriented developer tool, attempts to solve this problem by generating script code from high-level design patterns for BioWare Corp.'s role-playing game Neverwinter Nights. The ALeRT algorithm uses reinforcement learning (RL) to automatically generate NPC behaviours that change over time as the NPCs learn from the successes or failures of their own actions. This thesis provides a new learning mechanism that makes game agents capable of adapting their behaviours based on the actions of other agents. The new on-line RL algorithm, ALeRT-AM, which adds an agent-modelling mechanism, is applied in a series of combat experiments in Neverwinter Nights and integrated into ScriptEase to produce adaptive behaviour patterns for NPCs.
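ALeRT-AM itself is not detailed in the abstract, but the usual shape of adding agent modeling to on-line RL is to keep a running model of the other agent's action frequencies and fold its prediction into the learner's state. The following is a generic sketch of that idea, not the published algorithm.

```python
from collections import Counter, defaultdict

class OpponentModel:
    """Tracks how often the other agent takes each action in each situation."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, situation, action):
        self.counts[situation][action] += 1

    def predict(self, situation):
        """Most frequent action seen in this situation, or None if unseen."""
        c = self.counts[situation]
        return c.most_common(1)[0][0] if c else None

# The RL agent then conditions on the prediction, e.g.:
#   state = (own_features, model.predict(situation))
# and runs an ordinary on-policy (Sarsa-style) update over the augmented state.
```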
10.
Applying Agent Modeling to Behaviour Patterns of Characters in Story-Based Games. Zhao, Richard. Unknown Date.
No description available.