1

BUILDING AN ARTIFICIAL CEREBELLUM USING A SYSTEM OF DISTRIBUTED Q-LEARNING AGENTS

Soto Santibanez, Miguel Angel January 2010 (has links)
About 400 million years ago, sharks developed a separate co-processor in their brains that made them not only faster but also more precisely coordinated. This co-processor, nowadays called the cerebellum, allowed sharks to outperform their peers and survive as one of the fittest. For the last 40 years or so, researchers have been attempting to provide robots and other machines with this type of capability. This thesis discusses currently used methods to create artificial cerebellums and points out two main shortcomings: 1) framework usability issues and 2) building-block incompatibility issues. This research argues that the framework usability issues hinder the production of good-quality artificial cerebellums for a large number of applications, and that the building-block incompatibility issues make artificial cerebellums less efficient than they could be, given our current technology. To tackle the framework usability issues, this research proposes a new framework, which formalizes the task of creating artificial cerebellums and offers a list of simple steps to accomplish this task. To tackle the building-block incompatibility issues, this research proposes thinking of artificial cerebellums as a set of cooperating Q-learning agents, which utilize a new technique called Moving Prototypes to make better use of the available memory and computational resources. This work also describes a set of general guidelines that can be applied to accelerate the training of this type of system, and uses simulation to show examples of the performance improvements resulting from these guidelines. To illustrate the theory developed in this dissertation, a cerebellum is implemented for a real-life application, namely the control of a type of mining equipment called a front-end loader. Finally, this thesis proposes the creation of a development tool based on this formalization, arguing that such a tool would allow engineers, scientists, and technicians to quickly build customized cerebellums for a wide range of applications without needing to become experts in Artificial Intelligence, Neuroscience, or Machine Learning.
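The building block this abstract keeps returning to is the tabular Q-learning agent. As a point of reference only, here is a minimal sketch of such an agent; the class name, hyperparameters, and epsilon-greedy policy are illustrative assumptions, and the thesis's own contributions (cooperating agents, Moving Prototypes) are not reproduced here.

```python
import random
from collections import defaultdict

class QLearningAgent:
    """Minimal tabular Q-learning agent (illustrative sketch only).

    The thesis composes many cooperating agents and adds a memory-saving
    technique called Moving Prototypes; neither is reproduced here -- this
    is just the standard Q-learning update each agent would build on.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)      # (state, action) -> estimated value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        if random.random() < self.epsilon:            # explore
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])  # exploit

    def update(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```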
2

Self-adjusting reinforcement learning

Der, Ralf, Herrmann, Michael 10 December 2018 (has links)
We present a variant of the Q-learning algorithm with automatic control of the exploration rate by a competition scheme. The theoretical approach is accompanied by systematic simulations of a chaos control task. Finally, we give interpretations of the algorithm in the context of computational ecology and neural networks.
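The abstract does not spell out the competition scheme, so the following is only a hedged illustration of the general idea of self-adjusting exploration: Boltzmann (softmax) action selection whose temperature cools as competing action values separate. The adaptation rule and all names below are assumptions, not the authors' algorithm.

```python
import math
import random

def softmax_action(q_values, temperature):
    """Boltzmann exploration: higher temperature -> more uniform (exploratory) choices."""
    exps = [math.exp(q / temperature) for q in q_values]
    total = sum(exps)
    r, acc = random.random() * total, 0.0
    for action, e in enumerate(exps):
        acc += e
        if r <= acc:
            return action
    return len(exps) - 1

def adapted_temperature(q_values, t_min=0.05, scale=1.0):
    """Hypothetical self-adjustment: as the gap between competing action
    values grows, exploration cools; with near-equal values it stays hot."""
    gap = max(q_values) - min(q_values)
    return max(t_min, scale / (1.0 + gap))
```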
3

Learning medical triage by using a reinforcement learning approach

Sundqvist, Niklas January 2022 (has links)
Many emergency departments today suffer from overcrowding of people seeking care. The first stage of seeking care is medical triage, in which a doctor or nurse prioritises patients according to their symptoms. This is a cumbersome process that is a candidate for automation. This master thesis investigates the possibility of using reinforcement learning to perform medical triage of patients. A deep Q-learning approach is taken to design the agent, together with two extensions: double Q-learning and a duelling network architecture. The agent is trained in two different environments. In the first environment, the agent's goal is to ask a patient questions and then decide, once enough information has been collected, how the patient should be prioritised. In the second environment, the agent decides which questions should be asked, and a separate classifier then uses the gathered information to make the actual triage decision. The training and testing of the agent in the two environments reveal difficulties in exploring the environment efficiently and thoroughly. It was also shown that defining a reward function that guides the agent into asking valuable questions, and learning a stopping condition for asking questions, is a complicated task. Suitable future work is discussed that would, in combination with the work performed in this thesis, create a better reinforcement learning model that could potentially show more promising results on the task of performing medical triage of patients.
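For reference, the double Q-learning extension named in the abstract has a standard target computation: the online network selects the next action and the target network evaluates it. The sketch below, with assumed function names, shows that computation only; a duelling architecture would live inside the networks themselves and leaves this step unchanged.

```python
import numpy as np

def double_q_targets(rewards, next_states, q_online, q_target, gamma=0.99, dones=None):
    """Double Q-learning target (sketch): decoupling action selection
    (online net) from action evaluation (target net) curbs the upward
    bias of the max operator in vanilla deep Q-learning.

    q_online and q_target are callables mapping a batch of states to
    per-action value arrays of shape (batch, n_actions).
    """
    q_next_online = q_online(next_states)             # used only to pick actions
    best_actions = np.argmax(q_next_online, axis=1)
    q_next_target = q_target(next_states)             # used to value those actions
    chosen = q_next_target[np.arange(len(best_actions)), best_actions]
    if dones is not None:
        chosen = np.where(dones, 0.0, chosen)         # no bootstrap at episode end
    return rewards + gamma * chosen
```

With a duelling head the networks would decompose Q(s, a) = V(s) + A(s, a) - mean_a A(s, a), but the target above is computed the same way.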
4

Cooperative and intelligent control of multi-robot systems using machine learning

Wang, Ying 05 1900 (has links)
This thesis investigates cooperative and intelligent control of autonomous multi-robot systems in a dynamic, unstructured and unknown environment and makes significant original contributions with regard to self-deterministic learning for robot cooperation, evolutionary optimization of robotic actions, improvement of system robustness, vision-based object tracking, and real-time performance. A distributed multi-robot architecture is developed which will facilitate operation of a cooperative multi-robot system in a dynamic and unknown environment in a self-improving, robust, and real-time manner. It is a fully distributed and hierarchical architecture with three levels. By combining several popular AI, soft computing, and control techniques such as learning, planning, reactive paradigm, optimization, and hybrid control, the developed architecture is expected to facilitate effective autonomous operation of cooperative multi-robot systems in a dynamically changing, unknown, and unstructured environment. A machine learning technique is incorporated into the developed multi-robot system for self-deterministic and self-improving cooperation and coping with uncertainties in the environment. A modified Q-learning algorithm termed Sequential Q-learning with Kalman Filtering (SQKF) is developed in the thesis, which can provide fast multi-robot learning. By arranging the robots to learn according to a predefined sequence, modeling the effect of the actions of other robots in the work environment as Gaussian white noise and estimating this noise online with a Kalman filter, the SQKF algorithm seeks to solve several key problems in multi-robot learning. As a part of low-level sensing and control in the proposed multi-robot architecture, a fast computer vision algorithm for color-blob tracking is developed to track multiple moving objects in the environment. By removing the brightness and saturation information in an image and filtering unrelated information based on statistical features and domain knowledge, the algorithm solves the problems of uneven illumination in the environment and improves real-time performance. In order to validate the developed approaches, a Java-based simulation system and a physical multi-robot experimental system are developed to successfully transport an object of interest to a goal location in a dynamic and unknown environment with complex obstacle distribution. The developed approaches in this thesis are implemented in the prototype system and rigorously tested and validated through computer simulation and experimentation.
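The abstract describes the idea of SQKF without its equations, so the sketch below shows only the generic scalar Kalman-filter step such a scheme could use to track the mean of the Gaussian disturbance other robots introduce. The class name, parameters, and the commented-out way the estimate feeds back into the Q-update are assumptions, not the thesis's equations.

```python
class ScalarKalmanFilter:
    """Tracks the mean of a noisy scalar signal -- here, a stand-in for the
    disturbance other robots' actions add to a learner's observed reward.
    This is a generic filter, not the SQKF formulation from the thesis."""

    def __init__(self, q_process=1e-3, r_measure=1e-1):
        self.x = 0.0          # estimated disturbance mean
        self.p = 1.0          # estimate variance
        self.q = q_process    # process-noise variance
        self.r = r_measure    # measurement-noise variance

    def update(self, measurement):
        self.p += self.q                       # predict: uncertainty grows
        k = self.p / (self.p + self.r)         # Kalman gain
        self.x += k * (measurement - self.x)   # correct toward the measurement
        self.p *= (1.0 - k)
        return self.x

# Hypothetical use inside a sequential Q-update: subtract the estimated
# disturbance so each robot learns from a "denoised" reward, e.g.
#   denoised_reward = reward - kf.update(reward - expected_reward)
```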
5

Learning, Evolution, and Bayesian Estimation in Games and Dynamic Choice Models

Monte Calvo, Alexander 29 September 2014 (has links)
This dissertation explores the modeling and estimation of learning in strategic and individual choice settings. While learning has been extensively used in economics, I introduce the concept into standard models in unorthodox ways. In each case, changing the perspective of what learning is drastically changes standard models. Estimation proceeds using advanced Bayesian techniques which perform very well in simulated data. The first chapter proposes a framework called Experienced-Based Ability (EBA) in which players increase the payoffs of a particular strategy in the future through using the strategy today. This framework is then introduced into a model of differentiated duopoly in which firms can utilize price or quantity contracts, and I explore how the resulting equilibrium is affected by changes in model parameters. The second chapter extends the EBA model into an evolutionary setting. This new model offers a simple and intuitive way to theoretically explain complicated dynamics. Moreover, this chapter demonstrates how to estimate posterior distributions of the model's parameters using a particle filter and Metropolis-Hastings algorithm, a technique that can also be used in estimating standard evolutionary models. This allows researchers to recover estimates of unobserved fitness and skill across time while only observing population share data. The third chapter investigates individual learning in a dynamic discrete choice setting. This chapter relaxes the assumption that individuals base decisions off an optimal policy and investigates the importance of policy learning. Q-learning is proposed as a model of individual choice when optimal policies are unknown, and I demonstrate how it can be used in the estimation of dynamic discrete choice (DDC) models. Using Bayesian Markov chain Monte Carlo techniques on simulated data, I show that the Q-learning model performs well at recovering true parameter values and thus functions as an alternative structural DDC model for researchers who want to move away from the rationality assumption. In addition, the simulated data are used to illustrate possible issues with standard structural estimation if the rationality assumption is incorrect. Lastly, using marginal likelihood analysis, I demonstrate that the Q-learning model can be used to test for the significance of learning effects if this is a concern.
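The third chapter pairs a Q-learning choice model with Bayesian MCMC estimation. As a minimal sketch under assumed functional forms (softmax choice with sensitivity beta, value updates with learning rate alpha), the likelihood a Metropolis-Hastings sampler would evaluate might look like this; the names and forms are illustrative, not the chapter's specification.

```python
import numpy as np

def q_learning_log_likelihood(choices, rewards, n_actions, alpha, beta):
    """Log-likelihood of observed choices under a Q-learning choice model:
    values update as Q[a] += alpha * (reward - Q[a]) and choices follow a
    softmax with sensitivity beta. A Metropolis-Hastings sampler would
    evaluate this (plus priors) when deciding whether to accept a
    proposed (alpha, beta) pair."""
    q = np.zeros(n_actions)
    ll = 0.0
    for a, r in zip(choices, rewards):
        logits = beta * q
        m = logits.max()                                     # for numerical stability
        log_probs = logits - (m + np.log(np.sum(np.exp(logits - m))))
        ll += log_probs[a]                                   # credit the observed choice
        q[a] += alpha * (r - q[a])                           # learn from the realized reward
    return ll
```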
6

Training reinforcement learning model with custom OpenAI gym for IIoT scenario

Norman, Pontus January 2022 (has links)
This study consists of an experiment to see, as a proof of concept, how well it would work to implement an industrial gym environment for training a reinforcement learning model. To determine this, the reinforcement learning model is trained repeatedly and tested; if the model completes the training scenario, that training iteration counts as a success. The time it takes to train for a certain number of game episodes is measured, as are the number of episodes the model needs to achieve an acceptable outcome of 80% of the maximum score and the time it takes to train those episodes. These measurements are evaluated, and conclusions are drawn on how well the reinforcement learning models worked. The tools used are a Q-learning algorithm implemented from scratch and deep Q-learning with TensorFlow. The manually implemented Q-learning algorithm showed varying results depending on environment design and how long the agent was trained, with success rates ranging from 100% to 0%. The times it took to train the agent to an acceptable level were 0.116, 0.571, and 3.502 seconds depending on which environment was tested (see the results chapter for more information on the environments). The TensorFlow implementation gave either a 100% or a 0% success rate; since I believe the polarizing results were caused by an issue with the implementation, I chose not to take measurements for more than one environment. And since the model never reached a stable outcome above 80%, no training time to an acceptable level was measured for this implementation.
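The thesis's custom OpenAI Gym environment is not detailed in the abstract, so the skeleton below only illustrates the standard gym.Env interface (classic pre-0.26 API, which matches the thesis's timeframe) that such an environment must implement; the spaces, dynamics, and reward are invented placeholders.

```python
import gym
from gym import spaces
import numpy as np

class IIoTEnv(gym.Env):
    """Skeleton of a custom Gym environment. The actual IIoT scenario in
    the thesis is not described in the abstract; the observation/action
    spaces, dynamics, and reward here are placeholder assumptions."""

    def __init__(self):
        super().__init__()
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Discrete(3)
        self._state = None

    def reset(self):
        self._state = self.observation_space.sample()
        return self._state

    def step(self, action):
        # Placeholder dynamics: random drift; reward 1 while the (made-up)
        # monitored variable stays in band, else 0.
        drift = np.random.uniform(-0.05, 0.05, size=4)
        self._state = np.clip(self._state + drift, 0.0, 1.0).astype(np.float32)
        reward = 1.0 if 0.4 <= self._state[0] <= 0.6 else 0.0
        done = False
        return self._state, reward, done, {}
```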
7

Q-Learning: Ett sätt att lära agenter att spela fotboll / Q-Learning: A way to teach agents to play football

Ekelund, Kalle January 2013 (has links)
The artificial intelligence in games often relies on rule-based techniques for its behaviour. This has made artificial agents predictable, which is particularly evident in sports games. This work evaluated whether the learning technique Q-learning plays football better than a rule-based technique, the state machine. To evaluate this, a simplified football simulation was created in which each team used one of the two techniques. The teams then played 100 matches against each other to determine which team, and thus which technique, was best. Statistics from the matches were used as the study's results, and they show that Q-learning is the better technique, as it wins the most matches and creates the most chances during play. The closing discussion considers how useful Q-learning is in a game context.
