481.
An intelligent fault diagnosis framework for the Smart Grid using neuro-fuzzy reinforcement learning
Esgandarnejad, Babak, 30 September 2020
Accurate and timely fault diagnosis is essential for reliable and secure power grid operation and maintenance. The emergence of big data has made it possible to incorporate vast amounts of information to create custom fault datasets and improve the diagnostic capabilities of existing frameworks. Intelligent systems have successfully exploited big data to improve diagnostic performance, applying computational intelligence and machine learning to fault datasets. Among these systems are fuzzy inference systems, which can handle the ambiguities and uncertainties of varied input data such as climate data, making them a good choice for extracting knowledge from energy big data. In this thesis, qualitative climate information is used to construct a fault dataset. A fuzzy inference system is designed whose parameters are optimized by a single-layer artificial neural network. The resulting fault diagnosis framework maps the relationship between fault variables in the dataset and fault types in real time, improving the accuracy and cost efficiency of the framework.
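The abstract gives no implementation details, but the core neuro-fuzzy idea it describes (a fuzzy inference system whose rule parameters are tuned by a simple single-layer neural update) can be sketched roughly as follows. The Gaussian membership functions, the x² training target, and all names here are illustrative assumptions, not the thesis's actual design:

```python
import numpy as np

def gaussian_mf(x, c, s):
    """Gaussian membership degree of input x for a fuzzy set (center c, spread s)."""
    return np.exp(-0.5 * ((x - c) / s) ** 2)

class TinyNeuroFuzzy:
    """Zeroth-order Sugeno-style fuzzy system: one input, n rules, rule consequents
    tuned by a delta rule (the role the thesis assigns to a single-layer ANN)."""
    def __init__(self, centers, spread=0.3, lr=0.05):
        self.c = np.asarray(centers, dtype=float)
        self.s = spread
        self.w = np.zeros_like(self.c)   # consequent of each rule
        self.lr = lr

    def forward(self, x):
        mu = gaussian_mf(x, self.c, self.s)   # rule firing strengths
        self.phi = mu / mu.sum()              # normalized strengths
        return float(self.phi @ self.w)

    def update(self, x, target):
        err = target - self.forward(x)
        self.w += self.lr * err * self.phi    # delta rule on consequents
        return err

# Fit y = x^2 on [0, 2] as a stand-in "fault variable -> fault score" mapping.
model = TinyNeuroFuzzy(centers=np.linspace(0, 2, 9))
xs = np.linspace(0, 2, 50)
for _ in range(500):
    for x in xs:
        model.update(x, x * x)
```

After training, `model.forward(1.0)` should sit close to 1.0; the same normalized-firing-strength structure extends to multiple inputs and rule antecedents.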
482.
Model-Free Optimized Tracking Control Heuristic
Wang, Ning, 02 September 2020
Tracking control algorithms often target the convergence of a tracking error. However, this can come at the expense of other important system characteristics, such as the control effort used to eliminate the tracking error, the transient response, or the steady-state behavior. Furthermore, most tracking control methods assume prior knowledge of the system dynamics, which is not always a realistic assumption, especially for highly complex systems.
In this thesis, a model-free optimized tracking control architectural heuristic is proposed. The suggested feedback system is composed of two control loops. The first is the tracking loop, which focuses on the convergence of the tracking error. It is implemented using two different model-free control algorithms for comparison purposes: Reinforcement Learning (RL) and the Nonlinear Threshold Accepting (NLTA) technique. The RL scheme reformulates the tracking error dynamics as a Markov Decision Process (MDP) and applies Q-Learning to build the best tracking control policy for the dynamic system under consideration. The NLTA algorithm, by contrast, is applied to tune the gains of a PID controller. The second control loop is a nonlinear state feedback loop, implemented with a feedforward artificial neural network (ANN), that optimizes a system-wide cost function flexible enough to encompass a set of desired design requirements on the targeted system behavior, such as the target overshoot, settling time, and rise time. The proposed architectural heuristic provides a model-free framework for such control problems, in the sense that the plant's dynamic model need not be known in advance. However, at least a subset of the stability region of the optimized gains must be known in advance to provide a search space for the optimization algorithms. Simulation results on two dynamic systems demonstrate the superiority of the proposed control scheme.
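The RL branch described above (tracking error discretized into an MDP, Q-Learning building the tracking policy) can be illustrated with a hedged toy sketch; the integrator plant, bin sizes, and hyperparameters below are invented for illustration and are not the thesis's system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Discretize the tracking error into bins: the "MDP reformulation" step.
bins = np.linspace(-1.0, 1.0, 21)
actions = np.array([-0.1, 0.0, 0.1])          # control increments
Q = np.zeros((len(bins) + 1, len(actions)))   # tabular Q over (error bin, action)

def to_state(err):
    return int(np.digitize(err, bins))

alpha, gamma, eps, target = 0.1, 0.9, 0.1, 0.5

for _ in range(2000):                          # training episodes
    x = rng.uniform(-1.0, 1.0)                 # plant state
    s = to_state(target - x)
    for _ in range(50):
        a = rng.integers(len(actions)) if rng.random() < eps else int(Q[s].argmax())
        x = x + actions[a]                     # toy integrator plant: x' = x + u
        err = target - x
        s2 = to_state(err)
        r = -abs(err)                          # reward: drive the error to zero
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

# Greedy rollout: the learned policy should hold x near the setpoint.
x = 0.0
for _ in range(100):
    x += actions[int(Q[to_state(target - x)].argmax())]
```

The final error stays within about one discretization bin of zero; in the thesis's framing, richer "tracking error combinations" would simply enlarge the state space of this table.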
483.
Selectively decentralized reinforcement learning
Nguyen, Thanh Minh, 05 1900
Indiana University-Purdue University Indianapolis (IUPUI)
The main contributions of this thesis are a selectively decentralized method for solving multi-agent reinforcement learning problems and a discretized Markov decision process (MDP) algorithm for computing sub-optimal learning policies in completely unknown learning and control problems. These contributions address several challenges in multi-agent reinforcement learning: the unknown and dynamic nature of the learning environment, the difficulty of computing closed-form solutions, slow learning in large-scale systems, and the questions of how, when, and with whom the learning agents should communicate. The selectively decentralized method, which evaluates all possible communication strategies, not only increases learning speed and achieves better learning goals but also learns a communication policy for each agent. Compared to other state-of-the-art approaches, these contributions offer two advantages. First, the selectively decentralized method can incorporate a wide range of well-known single-agent reinforcement learning algorithms, including the discretized MDP, whereas state-of-the-art approaches are usually applicable to only one class of algorithms. Second, the discretized MDP algorithm can compute a sub-optimal learning policy when the environment is described in a general nonlinear form, whereas other approaches often assume the environment takes a restricted form, particularly a feedback-linearizable one. The thesis also discusses several alternative approaches to multi-agent learning, including multidisciplinary optimization, and shows how the selectively decentralized method solves several real-world problems, particularly in mechanical and biological systems.
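The discretized-MDP recipe named above (snap a general nonlinear system onto a grid, then compute a sub-optimal policy by dynamic programming) can be sketched minimally as follows; the toy scalar dynamics, grid resolution, and cost are illustrative assumptions, not the thesis's algorithm:

```python
import numpy as np

# Toy 1-D nonlinear system x' = x + 0.05*sin(x) + 0.2*u, discretized onto a grid.
grid = np.linspace(-2.0, 2.0, 41)
actions = np.array([-1.0, 0.0, 1.0])
gamma = 0.95

def step(x, u):
    return float(np.clip(x + 0.05 * np.sin(x) + 0.2 * u, grid[0], grid[-1]))

def nearest(x):
    return int(np.abs(grid - x).argmin())

# Build the discretized MDP: deterministic transitions snapped to grid cells.
T = np.array([[nearest(step(x, u)) for u in actions] for x in grid])
R = -np.abs(grid)[:, None] * np.ones((1, len(actions)))   # cost: drive state to 0

# Value iteration on the discretized model yields a sub-optimal policy.
V = np.zeros(len(grid))
for _ in range(500):
    V = (R + gamma * V[T]).max(axis=1)
policy = (R + gamma * V[T]).argmax(axis=1)

# Roll out the greedy policy on the *continuous* system.
x = 1.5
for _ in range(200):
    x = step(x, actions[policy[nearest(x)]])
```

The rollout settles into a small neighborhood of the origin; the residual offset is exactly the sub-optimality introduced by the discretization, which is the trade-off the thesis analyzes.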
484.
Developmental Learning of Control in Modular Robotic Systems with Increasing Complexification
Deshpande, Aditya Milind, January 2021
No description available.
485.
Reinforcement Learning enabled hummingbird-like extreme maneuvers of a dual-motor at-scale flapping wing robot
Fan Fei, 31 January 2022
Insects and hummingbirds exhibit extraordinary flight capabilities and can simultaneously master seemingly conflicting goals: stable hovering and aggressive maneuvering, unmatched by small-scale man-made vehicles. Given a sudden looming visual stimulus at hover, a hummingbird initiates a fast backward translation coupled with a 180-degree yaw turn, followed by instant posture stabilization, in just under 10 wingbeats. At a wingbeat frequency of 40 Hz, this aggressive maneuver is accomplished in just 0.2 seconds. Flapping Wing Micro Air Vehicles (FWMAVs) hold great promise for closing this performance gap, given their agility. However, the design and control of such systems remain challenging due to various constraints.

First, the design, optimization, and system integration of a high-performance, at-scale, biologically inspired tail-less hummingbird robot is presented. Designing such an FWMAV is challenging under size, weight, power, and actuation constraints. It is even more challenging to design a vehicle with independently controlled wings driven by a total of only two actuators that still achieves animal-like flight performance. A detailed systematic design solution is presented, including system modeling and analysis of the wing-actuation system, body dynamics, and control and sensing requirements. Optimization is conducted to find the optimal system parameters, and a hummingbird robot is built and validated experimentally.

An open-source, high-fidelity dynamic simulation for FWMAVs is developed to serve as a testbed for onboard sensing and flight control algorithms, as well as for the design and optimization of FWMAVs. For validation, the hummingbird robot was recreated in simulation, and system identification was performed to obtain its dynamics parameters. The force generation and the open-loop and closed-loop dynamic responses of simulated and experimental flights were compared and validated. The unsteady aerodynamics and highly nonlinear flight dynamics present challenging control problems for both conventional and learning control algorithms such as Reinforcement Learning.

For robust transient and steady-state flight performance, a robust adaptive controller is developed to achieve stable hovering and fast maneuvering. The model-based nonlinear controller can stabilize the system and adapt to system parameter changes, such as wear and tear or thermal effects on the actuators, and to strong disturbances such as the ground effect. The controller is tuned in simulation and verified experimentally by hovering, fast point-to-point traversal, and tracking of a rapid figure-of-eight trajectory. The experimental results demonstrate state-of-the-art FWMAV performance in stationary hovering and fast trajectory tracking, with minimal transient and steady-state error.

To achieve animal-level maneuvering performance, especially the hummingbird's near-maximal performance during rapid escape maneuvers, a hybrid flight control strategy for aggressive maneuvers is developed. The proposed hybrid control policy combines model-based nonlinear control with model-free reinforcement learning. The model-based nonlinear controller stabilizes the system's closed-loop dynamics under disturbance and parameter variation. With the system stabilized, a model-free reinforcement learning policy trained in simulation can be optimized to achieve the desired fast movement by temporarily "destabilizing" the system during flight. Two test cases demonstrate the effectiveness of the hybrid control method: 1) a rapid escape maneuver observed in real hummingbirds, and 2) a drift-free fast 360-degree body flip. Direct simulation-to-real transfer is achieved, demonstrating hummingbird-like fast evasive maneuvers on the at-scale hummingbird robot.
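The hybrid structure described above (a model-based stabilizer plus a model-free term optimized in simulation) can be caricatured in a few lines. The double-integrator plant, PD gains, and random-search optimizer below are stand-in assumptions for illustration, not the authors' controller or training method:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, steps=150, dt=0.02):
    """Double-integrator 'vehicle' executing a setpoint flip from 0 to 1.
    Control = stabilizing PD term + learned linear residual (the hybrid idea)."""
    x, v, cost = 0.0, 0.0, 0.0
    for _ in range(steps):
        e = 1.0 - x
        u_pd = 8.0 * e - 4.0 * v            # model-based stabilizer
        u_rl = theta[0] * e + theta[1] * v  # model-free learned residual
        a = np.clip(u_pd + u_rl, -20, 20)   # actuator saturation
        v += a * dt
        x += v * dt
        cost += e * e                        # tracking cost to minimize
    return cost

# Model-free optimization of the residual in simulation: simple random search.
best_theta, best_cost = np.zeros(2), simulate(np.zeros(2))
for _ in range(300):
    cand = best_theta + rng.normal(0.0, 1.0, size=2)
    c = simulate(cand)
    if c < best_cost:
        best_theta, best_cost = cand, c
```

Because the PD term keeps every candidate rollout bounded, the search is free to make the maneuver faster, which mirrors the paper's point that the stabilizer lets the learned policy safely "push" the closed-loop dynamics.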
486.
Inteligentní řídící člen aktivního magnetického ložiska / Intelligent Controller of an Active Magnetic Bearing
Turek, Milan, January 2011
This PhD thesis describes the control design of an active magnetic bearing. An active magnetic bearing is a nonlinear, unstable system, so classic control design methods for linear time-invariant systems cannot be applied directly, while nonlinear control design methods are not universal and are not easy to apply. The thesis describes a simple nonlinear compensation that linearizes the response of the active magnetic bearing and allows classic linear time-invariant control design methods to be used. It is shown that the CARLA method can significantly improve the parameters of the designed controller. The first part of the thesis derives a model of the controlled active magnetic bearing and the nonlinear compensation that linearizes its response to the input signal. The following part describes state-space control design methods, selected robust control design methods, and the most common artificial intelligence methods used for control design and implementation. The next part describes the hardware of the experimental device and its parameters; it also covers the experimental identification of the electromagnetic force model, because its parameters are not available from the manufacturer. The last part presents the control design for the active magnetic bearing using several approaches, ranging from a purely experimental approach, through the Ziegler-Nichols method and state-space control design, to robust control design methods. The CARLA method is used heavily throughout the design, as its principle makes it well suited to online learning in a real controller.
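The CARLA method referenced above (Continuous Action Reinforcement Learning Automata, commonly used for online controller gain tuning) maintains a probability density over a continuous parameter and reinforces the neighborhood of samples that perform well. The sketch below is a simplified single-gain version; the toy first-order plant, bump width, and reward rule are assumptions, and the update is a simplification of the published CARLA scheme, not the thesis's implementation:

```python
import numpy as np

rng = np.random.default_rng(2)

# Discretized probability density over one controller gain (the "automaton").
grid = np.linspace(0.1, 10.0, 200)       # candidate gain values
pdf = np.ones_like(grid) / len(grid)

def cost(k):
    """Performance of proportional gain k on a toy plant x' = -x + k*(1 - x)."""
    x, J, dt = 0.0, 0.0, 0.05
    for _ in range(100):
        x += (-x + k * (1.0 - x)) * dt   # explicit Euler step
        J += (1.0 - x) ** 2              # integrated squared tracking error
    return J

J_ref = cost(grid[len(grid) // 2])       # running reference performance
for _ in range(400):
    k = rng.choice(grid, p=pdf)          # sample a gain from the density
    J = cost(k)
    beta = max(0.0, (J_ref - J) / abs(J_ref))       # reward: better than reference
    J_ref = 0.95 * J_ref + 0.05 * J                 # track recent performance
    bump = np.exp(-0.5 * ((grid - k) / 0.5) ** 2)   # Gaussian neighborhood
    pdf = pdf + 0.3 * beta * bump                   # reinforce around good samples
    pdf = pdf / pdf.sum()                           # renormalize

k_best = grid[pdf.argmax()]
```

Since the toy cost decreases monotonically with the gain here, the density should end up concentrated near the upper end of the range; with a real plant the same loop discovers a non-trivial optimum online.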
487.
Využití opakovaně posilovaného učení pro řízení čtyřnohého robotu / Using Reinforcement Learning for Four-Legged Robot Control
Ondroušek, Vít, January 2011
This PhD thesis focuses on using reinforcement learning for four-legged robot control. The main aim is to create an adaptive control system for a walking robot that can plan its walking gait with the Q-learning algorithm. This is achieved with a complex three-layered architecture based on the DEDS paradigm. A small set of elementary reactive behaviors forms the basis of the proposed solution, and a set of composite control laws is designed by activating these behaviors simultaneously. Both types of controllers can operate on flat as well as rugged terrain. A model of all the behaviors achievable by activating these controllers is built using an appropriate discretization of the continuous state space, and the Q-learning algorithm uses this model to find optimal robot control strategies. The capabilities of the control unit are demonstrated on three complex tasks: rotating the robot, walking in a straight line, and walking on an inclined plane. These tasks are solved in spatial dynamic simulations of a four-legged robot with three degrees of freedom per leg. The resulting walking gaits are evaluated with standardized quantitative indicators. Video files showing the elementary and composite controllers in action, as well as the resulting walking gaits, are an integral part of this thesis.
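The core architectural idea above (composite actions formed by simultaneously activating elementary behaviors, with Q-learning choosing among them over a discretized state space) can be sketched on a toy gridworld; the behavior set, rewards, and world are illustrative assumptions, not the thesis's robot model:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)

# Elementary "behaviors" displace a point robot; composite actions activate
# subsets of them simultaneously, echoing the thesis's composite control laws.
behaviors = {"fwd": (1, 0), "left": (0, -1), "right": (0, 1)}
composites = [frozenset(c) for r in (1, 2) for c in combinations(behaviors, r)]

def apply(pos, active):
    """Sum the displacements of all active behaviors, clipped to a 5x5 world."""
    dr = sum(behaviors[b][0] for b in active)
    dc = sum(behaviors[b][1] for b in active)
    return (min(max(pos[0] + dr, 0), 4), min(max(pos[1] + dc, 0), 4))

goal = (4, 4)
Q = np.zeros((5, 5, len(composites)))   # Q over (discretized state, composite)
for _ in range(2000):                    # training episodes
    pos = (0, 0)
    for _ in range(30):
        a = rng.integers(len(composites)) if rng.random() < 0.2 else int(Q[pos].argmax())
        nxt = apply(pos, composites[a])
        r = 1.0 if nxt == goal else -0.05          # step cost favors short gaits
        Q[pos][a] += 0.2 * (r + 0.9 * Q[nxt].max() - Q[pos][a])
        pos = nxt
        if pos == goal:
            break

# Greedy rollout: the learned policy should reach the goal in few steps.
pos, steps = (0, 0), 0
while pos != goal and steps < 10:
    pos = apply(pos, composites[int(Q[pos].argmax())])
    steps += 1
```

The learned policy prefers the two-behavior composite (forward plus right), which reaches the goal diagonally, illustrating why simultaneous activation enlarges the action set beyond the elementary behaviors alone.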
488.
Policy Hyperparameter Exploration for Behavioral Learning of Smartphone Robots / スマートフォンロボットの行動学習のための方策ハイパーパラメータ探索法
Wang, Jiexin, 23 March 2017
Kyoto University / Doctor of Informatics / Degree No. 甲第20519号 / Examining committee: Prof. Shin Ishii, Prof. Toshiharu Sugie, Prof. Toshiyuki Ohtsuka, Kenji Doya
489.
Neural Approaches for Syntactic and Semantic Analysis / 構文・意味解析に対するニューラルネットワークを利用した手法
Kurita, Shuhei, 25 March 2019
Kyoto University / Doctor of Informatics / Degree No. 甲第21911号 / Examining committee: Prof. Sadao Kurohashi, Prof. Hisashi Kashima, Assoc. Prof. Daisuke Kawahara
490.
Abusive and Hate Speech Tweets Detection with Text Generation
Nalamothu, Abhishek, 06 September 2019
No description available.