Reinforcement Learning of Repetitive Tasks for Autonomous Heavy-Duty Vehicles

Lindesvik Warma, Simon January 2020 (has links)
Many industrial applications of heavy-duty autonomous vehicles include repetitive manoeuvres, such as vehicle parking and hub-to-hub transportation. This thesis explores the possibility of using information from previous executions of specific manoeuvres, via reinforcement learning, to improve performance in future iterations. The manoeuvres are a straight-line path and a path of constant curvature. A proportional-integral control strategy is designed to control the vehicle, and the controller is updated between iterations using a policy gradient method. A rejection sampling procedure is introduced to enforce the stability of the control system. This is necessary since the general reinforcement learning framework and the policy gradient framework do not consider stability. The performance of the rejection sampling procedure is improved using ideas from simulated annealing. The performance improvement of the vehicle is evaluated through simulations. Linear and nonlinear vehicle models are evaluated on a straight-line path and a path of constant curvature. The simulations show that the vehicle improves its ability to track the reference path for all evaluation models and scenarios. Finally, the simulations also show that the controlled system is kept stable throughout the learning process. / Autonomous vehicles are an important piece of the puzzle in future transport solutions and industrial environments, in terms of both climate and safety. Many of the manoeuvres that industrial vehicles perform are repetitive, for example parking. This work explores the possibility of learning from previous attempts at a manoeuvre in order to improve the vehicle's ability to perform it. A proportional-integral control structure is used to steer the vehicle. The control structure is a state feedback in which the controller consists of two proportional-integral controllers. The control system is initialised as stable and the vehicle is allowed to perform one iteration of the manoeuvre. The controller is updated between iterations of the manoeuvre using reinforcement learning. Reinforcement learning here means using the information from previous attempts at the manoeuvre to improve the vehicle's ability to follow the reference path. The reinforcement learning thus provides instructions on how the controller should be updated, based on how the vehicle performed during the previous iteration. A sampling procedure is implemented to ensure the stability of the control system, since the reinforcement learning does not take this into account. The sampling procedure is also intended to minimise the negative effects on the learning process. The algorithm is analysed by simulating the vehicle with both linear and nonlinear evaluation models in two scenarios: a straight path and a path of constant curvature. The simulations show that the vehicle improves its ability to follow the reference paths for all evaluation models of the vehicle. The simulations also show that the control system is kept stable during the learning process.
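
The stability-preserving sampling step described in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis's algorithm: the scalar plant (a, b), the Jury stability test, the Gaussian sampling, the shrink factor and all function names are hypothetical, and the policy-gradient update itself is omitted. Candidate PI gains are drawn around the current gains, unstable candidates are rejected, and the sampling width is narrowed after each rejection in the spirit of simulated annealing.

```python
import random


def closed_loop_is_stable(kp, ki, a=0.9, b=0.5):
    """Jury test for a hypothetical scalar plant x[k+1] = a*x[k] + b*u[k].

    PI feedback (assumed): u[k] = -kp*x[k] - ki*z[k], z[k+1] = z[k] + x[k].
    Closed-loop matrix: [[a - b*kp, -b*ki], [1, 1]].
    """
    trace = (a - b * kp) + 1.0
    det = (a - b * kp) + b * ki
    # z^2 - trace*z + det has both roots inside the unit circle iff:
    return abs(det) < 1.0 and abs(trace) < 1.0 + det


def sample_stable_gains(mean, sigma, rng, max_tries=100, shrink=0.9):
    """Rejection sampling: draw gains near `mean`, keep only stable ones."""
    kp_mean, ki_mean = mean
    for _ in range(max_tries):
        kp = rng.gauss(kp_mean, sigma)
        ki = rng.gauss(ki_mean, sigma)
        if closed_loop_is_stable(kp, ki):
            return kp, ki
        sigma *= shrink  # annealing-style: narrow the search after a rejection
    return mean  # fall back to the current (stable) gains


rng = random.Random(0)
gains = (0.5, 0.1)  # assumed stable initial PI gains
for iteration in range(5):
    # A real learner would first shift the mean along a policy-gradient estimate.
    gains = sample_stable_gains(gains, sigma=0.3, rng=rng)
    print(iteration, gains, closed_loop_is_stable(*gains))
```

Because every accepted sample passes the stability test and the fallback is the previous stable gain pair, the loop never leaves the stable region in this toy setting.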

Maneuver-Based Motion Control of a Miniature Helicopter

Rogers, Christopher Michael 30 December 2010 (has links)
This thesis deals with the control of a highly maneuverable miniature helicopter about trajectories, generated online, from a library of prespecified maneuvers. Linearizing the nonlinear equations describing the helicopter dynamics about the prespecified, library maneuvers results in a hybrid linear time-varying (LTV) model. Two control approaches are used to design controllers corresponding to each library maneuver: the standard L2-induced norm approach and an approach which also uses the L2-induced norm as a performance measure while accounting for uncertain initial states. Each control approach is evaluated in closed-loop simulation with a nonlinear helicopter model. The controllers are set to drive the helicopter model to track desired trajectories in the presence of disturbances such as wind gusts, turbulence, sensor noise, and uncertain initial conditions. For the specific plant formulations and trajectories presented, performance is comparable for both control approaches; however, it is possible to improve controller performance by exploiting some of the features of the approach accounting for uncertain initial states. These improvements in performance are topics for future work along with implementation of the presented approaches and results on a remote control helicopter. / Master of Science


Trivedi, Chintan 27 May 2011 (has links)
No description available.

Nonlinear MPC for Motion Control and Thruster Allocation of Ships

Bärlund, Alexander January 2019 (has links)
Critical automated maneuvers for ships typically require a redundant set of thrusters. The motion control system hierarchy is commonly separated into several layers using a high-level motion controller and a thruster allocation (TA) algorithm. This allows for a modular design of the software where the high-level controller can be designed without comprehensive information on the thrusters, while detailed issues such as input saturation and rate limits are handled by the TA. However, for certain thruster configurations this decoupling may result in poor control performance, because the high-level controller has limited knowledge of the physical limitations of the ship and the behavior of the TA. This thesis investigates different approaches to improving the control performance, using nonlinear Model Predictive Control (MPC) as a foundation for the developed motion controllers because it provides an optimized solution and can satisfy constraints. First, a decoupled system is implemented and results are provided for two simple motion tasks, showing problems related to the decoupling. Thereafter, two different approaches are taken to remedy the observed drawbacks. A nonlinear MPC controller is developed that combines the motion controller and thruster allocation, resulting in a more robust control system. Then, in order to keep the control system modularized, possible ways to augment the decoupled system so as to achieve performance similar to that of the combined system are investigated. One proposed solution is a nonlinear MPC controller with time-varying constraints accounting for the current limitations of the thruster system. However, this did not always improve the control performance, since the behavior of the TA is still unknown to the MPC controller.
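
The decoupling issue can be illustrated with a minimal Python sketch. Everything in it is assumed for illustration: the hypothetical layout of two parallel thrusters at lateral offsets of plus and minus arm metres, the force limit f_max, and the function names. It is not the ship model or the allocation algorithm from the thesis. The high-level controller requests a surge force and a yaw moment, the allocator inverts the thruster geometry and then clips to the force limit, and the delivered forces differ from the request without the controller knowing.

```python
def allocate(surge, yaw, arm=2.0, f_max=60.0):
    """Map a requested (surge force, yaw moment) to two thruster forces.

    Hypothetical layout: thruster 1 at lateral offset +arm, thruster 2 at -arm,
    so that surge = f1 + f2 and yaw = arm * (f1 - f2).
    """
    f1 = surge / 2.0 + yaw / (2.0 * arm)
    f2 = surge / 2.0 - yaw / (2.0 * arm)

    def clip(f):
        # Saturation is handled here, invisibly to the high-level controller.
        return max(-f_max, min(f_max, f))

    return clip(f1), clip(f2)


def delivered(f1, f2, arm=2.0):
    """Generalized forces actually produced by the clipped thruster forces."""
    return f1 + f2, arm * (f1 - f2)


requested = (100.0, 80.0)
forces = allocate(*requested)
achieved = delivered(*forces)
print("requested:", requested, "thrusters:", forces, "achieved:", achieved)
```

With these assumed numbers, thruster 1 saturates at 60 and the request of (100, 80) is delivered as (90, 60). A combined formulation would instead carry the limit f_max as a constraint inside the optimization, so the controller plans only what the thrusters can deliver.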

Unconstrained Motion And Constrained Force And Motion Control Of Robots With Flexible Links

Kilicaslan, Sinan 01 February 2005 (has links) (PDF)
New control methods are developed for the unconstrained motion and constrained force and motion control of flexible robots. The dynamic equations of the flexible robots are partitioned as pseudostatic equilibrium equations and deviations from them. The pseudostatic equilibrium considered here is defined as a hypothetical state where the tip point variables have their desired values while the modal variables are instantaneously constant. Then, the control torques for the pseudostatic equilibrium and for the stabilization of the deviation equations are formed in terms of tip point coordinates, modal variables and contact force components. The performances of the proposed methods are illustrated on a planar two-link robot and on a spatial three-link robot. Unmodeled dynamics and measurement noises are also taken into consideration. Performance of the proposed motion control method is compared with the computed torque method.

Distributed control of electric drives via Ethernet

Samaranayake, Lilantha January 2003 (has links)
This report presents work aimed at distributed control of electric drives through a network communication medium with temporal constraints, i.e., Ethernet. A general analysis of time-delayed systems is carried out, using state-space representation of systems in the discrete-time domain. The effect of input time delays is identified and is used in the subsequent controller designs. The main hardware application in this study is a brushless DC servo motor, whose speed control loop is closed via a 10 Mbps switched Ethernet network. The speed control loop, which is approximately a decade slower than the current control loop, is opened and interfaced to the network at the sensor/actuator node. It is closed at the speed controller end at another node in the same local area network (LAN), forming a distributed control system (DCS).
The classical Proportional Integral (PI) controller design technique is used, with ample changes in parameter tuning to suit time-delayed systems. Then the standard Smith Predictor is tested, modified with the algebraic design technique Coefficient Diagram Method (CDM), which increases the system's degrees of freedom. Constant control delay is assumed in the latter designs despite the slightly stochastic nature of the observed timing data. Hence the poor transient performance of the system is the price for the robustness built into the speed controllers at the design stage. The controllability and observability of the DCS may be lost, depending on the range over which the control delay varies. However, a state feedback controller that uses on-line delay data, obtained by synchronizing the system clocks of the sensor node and controller node, results in an effective compensation scheme for the network-induced delays. Hence the full state feedback controller, designed by pole placement, makes the transient performance of the distributed system acceptable for servo applications.
Further, speed synchronizing controllers have been designed such that a speed fluctuation caused by a mechanical load torque disturbance on one motor is followed effectively by any other specified motor in the distributed control network with a minimum tracking or synchronizing error. This type of performance is often demanded in industrial applications such as printing, paper, bagging, pick and place, and material cutting.
Keywords: Brushless DC Motor, Control Delay, Distributed Motion Control Systems, Proportional Integral Controller, Smith Predictor, Speed Synchronization, State Feedback Controller, Stochastic Systems, Switched-Ethernet, Synchronizing Error, Time Delayed Systems, Tracking Error
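
As a hedged illustration of the kind of discrete-time state-space treatment of input delay that this abstract mentions (not necessarily the report's own formulation), assume a plant x_{k+1} = A x_k + B u_{k-1} whose input is delayed by exactly one sample period. The symbols A, B, x and u are assumptions introduced here. Augmenting the state with the previous input removes the explicit delay:

```latex
\begin{bmatrix} x_{k+1} \\ u_k \end{bmatrix}
=
\begin{bmatrix} A & B \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} x_k \\ u_{k-1} \end{bmatrix}
+
\begin{bmatrix} 0 \\ I \end{bmatrix} u_k
```

The augmented matrix is block upper triangular, so its eigenvalues are those of A plus one eigenvalue at zero per delayed input channel. A state feedback law can then be designed on the augmented state, for example by pole placement.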

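Idrottsskador vid löpning, vilken betydelse har löparskons egenskaper? : En Litteraturstudie / Sports injuries in running, what is the significance of the running shoe's properties? : A literature study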

Gesar, Fredrik January 2017 (has links)
Running is one of the most widespread physical activities in the world. It is estimated that 37-56 % of all runners at some point suffer an injury in connection with running. The aim of the study is to investigate the effect of different cushioning materials, drop, and motion control in running shoes on injury frequency in connection with pronation and supination during running. The study was carried out as a literature review that included 11 scientific articles. The results show that motion-control shoes are recommended for pronating runners and neutral shoes for supinating or neutral runners. Reduced drop leads to reduced injury risk. Forefoot running is preferable to heel-to-toe running. A soft sole is better for short distances and a harder sole for long distances. EVA material showed a better recovery effect than TPU.

Arquitetura de controle de movimento para um robô móvel sobre rodas visando otimização energética. / Motion control architecture for a wheeled mobile robot to energy optimization.

Serralheiro, Werther Alexandre de Oliveira 05 March 2018 (has links)
This work presents a motion control architecture for moving a differential-drive wheeled mobile robot between two distinct poses in a structured, obstacle-free environment. The classic concept of efficiency was used to define the control strategies: a robot moves efficiently when it accomplishes the given task in the shortest time and using the least amount of energy. The proposed architecture is a subset of the Nested Hierarchical Controller (NHC) model, composed of three levels of abstraction: (i) Path Planning, (ii) Trajectory Planning and (iii) Trajectory Tracking. The proposed Path Planning smooths a Dubins geodesic (the most efficient path) with a Clamped Spline so that the path is defined by a twice-differentiable curve. A transformation of the robot configuration space is performed. The Trajectory Planning is a convex optimization problem in the form of Second Order Cone Programming, whose objective is a weighted function of time and energy. As the travel time and the total energy consumed by the robot have a hyperbolic relation, an algorithm for tuning the weighting coefficient between these quantities is proposed. Finally, a dual-loop Trajectory Tracker based on input-output feedback linearization and PID control is proposed, which obtained satisfactory results in tracking the path with the robot.
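
To see why the weighting between time and energy needs tuning, consider a deliberately idealized model, assumed here for illustration and not taken from the thesis: let the consumed energy E and the travel time T satisfy the hyperbolic relation E T = c for a constant c > 0, and let w in (0, 1) be the weight placed on time. Minimizing the weighted objective then gives a closed-form optimum:

```latex
J(T) = w\,T + (1-w)\,\frac{c}{T}, \qquad
\frac{dJ}{dT} = w - (1-w)\,\frac{c}{T^{2}} = 0
\;\Longrightarrow\;
T^{\ast} = \sqrt{\frac{(1-w)\,c}{w}}, \quad
E^{\ast} = \frac{c}{T^{\ast}} = \sqrt{\frac{w\,c}{1-w}}
```

In this toy model the optimum moves nonlinearly with w: as w approaches 1 the optimal time shrinks toward zero while the energy grows without bound, so small changes in w near the extremes shift the trade-off sharply.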

Contribution à la commande des robots parallèles à câbles à redondance d'actionnement / Contribution to the control of redundantly actuated cable-driven parallel robots

Lamaury, Johann 08 October 2013 (has links)
Cable-driven parallel robots (CDPR) are particularly well suited to applications such as handling heavy payloads over large workspaces. However, in order to fully control all the degrees of freedom of the mobile platform and to obtain large workspace-to-footprint ratios, redundant actuation may be required, which implies the determination of feasible cable tension distributions. In this thesis, for CDPR with two degrees of actuation redundancy, real-time compatible algorithms capable of efficiently calculating various continuous tension distributions are introduced. Furthermore, control schemes suited to CDPR, which integrate the tension distribution algorithm, are proposed in order to increase the CDPR tracking performance. First, a dual-space feedforward control scheme is introduced to compensate for the dynamics of the platform and winches. In order to deal with parametric variations and uncertainties in the models, an adaptive dual-space motion control scheme for CDPR is finally presented. Experimental results validate the real-time efficiency of the proposed tension distribution algorithm and control schemes, as well as their stability along the tracked trajectory.

Evaluation of motion compensated ADV measurements for quantifying velocity fluctuations

Unknown Date (has links)
This study assesses the viability of using a towfish-mounted ADV for quantifying water velocity fluctuations in the Florida Current relevant to ocean current turbine performance. For this study a motion-compensated ADV is operated in a test flume. Water velocity fluctuations are generated by a 1.3 cm pipe suspended in front of the ADV at relative current speeds of 0.9 m/s and 0.15 m/s, giving Reynolds numbers on the order of 1000. ADV pitching motion of ±2.5° at 0.3 Hz and a heave motion of 0.3 m amplitude at 0.2 Hz are utilized to evaluate the motion compensation approach. The results show that correction for motion provides up to an order of magnitude reduction in turbulent kinetic energy at the frequencies of motion, while the IMU is found to generate 2% error at 1/30 Hz and 9% error at 1/60 Hz in turbulence intensity. / by James William Lovenbury. / Thesis (M.S.C.S.)--Florida Atlantic University, 2013. / Includes bibliography.

