521 |
Étude des solutions du transfert orbital avec une poussée faible dans le problème des deux et trois corps / Study of the solutions of low-thrust orbital transfers in the two and three body problem
Henninger, Helen Clare 07 October 2015
La technique de moyennation est un moyen efficace pour simplifier les transferts optimaux pour un satellite à faible poussée dans un problème à deux corps contrôlé. Cette thèse est une étude analytique et numérique du transferts orbital à poussée faible en temps optimal qui généralise l'application de la moyennation du problème à deux corps à des transferts dans le problème à deux corps perturbés et aux transfert d'une orbite proche de la Terre au point de Lagrange L1, dans le cadre du problème à quatre corps bi-circulaire où l’effet perturbatif de la Lune et du Soleil est modélisé. Dans le transfert à faible poussée à deux corps, nous comparons le cas du temps minimal et de l'énergie. Nous déterminons que le domaine elliptique pour les transferts orbitaux temps-minimal est géodésiquement convexe pour un transfert coplanaire et vers une orbite circulaire, contrairement au cas de l’énergie. Nous examinons ensuite l’effet la perturbation lunaire, nous montrons que dans ce cas le Hamiltonien moyenné se trouve être celui associé à un problème de navigation de Zermelo. Nous étudions numériquement à l’aide du code Hampath, les points conjugués pour caractériser l’optimalité globale des trajectoires. Enfin, nous construisons et réalisons numériquement un transfert d'une orbite terrestre au point de Lagrange L1, qui utilise la moyennation sur un arc (proche de la Terre) pour simplifier les calculs numériques. Dans ce dernier résultat nous voyons qu'un transfert concaténant une trajectoire moyennée avec une trajectoire temps minimal au voisinage du point de Lagrange est en effet proche d’un transfert de temps optimal calculé avec une méthode numérique de tir. / The technique of averaging is an effective way to simplify optimal low-thrust satellite transfers in a controlled two-body Kepler problem. 
This study takes the form of both an analytical and numerical investigation of low-thrust time-optimal transfers, extending the application of averaging from the two-body problem to transfers in the perturbed low-thrust two body problem and a low-thrust transfer from Earth orbit to the L1 Lagrange point in the bicircular four-body setting. In the low-thrust two-body transfer, we compare the time-minimal case with the energy-minimal case, and determine that the elliptic domain under time-minimal orbital transfers (reduced in some sense) is geodesically convex. We then consider the Lunar perturbation of an energy-minimal low-thrust satellite transfer, finding a representation of the optimal Hamiltonian that relates the problem to a Zermelo navigation problem and making a numerical study of the conjugate points. Finally, we construct and implement numerically a transfer from an Earth orbit to the L1 Lagrange point, using averaging on one (near-Earth) arc in order to simplify analytic and numerical computations. In this last result we see that such a `time-optimal' transfer is indeed comparable to a true time-optimal transfer (without averaging) in these coordinates.
|
522 |
Mathematical modelling of the HIV/AIDS epidemic and the effect of public health education
Vyambwera, Sibaliwe Maku January 2014
Magister Scientiae - MSc / HIV/AIDS is nowadays considered the greatest public health disaster of modern times. Its progression has challenged the global population for decades. Through mathematical modelling, researchers have studied different interventions against the HIV pandemic, such as treatment, education and condom use. Our research focuses on different compartmental models, with emphasis on the effect of public health education. Statistically, it is well known that public health educational programs contribute to reducing the spread of the HIV/AIDS epidemic. Many models have been studied towards understanding the dynamics of the HIV/AIDS epidemic, and the impact of ARV treatment has been observed and analysed by many researchers. Our research investigates a compartmental model of HIV with treatment and an education campaign. We study the existence of equilibrium points and their stability. The original contributions of this dissertation are modifications of the model of Cai et al. [1], which enable us to use optimal control theory to identify an optimal roll-out of strategies to control HIV/AIDS. Furthermore, we introduce randomness into the model and study the almost sure exponential stability of the disease-free equilibrium. The randomness is regarded as environmental perturbations in the system. Another contribution is the global stability analysis of the model of Nyabadza et al. [3]. The stability thresholds are compared with those for HIV/AIDS in the absence of any intervention, to assess the possible community benefit of public health educational campaigns. We illustrate the results by way of simulation.
The following papers form the basis of much of the content of this dissertation:
[1] L. Cai, Xuezhi Li, Mini Ghosh, Boazhu Guo. Stability analysis of an HIV/AIDS epidemic model with treatment, 229 (2009) 313-323.
[2] C.P. Bhunu, S. Mushayabasa, H. Kojouharov, J.M. Tchuenche. Mathematical Analysis of an HIV/AIDS Model: Impact of Educational Programs and Abstinence in Sub-Saharan Africa. J Math Model Algor 10 (2011), 31-55.
[3] F. Nyabadza, C. Chiyaka, Z. Mukandavire, S.D. Hove-Musekwa. Analysis of an HIV/AIDS model with public-health information campaigns and individual withdrawal. Journal of Biological Systems, 18, 2 (2010) 357-375.
Through this dissertation the author has contributed to two manuscripts, [4] and [5], which are currently under review for publication in journals:
[4] G. Abiodun, S. Maku Vyambwera, N. Marcus, K. Okosun, P. Witbooi. Control and sensitivity of an HIV model with public health education (under submission).
[5] P. Witbooi, M. Nsuami, S. Maku Vyambwera. Stability of a stochastic model of HIV population dynamics (under submission).
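To make the idea of a stability threshold under an education campaign concrete, here is a minimal sketch of a generic two-compartment (susceptible/infected) model. It is not the model of Cai et al. or Nyabadza et al.; the equations, the parameter names (`beta`, `efficacy`, `mu`, `delta`) and all numerical values are assumptions chosen purely for illustration.

```python
def basic_reproduction_number(beta, efficacy, mu, delta):
    """Threshold R0 of the toy model: education scales transmission by (1 - efficacy)."""
    return beta * (1.0 - efficacy) / (mu + delta)


def simulate(beta, efficacy, mu, delta, s0, i0, years, dt=0.1):
    """Forward-Euler integration of the hypothetical toy model
        S' = Lambda - beta*(1 - efficacy)*S*I/N - mu*S
        I' = beta*(1 - efficacy)*S*I/N - (mu + delta)*I
    with constant recruitment Lambda = mu*(s0 + i0)."""
    s, i = float(s0), float(i0)
    recruitment = mu * (s0 + i0)
    for _ in range(int(round(years / dt))):
        n = s + i
        new_infections = beta * (1.0 - efficacy) * s * i / n
        ds = recruitment - new_infections - mu * s
        di = new_infections - (mu + delta) * i
        s += dt * ds
        i += dt * di
    return s, i


# Hypothetical parameter values (per year).
r0_without = basic_reproduction_number(0.3, 0.0, 0.02, 0.1)
r0_with = basic_reproduction_number(0.3, 0.8, 0.02, 0.1)
_, i_without = simulate(0.3, 0.0, 0.02, 0.1, s0=990, i0=10, years=200)
_, i_with = simulate(0.3, 0.8, 0.02, 0.1, s0=990, i0=10, years=200)
```

In this toy setting the infection persists when the threshold exceeds one and dies out when education pushes it below one, which is the kind of comparison the abstract describes.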
|
523 |
A model predictive control strategy for load shifting in a water pumping scheme with maximum demand charges
Van Staden, Adam Jacobus 24 August 2010
The aim of this research is to affirm the application of closed-loop optimal control for load shifting in plants with electricity tariffs that include time-of-use (TOU) and maximum demand (MD) charges. The water pumping scheme of the Rietvlei water purification plant in the Tshwane municipality (South Africa) is selected for the case study. The objective is to define and simulate a closed-loop load shifting (scheduling) strategy for the Rietvlei plant that yields the maximum potential cost saving under both TOU and MD charges. The control problem is first formulated as a discrete-time linear open-loop optimal control model. Thereafter, the open-loop optimal control model is converted into a closed-loop optimal control model using a model predictive control technique. Both the open-loop and closed-loop optimal control models are then simulated and compared with the current (simulated) level-based control model. The optimal control models are solved with integer programming optimization. The open-loop optimal control model is also solved with linear programming optimization and the result is used as an optimal benchmark for comparisons. Various scenarios with different simulation timeouts, switching intervals, control horizons, model uncertainty and model disturbances are simulated and compared. The effect of MD charges is also evaluated by excluding the TOU and MD charges in turn. The results show a saving of 5.8% to 9% for the overall plant, depending on the simulated scenario. The portion of this saving that is due to a reduction in MD varies between 69% and 92%. The results also show that the closed-loop optimal control model matches the saving of the open-loop optimal control model, and that the closed-loop optimal control model compensates for model uncertainty and model disturbances whilst the open-loop optimal control model does not.
AFRIKAANS : Die doel van hierdie navorsing is om die applikasie van geslote-lus optimale beheer vir las verskuiwing in aanlegte met elektrisiteit tariewe wat tyd-van-gebruik (TVG) en maksimum aanvraag (MA) kostes insluit te bevestig. Die water pomp skema van die Rietvlei water reiniging aanleg in die Tshwane munisipaliteit (Suid-Afrika) is gekies vir die gevalle studie. Die objektief is om 'n geslote-lus las verskuiwing (skedulering) strategie vir die Rietvlei aanleg te definieer en te simuleer wat die maksimum potensiaal vir koste besparing onder beide TVG en MA kostes lewer. Die beheer probleem is eerstens gevormuleer as 'n diskreet tyd lineêre ope-lus optimale beheer model. Daarna is die ope-lus optimale beheer model aangepas na ‘n geslote-lus optimale beheer model met behulp van 'n model voorspellende beheer tegniek. Beide die ope- en geslote-lus optimale beheer modelle is dan gesimuleer en vergelyk met die huidige (gesimuleerde) vlak gebaseerde beheer model. Die optimisering van optimale beheer modelle is opgelos met geheeltallige programmering. Die optimisering van die ope-lus optimale beheer model is ook opgelos met lineêre programmering en die resultaat is gebruik as 'n optimale doelwit vir vergelykings. Verskeie scenarios met verskillende simulasie stop tye, skakel intervalle, beheer horisonne, model onsekerheid en model versteurings is gesimuleer en vergelyk. Die effek van MA kostes is ook geevalueer deur inter uitruiling van die TVG en MA kostes. Die resultate toon 'n besparing van 5. 8% tot 9% vir die algehele aanleg, afhangend van die gesimuleerde scenarios. Die deel van die besparing wat veroorsaak is deur 'n vermindering in MA wissel tussen 69% en 92%. 
Die resultate toon ook dat die geslote-lus optimale beheer model se besparing dieselfde is as die besparing van die ope-lus optimale beheer model, en dat die geslote-lus optimale beheer model kompenseer vir model onsekerheid en model versteurings, terwyl die ope-lus optimale beheer model nie kompenseer nie. Copyright / Dissertation (MEng)--University of Pretoria, 2010. / Electrical, Electronic and Computer Engineering / unrestricted
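As a rough illustration of how TOU and MD charges enter the cost of a pumping schedule, consider the sketch below. The tariff values, the half-hour interval and the function name `electricity_cost` are hypothetical and are not taken from the dissertation; the optimization itself (integer programming inside a receding horizon) is not shown.

```python
def electricity_cost(power_kw, tou_price, md_price, interval_h=0.5):
    """Total cost of a pumping schedule under TOU and MD charges.

    power_kw   : average power drawn in each interval (kW)
    tou_price  : energy price in each interval (currency per kWh)
    md_price   : charge per kW of the highest interval demand
    interval_h : length of one interval in hours
    """
    if len(power_kw) != len(tou_price):
        raise ValueError("power_kw and tou_price must have the same length")
    # TOU part: energy used in each interval times that interval's price.
    energy_cost = sum(p * c * interval_h for p, c in zip(power_kw, tou_price))
    # MD part: a single charge on the largest interval demand.
    demand_cost = md_price * max(power_kw)
    return energy_cost + demand_cost


# Hypothetical four-interval tariff: two cheap intervals, then two peak intervals.
prices = [0.5, 0.5, 2.0, 2.0]
peak_pumping = [0, 0, 200, 200]      # all pumping during peak hours
shifted = [200, 200, 0, 0]           # same energy moved to cheap hours
flat = [100, 100, 100, 100]          # same energy, lowest maximum demand

cost_peak = electricity_cost(peak_pumping, prices, md_price=1.0)
cost_shifted = electricity_cost(shifted, prices, md_price=1.0)
cost_flat = electricity_cost(flat, prices, md_price=1.0)
```

All three schedules use the same energy; they differ only in when it is drawn and in the maximum demand, which is why a scheduler has to trade the two charges off against each other.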
|
524 |
Gestion optimisée des flux énergétiques dans le véhicule électrique / Optimization of energy flows in an electric vehicle
Florescu, Adrian 19 November 2012
Ce travail a trait à la gestion des flux énergétiques électriques au sein du réseau embarqué d’un véhicule électrique. Les éléments constitutifs de la chaîne électrique ont été d’abord modélisés à des fins de commande et de simulation. Il est visé ici la minimisation du stress des batteries au plomb via une hybridation avec des supercondensateurs. Deux familles de lois de commande ont été conçues et développées, à savoir des lois de type « fréquentielles » et des lois optimales de type « Linéaires Quadratiques Gaussiennes ». Un banc de test temps réel hybride a été architecturé afin de tester ces lois. Ce banc de test a pour noyau deux simulateurs temps réel (RT-LAB et dSPACE). Une partie de la chaîne de puissance est soit émulée par des sources contrôlées ou réalisée via des maquettes à échelle réduite mais à facteur de similitude respecté. Les essais sur le banc de test ont permis d’obtenir des résultats satisfaisants et encourageants qui corroborent la théorie. / This work addresses the management of electrical energy flows within the embedded network of an electric vehicle. The electrical system components were first modeled for purposes of control synthesis and simulation. The aim is to minimize the stress on the lead-acid batteries via hybridization with ultracapacitors. Two families of control laws have been conceived and developed, namely a frequency-domain-based law and an optimal Linear Quadratic Gaussian law. A real-time hybrid test bench has been built in order to test these laws. The core of this test bench consists of two real-time simulators (RT-LAB and dSPACE). Part of the power chain is either emulated using suitably controlled sources or realized using small-scale real hardware that respects the similarity factor. Tests on the test bench have yielded satisfactory and encouraging results that corroborate the theory.
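The idea behind a frequency-based power split can be sketched as follows: a low-pass filter assigns the slowly varying part of the load current to the battery and leaves the fast transients to the ultracapacitor. This is only a generic illustration under assumed names (`split_current`, `alpha`); the abstract does not give the actual control laws, filter order or tuning.

```python
def split_current(load, alpha):
    """Split a sampled load current into a slow and a fast component.

    load  : sequence of load-current samples (A)
    alpha : smoothing factor in (0, 1] of a first-order discrete low-pass filter;
            smaller values give the battery a smoother reference.
    Returns (battery, ultracap) lists whose element-wise sum equals the load.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    battery, ultracap = [], []
    slow = float(load[0]) if load else 0.0
    for i_load in load:
        slow += alpha * (i_load - slow)    # low-pass filtered current
        battery.append(slow)               # slow part: battery reference
        ultracap.append(i_load - slow)     # remainder: ultracapacitor reference
    return battery, ultracap


# Hypothetical acceleration event: the load current steps from 0 A to 10 A.
load_profile = [0, 0, 10, 10, 10, 10]
battery_ref, ultracap_ref = split_current(load_profile, alpha=0.5)
```

With this split the battery current rises gradually after the step while the ultracapacitor covers the initial transient, which is the stress-reduction mechanism the abstract refers to.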
|
525 |
Optimal control problems for bioremediation of water resources / Problèmes de contrôle optimal pour la bioremédiation de ressources en eau
Riquelme, Victor 26 September 2016
Cette thèse se compose de deux parties. Dans la première partie, nous étudions les stratégies de temps minimum pour le traitement de la pollution dans de grandes ressources en eau, par exemple des lacs ou réservoirs naturels, à l'aide d'un bioréacteur continu qui fonctionne à un état quasi stationnaire. On contrôle le débit d'entrée d'eau au bioréacteur, dont la sortie revient à la ressource avec le même débit. Nous disposons de l'hypothèse d'homogénéité de la concentration de polluant dans la ressource en proposant trois modèles spatialement structurés. Le premier modèle considère deux zones connectées l'une à l'autre par diffusion et seulement une d'entre elles connectée au bioréacteur. Avec l'aide du Principe du Maximum de Pontryagin, nous montrons que le contrôle optimal en boucle fermée dépend seulement des mesures de pollution dans la zone traitée, sans influence des paramètres de volume, diffusion, ou la concentration dans la zone non traitée. Nous montrons que l'effet d'une pompe de recirculation qui aide à homogénéiser les deux zones est avantageux si opérée à vitesse maximale. Nous prouvons que la famille de fonctions de temps minimal en fonction du paramètre de diffusion est décroissante. Le deuxième modèle consiste en deux zones connectées l'une à l'autre par diffusion et les deux connectées au bioréacteur. Ceci est un problème dont l'ensemble des vitesses est non convexe, pour lequel il n'est pas possible de prouver directement l'existence des solutions. Nous surmontons cette difficulté et résolvons entièrement le problème étudié en appliquant le principe de Pontryagin au problème de contrôle relaxé associé, obtenant un contrôle en boucle fermée qui traite la zone la plus polluée jusqu'au l'homogénéisation des deux concentrations. Nous obtenons des limites explicites sur la fonction valeur via des techniques de Hamilton-Jacobi-Bellman. Nous prouvons que la fonction de temps minimal est non monotone par rapport au paramètre de diffusion. 
Le troisième modèle consiste en deux zones connectées au bioréacteur en série et une pompe de recirculation entre elles. L'ensemble des contrôles dépend de l'état, et nous montrons que la contrainte est active à partir d'un temps jusqu'à la fin du processus. Nous montrons que le contrôle optimal consiste à l'atteinte d'un temps à partir duquel il est optimal de recirculer à vitesse maximale et ensuite ré-polluer la deuxième zone avec la concentration de la première. Ce résultat est non intuitif. Des simulations numériques illustrent les résultats théoriques, et les stratégies optimales obtenues sont testées sur des modèles hydrodynamiques, en montrant qu'elles sont de bonnes approximations de la solution du problème inhomogène. La deuxième partie consiste au développement et l'étude d'un modèle stochastique de réacteur biologique séquentiel. Le modèle est obtenu comme une limite des processus de naissance et de mort. Nous établissons l'existence et l'unicité des solutions de l'équation contrôlée qui ne satisfait pas les hypothèses habituelles. Nous prouvons que pour n'importe quelle loi de contrôle la probabilité d'extinction de la biomasse est positive. Nous étudions le problème de la maximisation de la probabilité d'atteindre un niveau de pollution cible, avec le réacteur à sa capacité maximale, avant l'extinction. Ce problème ne satisfait aucune des suppositions habituelles (la dynamique n'est pas lipschitzienne, diffusion dégénérée localement hölderienne, contraintes d'état, ensembles cible et absorbant s'intersectent), donc le problème doit être étudié dans deux étapes: en premier lieu, nous prouvons la continuité de la fonction de coût non contrôlée pour les conditions initiales avec le volume maximal et ensuite nous développons un principe de programmation dynamique pour une modification du problème original comme un problème de contrôle optimal avec coût final sans contrainte sur l'état. / This thesis consists of two parts. 
In the first part we study minimal time strategies for the treatment of pollution in large water volumes, such as lakes or natural reservoirs, using a single continuous bioreactor that operates in a quasi-steady state. The control consists of feeding the bioreactor from the resource, with clean output returning to the resource with the same flow rate. We drop the hypothesis of homogeneity of the pollutant concentration in the water resource by proposing three spatially structured models. The first model considers two zones connected to each other by diffusion and only one of them treated by the bioreactor. With the help of the Pontryagin Maximum Principle, we show that the optimal state feedback depends only on the measurements of pollution in the treated zone, with no influence of volume, diffusion parameter, or pollutant concentration in the untreated zone. We show that the effect of a recirculation pump that helps to mix the two zones is beneficial if operated at full speed. We prove that the family of minimal time functions depending on the diffusion parameter is decreasing. The second model consists of two zones connected to each other by diffusion and each of them connected to the bioreactor. This is a problem with a non convex velocity set for which it is not possible to directly prove the existence of its solutions. We overcome this difficulty and fully solve the studied problem applying Pontryagin's principle to the associated problem with relaxed controls, obtaining a feedback control that treats the most polluted zone up to the homogenization of the two concentrations. We also obtain explicit bounds on its value function via Hamilton-Jacobi-Bellman techniques. We prove that the minimal time function is nonmonotone as a function of the diffusion parameter. The third model consists of a system of two zones connected to the bioreactor in series, and a recirculation pump between them. 
The control set depends on the state variable; we show that this constraint is active from some time up to the final time. We show that the optimal control consists of waiting until a time after which it is optimal to mix at maximum speed, and then repolluting the second zone with the concentration of the first zone. This is a non-intuitive result. Numerical simulations illustrate the theoretical results, and the optimal strategies obtained are tested on hydrodynamic models, where they prove to be good approximations of the solution of the inhomogeneous problem. The second part consists of the development and study of a stochastic model of a sequencing batch reactor. We obtain the model as a limit of birth and death processes. We establish the existence and uniqueness of solutions of the controlled equation, which does not satisfy the usual assumptions. We prove that with any control law the probability of extinction is positive, which is a non-classical result. We study the problem of maximizing the probability of attaining a target pollution level, with the reactor at maximum capacity, prior to extinction. This problem does not satisfy any of the usual assumptions (non-Lipschitz dynamics, degenerate locally Hölder diffusion parameter, restricted state space, intersecting reach and avoid sets), so the problem must be studied in two stages: first, we prove the continuity of the uncontrolled cost function for initial conditions with maximum volume, and then we develop a dynamic programming principle for a modification of the problem as an optimal control problem with final cost and without state constraint.
|
526 |
Controle ótimo: custos no controle de propagações populacionais / Optimal control: costs in controlling population spread
Ferreira, Eliza Maria 27 February 2015
Previous issue date: 2015-02-27 / CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / O objetivo deste trabalho é estudar algumas aplicações da teoria de controle ótimo para
problemas biológicos. Assim, apresentamos inicialmente o estudo de dois modelos diferentes:
“Optimal Control of Biological Invasions in Lake Network”, proposto por Potapov et al.
[13], e “Simulating Optimal Vaccination Times during Cholera Outbreaks” proposto por
Modnak et al. [9]. Os modelos têm suas dinâmicas baseadas em equações diferenciais
ordinárias e neles foi minimizado um funcional, com uma única e com várias restrições,
respectivamente. No primeiro modelo a teoria de controle ótimo é usada para minimizar os
custos com a prevenção juntamente com os custos gerados pelos danos da invasão biológica
em estudo, e no segundo modelo aplica-se o controle ótimo para minimizar os custos da
vacinação e tratamento dos indivíduos infectados durante um surto de cólera. Com base
nos modelos propostos por Vieira e Takahashi em “A Sobrevivência do Vírus varicela-zoster”,
[16], e por Shulgin et al. em “Pulse vaccination strategy in the SIR epidemic
model”, [14], propomos um modelo matemático que considera a vacinação da população
como uma estratégia de controle da varicela. Nós usamos a teoria de controle ótimo para
definir as condições necessárias para minimizar os custos da vacinação e tratamento dos
indivíduos infectados com catapora ou com herpes zoster. A dinâmica é baseada em
equações diferenciais ordinárias, que são as restrições sob as quais queremos minimizar o
funcional utilizando a teoria de controle ótimo. / The goal of this work is to study some applications of the theory of optimal control for
biological problems. We first present the study of two different models: “Optimal
Control of Biological Invasions in Lake Network”, proposed by Potapov et al. [13], and
“Simulating Optimal Vaccination Times during Cholera Outbreaks”, proposed by Modnak
et al. [9]. The dynamics of both models are based on ordinary differential equations, and
in them a functional is minimized subject to a single constraint and to several constraints,
respectively. The first model uses optimal control theory to minimize the cost of prevention
together with the cost of the damage caused by the biological invasion; the second model
applies optimal control to minimize the cost of vaccinating and treating infected individuals
during a cholera outbreak. Based on the models proposed by Vieira and Takahashi in “The
Survival of the Varicella-Zoster Virus” [16], and by Shulgin et al. in “Pulse vaccination
strategy in the SIR epidemic model” [14], we propose a mathematical model that considers
vaccination of the population as a varicella control strategy. We use optimal control theory
to derive necessary conditions for minimizing the cost of vaccinating and treating
individuals infected with chickenpox or herpes zoster. The dynamics are based on ordinary
differential equations, which are the constraints under which we minimize the functional
using optimal control theory.
|
527 |
Otimização de consumo de combustível em veículos usando um modelo simplificado de trânsito e sistemas com saltos markovianos / Optimization of fuel consumption in vehicles using a simplified traffic model and Markov jump system.
Diogo Henrique de Melo 25 November 2016
Esta dissertação aborda o problema de redução do consumo de combustível para veículos. Com esse objetivo, realiza-se o levantamento de um modelo estocástico e de seus parâmetros, o desenvolvimento de um controlador para o veículo, e análise dos resultados. O problema considera a interação com o trânsito de outros veículos, que limita a aplicação de resultados antes disponíveis. Para isto, propõe-se modelar a dinâmica do problema de maneira aproximada, usando sistemas com saltos markovianos, e levantar as probabilidades de transição dos estados da cadeia através de um modelo mais completo para o trânsito no percurso. / This dissertation deals with the control of vehicles with the aim of optimizing fuel consumption, taking into account the interference of traffic. Stochastic interference of this kind and other real-world phenomena prevent us from directly applying available results. We propose to employ a relatively simple system with Markov jumping parameters as a model for the vehicle subject to traffic interference, and to obtain the transition probabilities from a separate model for the traffic. This dissertation presents the model identification, the solution of the new problem using dynamic programming, and a simulation of the resulting control.
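The combination of a Markov chain for traffic and dynamic programming for the control can be sketched on a toy problem. Everything below is an assumption made for illustration (the two traffic modes, the speed set, the fuel costs, the transition matrix and the function name `plan`); the dissertation's actual vehicle model and cost are not given in the abstract.

```python
def plan(distance, transition, fuel, allowed, time_cost=1.0):
    """Dynamic programming for a toy Markov jump problem.

    distance   : number of road segments still to be covered
    transition : transition[m][k] = probability that traffic mode m is followed by mode k
    fuel       : fuel[u] = fuel used in one step when driving u segments per step
    allowed    : allowed[m] = speeds that traffic mode m permits
    time_cost  : penalty added for every step taken
    Returns (value, policy), both indexed as [remaining distance][mode].
    """
    n_modes = len(transition)
    value = [[0.0] * n_modes for _ in range(distance + 1)]
    policy = [[None] * n_modes for _ in range(distance + 1)]
    for d in range(1, distance + 1):
        for m in range(n_modes):
            best = None
            for u in allowed[m]:
                remaining = max(d - u, 0)
                # Expected cost-to-go, averaged over the next traffic mode.
                expected = sum(transition[m][k] * value[remaining][k]
                               for k in range(n_modes))
                total = fuel[u] + time_cost + expected
                if best is None or total < best:
                    best = total
                    policy[d][m] = u
            value[d][m] = best
    return value, policy


# Hypothetical data: mode 0 = free flow, mode 1 = congested (only slow driving possible).
P = [[0.8, 0.2], [0.4, 0.6]]
fuel_per_step = {1: 1.0, 2: 2.5}
speeds = [[1, 2], [1]]
V, pi = plan(2, P, fuel_per_step, speeds)
```

Because each step reduces the remaining distance, the table can be filled in a single sweep; the resulting policy tells the vehicle which speed to choose for each remaining distance and observed traffic mode.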
|
528 |
Averaging level control in the presence of frequent inlet flow upsets
Rosander, Peter January 2012
Buffer tanks are widely used within the process industry to prevent flow variations from being directly propagated throughout a plant. The capacity of the tank is used to smoothly transfer inlet flow upsets to the outlet. Ideally, the tank thus works as a low pass filter where the available tank capacity limits the achievable flow smoothing. For infrequently occurring upsets, where the system has time to reach steady state between flow changes, the averaging level control problem has been extensively studied. After an inlet flow change, flow filtering has traditionally been obtained by letting the tank level deviate from its nominal value while slowly adapting the outlet to cancel out the flow imbalance and eventually bringing back the level to its set-point. The system is then again in steady state and ready to surge the next upset. By ensuring that the single largest upset can be handled without violating the level constraints, satisfactory flow smoothing is obtained. In this thesis, the smoothing of frequently changing inlet flows is addressed. In this case, standard level controllers struggle to obtain acceptable flow smoothing since the system rarely is in steady state and flow upsets can thus not be treated as separate events. To obtain a control law that achieves optimal filtering while directly accounting for future upsets, the averaging level control problem was approached using robust model predictive control (MPC). The robust MPC differs in the way it obtains flow smoothing by not returning the tank level to a fixed set-point. Instead, it lets the steady state tank level depend on the current value of the inlet flow. This insight was then used to propose a linear control structure, designed to filter frequent upsets optimally. 
Analyses and simulation results indicate that the proposed linear and robust MPC controllers obtain flow smoothing comparable to the standard optimal averaging level controllers for infrequent upsets, while handling frequent upsets considerably better.
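The observation that the steady-state level may depend on the inlet flow can be illustrated with a proportional-only level controller, under which the tank behaves as a first-order low-pass filter. The sketch below is a generic textbook-style simulation, not the controller proposed in the thesis; the gain, tank area and step size are assumed values.

```python
def simulate_tank(inflow, gain, area=1.0, level0=0.0, dt=1.0):
    """Forward-Euler simulation of a buffer tank with a P-only level controller.

    All quantities are deviations from their nominal values.
    inflow : inlet-flow deviation at each time step
    gain   : controller gain, outflow deviation = gain * level deviation
    Returns (levels, outflows) as lists, one entry per time step.
    """
    level = level0
    levels, outflows = [], []
    for q_in in inflow:
        q_out = gain * level                   # proportional-only control law
        level += dt * (q_in - q_out) / area    # mass balance of the tank
        levels.append(level)
        outflows.append(q_out)
    return levels, outflows


# Hypothetical unit step in the inlet flow, held for 200 time steps.
step_upset = [1.0] * 200
levels, outflows = simulate_tank(step_upset, gain=0.1)
largest_outflow_move = max(abs(b - a) for a, b in zip(outflows, outflows[1:]))
```

Here the level settles at a new value proportional to the inlet flow rather than returning to a fixed set-point, and the outlet flow changes far more gradually than the inlet step.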
|
529 |
Multi-Hypothesis Motion Planning under Uncertainty Using Local Optimization
Hellander, Anja January 2020
Motion planning is defined as the problem of computing a feasible trajectory for an agent to follow. It is a well-studied problem with applications in fields such as robotics, control theory and artificial intelligence. In the last decade there has been an increased interest in algorithms for motion planning under uncertainty where the agent does not know the state of the environment due to, e.g. motion and sensing uncertainties. One approach is to generate an initial feasible trajectory using for example an algorithm such as RRT* and then improve that initial trajectory using local optimization. This thesis proposes a new modification of the RRT* algorithm that can be used to generate initial paths from which initial trajectories for the local optimization step can be generated. Unlike standard RRT*, the modified RRT* generates multiple paths at the same time, all belonging to different families of solutions (homotopy classes). Algorithms for motion planning under uncertainty that rely on local optimization of trajectories can use trajectories generated from these paths as initial solutions. The modified RRT* is implemented and its performance with respect to computation time and number of paths found is evaluated on simple scenarios. The evaluations show that the modified RRT* successfully computes solutions in multiple homotopy classes. Two methods for motion planning under uncertainty, Trajectory-optimized LQG (T-LQG), and a belief space variant of iterative LQG (iLQG) are implemented and combined with the modified RRT*. The performance with respect to cost function improvement, computation time and success rate when following the optimized trajectories for the two methods are evaluated in a simulation study. The results from the simulation studies show that it is advantageous to generate multiple initial trajectories. 
Some initial trajectories, for example those passing through narrow passages or through areas with high uncertainty, can only be slightly improved by trajectory optimization, or result in trajectories that are hard to follow or have a high collision risk. If multiple initial trajectories are generated, the probability is higher that at least one of them will result in an optimized trajectory that is easy to follow, with lower uncertainty and lower collision risk than the initial trajectory. The results also show that iLQG is much more computationally expensive than T-LQG, but that it is better at computing control policies to follow the optimized trajectories.
|
530 |
Data-Efficient Reinforcement Learning Control of Robotic Lower-Limb Prosthesis With Human in the Loop
January 2020
abstract: Robotic lower limb prostheses provide new opportunities to help transfemoral amputees regain mobility. However, their application is impeded by the need for prosthetists to manually tune and optimize the impedance control parameters for each individual user in different task environments. Reinforcement learning (RL) can learn automatically by interacting with the environment, which makes it a natural candidate to replace human prosthetists in customizing the control parameters. However, neither traditional RL approaches nor the popular deep RL approaches are readily suitable for learning from a limited number of samples or from samples with large variations. This dissertation aims to explore new RL-based adaptive solutions that are data-efficient for controlling robotic prostheses.
This dissertation begins by proposing a new flexible policy iteration (FPI) framework. To improve sample efficiency, FPI can utilize either an on-policy or an off-policy learning strategy, can learn from either online or offline data, and can even adopt existing knowledge of an external critic. Approximate convergence to Bellman optimal solutions is guaranteed under mild conditions. Simulation studies validated that FPI was data efficient compared to several established RL methods. Furthermore, a simplified version of FPI was implemented to learn from offline data, and then the learned policy was successfully tested for tuning the control parameters online on a human subject.
Next, the dissertation discusses RL control with information transfer (RL-IT), or knowledge-guided RL (KG-RL), which is motivated to benefit from transferring knowledge acquired from one subject to another. To explore its feasibility, knowledge was extracted from data measurements of able-bodied (AB) subjects, and transferred to guide Q-learning control for an amputee in OpenSim simulations. This result again demonstrated that data and time efficiency were improved using previous knowledge.
While the present study is new and promising, there are still many open questions to be addressed in future research. To account for human adaptation, the learning control objective function may be designed to incorporate human-prosthesis performance feedback such as symmetry, user comfort level and satisfaction, and user energy consumption. To make RL-based control parameter tuning practical in real life, it should be further developed and tested in different use environments, such as from level ground walking to stair ascending or descending, and from walking to running. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2020
|