1 |
Modélisation et simulation d'une station mono-opérateur pour le contrôle de drones et la planification de trajectoire / Modeling and simulation of a UAV ground control station for single-operator and path planning
Ajami, Alain 03 October 2013
The presented work is part of a larger project called SHARE, whose main objective is to design a universal, new-generation, single-operator ground control station for the monitoring and control of fixed-wing and rotary-wing UAVs (Unmanned Aerial Vehicles). The objective of this PhD thesis is to develop a generic simulator of the ground control station capable of simulating in real time different types of UAVs, onboard sensors (camera), the flight environment, and the various military missions defined by the STANAG 4586 standard. After modeling the different parts of the station, we present the architecture adopted for the simulator and the control module. The latter is divided into several hierarchical levels; the upper level contains the path planning algorithms for fixed-wing HALE (High Altitude, Long Endurance) UAVs. These algorithms compute an admissible path between a start point and an end point by minimizing a cost function. Finally, we developed a decision support system for online mission management that is capable of selecting objectives and of selecting the best path proposed by the path planning algorithms. This tool aims to help the station operator make the best decision by maximizing the rewards obtained when achieving the objectives and by minimizing certain criteria such as resource consumption, danger, and weather conditions.
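As an illustration of the path planning idea described in this abstract (computing an admissible path between two points while minimizing a cost function), the following minimal sketch runs Dijkstra's algorithm on a small grid. The grid, the cell costs, and the name `plan_path` are assumptions made for illustration only; the abstract does not say which algorithm the thesis uses, and a planner for a fixed-wing HALE UAV would also have to respect kinematic constraints that this sketch ignores.

```python
import heapq


def plan_path(cost, start, goal):
    """Dijkstra on a 4-connected grid.

    cost[r][c] is the price of entering cell (r, c); None marks a
    forbidden cell. Returns (path, total_cost), or (None, inf) if the
    goal cannot be reached.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}   # cheapest known cost to reach each cell
    parent = {}           # predecessor of each cell on its cheapest path
    queue = [(0.0, start)]
    while queue:
        dist, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if dist > best[cell]:
            continue  # stale queue entry, a cheaper route was found already
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                new = dist + cost[nr][nc]
                if new < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new
                    parent[(nr, nc)] = cell
                    heapq.heappush(queue, (new, (nr, nc)))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1], best[goal]


# Hypothetical cost map: 9 stands for a dangerous cell, None for a no-fly cell.
grid = [[1, 1, 1],
        [1, None, 9],
        [1, 1, 1]]
path, total = plan_path(grid, (0, 0), (2, 2))
```

Here the cell costs play the role of the criteria mentioned in the abstract (danger, resource consumption, weather); a real cost function would combine them explicitly.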
|
2 |
Multi-Agent Cooperative Control via a Unified Optimal Control Approach
Wang, Jianan 09 December 2011
Recent rapid advances in computing, communication, sensing, and actuation, together with miniaturization technologies, have offered unprecedented opportunities to employ large numbers of autonomous vehicles (air, ground, and water) working cooperatively to accomplish a mission. Cooperative control of such multi-agent dynamical systems has potential impact on numerous civilian, homeland security, and military applications. Compared with single-agent control problems, new theoretical and practical challenges emerge and need to be addressed in cooperative control of multi-agent dynamical systems, including but not limited to problem size, task coupling, limited computational resources at the individual agent level, communication constraints, and the need for real-time obstacle/collision avoidance. To address these challenges, a unified optimal multi-agent cooperative control strategy is proposed that formulates the multi-objective cooperative control problem in one unified optimal control framework. Many cooperative behaviors, such as consensus, cooperative tracking, formation, obstacle/collision avoidance, or flocking with cohesion and repulsion, can be treated in one optimization process. An innovative inverse optimal control approach is used to include these cooperative objectives in derived cost functions such that a closed-form cooperative control law can be obtained. In addition, the control law is distributed and depends only on the local neighboring agents’ information. Therefore, this new method does not demand an intensive computational load and is easy to implement onboard in real time. Furthermore, it scales well to large multi-agent cooperative dynamical systems. The closed-loop asymptotic stability and optimality are theoretically proved. Simulations based on MATLAB are conducted to validate the cooperative behaviors including consensus, rendezvous, formation flying, and flocking, as well as the obstacle avoidance performance.
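To make the notion of a distributed control law that depends only on neighboring agents' information concrete, here is a minimal sketch of the textbook linear consensus protocol, in which each agent moves toward the states of its neighbors. This is not the inverse optimal control law derived in the thesis; the function name `simulate_consensus`, the four-agent line graph, and the step size are assumptions chosen for illustration.

```python
def simulate_consensus(neighbors, x0, dt=0.01, steps=2000):
    """Euler simulation of x_i' = sum over neighbors j of (x_j - x_i)."""
    x = list(x0)
    for _ in range(steps):
        # Each agent reads only the states of its own neighbors.
        dx = [sum(x[j] - x[i] for j in neighbors[i]) for i in range(len(x))]
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


# Hypothetical communication graph: four agents connected in a line.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
initial = [0.0, 2.0, 4.0, 10.0]
final = simulate_consensus(neighbors, initial)
```

With an undirected, connected graph such as this one, the states converge to the average of the initial values, which is the simplest instance of the consensus behavior listed in the abstract.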
|
3 |
Dynamic Optimization for Agent-Based Systems and Inverse Optimal Control
Li, Yibei January 2019
This dissertation is concerned with three problems within the field of optimization for agent-based systems. Firstly, the inverse optimal control problem is investigated for the single-agent system. Given a dynamic process, the goal is to recover the quadratic cost function from observations of optimal control sequences. Such an estimate could then help us develop a better understanding of the physical system and reproduce a similar optimal controller in other applications. Next, problems of optimization over networked systems are considered. A novel differential game approach is proposed for the optimal intrinsic formation control of multi-agent systems. As for the credit scoring problem, an optimal filtering framework is utilized to recursively improve the scoring accuracy based on dynamic network information. In paper A, the finite-horizon inverse optimal control problem is investigated, where the linear quadratic (LQ) cost function is to be estimated from the optimal feedback controller. Although the infinite-horizon inverse LQ problem is well studied with numerous results, the finite-horizon case is still an open problem. To the best of our knowledge, we propose the first complete result on the necessary and sufficient condition for the existence of corresponding LQ cost functions. For feasible cases, the analytic expression of the whole solution space is derived and the equivalence of weighting matrices is discussed. For infeasible problems, an infinite-dimensional convex problem is formulated to obtain a best-fit approximate solution with minimal control residual, where the optimality condition is solved under a static quadratic programming framework to facilitate the computation. In paper B, the optimal formation control problem of a multi-agent system is studied. The foraging behavior of N agents is modeled as a finite-horizon non-cooperative differential game under local information, and its Nash equilibrium is studied. 
The collaborative swarming behaviour derived from non-cooperative individual actions also sheds new light on understanding such phenomena in nature. The proposed framework serves as a guide in the sense that it offers a systematic approach to formation control, where the desired formation can be obtained by only intrinsically adjusting individual costs and network topology. In contrast to most existing methodologies, which regulate formation errors relative to a pre-defined pattern, the proposed method does not need any information about the desired pattern beforehand. We refer to this type of formation control as intrinsic formation control. Patterns of regular polygons, antipodal formations and Platonic solids can be achieved as Nash equilibria of the game while inter-agent collisions are naturally avoided. Paper C considers the credit scoring problem by incorporating dynamic network information, where the advantages of such incorporation are investigated in two scenarios. Firstly, when the published score is merely individual-dependent, an optimal Bayesian filter is designed for risk prediction, where network observations are utilized to provide a reference for the bank on future financial decisions. Furthermore, a recursive Bayes estimator is proposed to improve the accuracy of the published score by incorporating the dynamic network topology as well. It is shown that under the proposed evolution framework, the designed estimator has a higher precision than all the efficient estimators, and the mean square errors are strictly smaller than the Cramér-Rao lower bound for clients within a certain range of scores.
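For readers unfamiliar with the inverse LQ problem treated in paper A, the standard continuous-time, finite-horizon LQ setting can be written as follows. This is the textbook formulation, stated here as an assumption; the thesis may use a different (for example discrete-time) form, and the symbols $A$, $B$, $Q$, $R$, $F$, $P$, $K$ and $T$ are notation introduced for this sketch.

```latex
% Forward problem: given (Q, R, F), find the optimal feedback.
\min_{u}\; J = x(T)^{\top} F\, x(T) + \int_{0}^{T} \big( x^{\top} Q\, x + u^{\top} R\, u \big)\, dt
\quad \text{subject to} \quad \dot{x} = A x + B u .

% Optimal feedback and the Riccati differential equation it depends on.
u^{*}(t) = -K(t)\, x(t), \qquad K(t) = R^{-1} B^{\top} P(t),

-\dot{P} = A^{\top} P + P A - P B R^{-1} B^{\top} P + Q, \qquad P(T) = F .

% Inverse problem: given A, B and the observed gain K(t) on [0, T],
% find Q \succeq 0, R \succ 0, F \succeq 0 that generate K(t).
```

Scaling $(Q, R, F)$ by any positive constant leaves $K(t)$ unchanged, which is one simple reason the solution of the inverse problem is a set rather than a single point.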
|
4 |
Enabling Motion Planning and Execution for Tasks Involving Deformation and Uncertainty
Phillips-Grafflin, Calder 07 June 2017
"A number of outstanding problems in robotic motion and manipulation involve tasks where degrees of freedom (DoF), be they part of the robot, an object being manipulated, or the surrounding environment, cannot be accurately controlled by the actuators of the robot alone. Rather, they are also controlled by physical properties or interactions - contact, robot dynamics, actuator behavior - that are influenced by the actuators of the robot. In particular, we focus on two important areas of poorly controlled robotic manipulation: motion planning for deformable objects and in deformable environments; and manipulation with uncertainty. Many everyday tasks we wish robots to perform, such as cooking and cleaning, require the robot to manipulate deformable objects. The limitations of real robotic actuators and sensors result in uncertainty that we must address to reliably perform fine manipulation. Notably, both areas share a common principle: contact, which is usually prohibited in motion planners, is not only sometimes unavoidable, but often necessary to accurately complete the task at hand. We make four contributions that enable robot manipulation in these poorly controlled tasks: First, an efficient discretized representation of elastic deformable objects and cost function that assess a ``cost of deformation' for a specific configuration of a deformable object that enables deformable object manipulation tasks to be performed without physical simulation. Second, a method using active learning and inverse-optimal control to build these discretized representations from expert demonstrations. Third, a motion planner and policy-based execution approach to manipulation with uncertainty which incorporates contact with the environment and compliance of the robot to generate motion policies which are then adapted during execution to reflect actual robot behavior. 
Fourth, work towards the development of an efficient path quality metric for paths executed with actuation uncertainty that can be used inside a motion planner or trajectory optimizer."
|
5 |
Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy
Ziebart, Brian D. 01 December 2010
Predicting human behavior from a small number of training examples is a challenging machine learning problem. In this thesis, we introduce the principle of maximum causal entropy, a general technique for applying information theory to decision-theoretic, game-theoretic, and control settings where relevant information is sequentially revealed over time. This approach guarantees decision-theoretic performance by matching purposeful measures of behavior (Abbeel & Ng, 2004), and/or enforces game-theoretic rationality constraints (Aumann, 1974), while otherwise being as uncertain as possible, which minimizes worst-case predictive log-loss (Grünwald & Dawid, 2003).
We derive probabilistic models for decision, control, and multi-player game settings using this approach. We then develop corresponding algorithms for efficient inference that include relaxations of the Bellman equation (Bellman, 1957), and simple learning algorithms based on convex optimization. We apply the models and algorithms to a number of behavior prediction tasks. Specifically, we present empirical evaluations of the approach in the domains of vehicle route preference modeling using over 100,000 miles of collected taxi driving data, pedestrian motion modeling from weeks of indoor movement data, and robust prediction of game play in stochastic multi-player games.
|
6 |
Learning Preference Models for Autonomous Mobile Robots in Complex Domains
Silver, David 01 December 2010
Achieving robust and reliable autonomous operation even in complex unstructured environments is a central goal of field robotics. As the environments and scenarios to which robots are applied have continued to grow in complexity, so has the challenge of properly defining preferences and tradeoffs between various actions and the terrains they result in traversing. These definitions and parameters encode the desired behavior of the robot; therefore their correctness is of the utmost importance. Current manual approaches to creating and adjusting these preference models and cost functions have proven to be incredibly tedious and time-consuming, while typically not producing optimal results except in the simplest of circumstances.
This thesis presents the development and application of machine learning techniques that automate the construction and tuning of preference models within complex mobile robotic systems. Utilizing the framework of inverse optimal control, expert examples of robot behavior can be used to construct models that generalize demonstrated preferences and reproduce similar behavior. Novel learning from demonstration approaches are developed that offer the possibility of significantly reducing the amount of human interaction necessary to tune a system, while also improving its final performance. Techniques to account for the inevitability of noisy and imperfect demonstration are presented, along with additional methods for improving the efficiency of expert demonstration and feedback.
The effectiveness of these approaches is confirmed through application to several real-world domains, such as the interpretation of static and dynamic perceptual data in unstructured environments and the learning of human driving styles and maneuver preferences. Extensive testing and experimentation, both in simulation and in the field with multiple mobile robotic systems, provide empirical confirmation of superior autonomous performance, with less expert interaction and no hand tuning. These experiments validate the potential applicability of the developed algorithms to a large variety of future mobile robotic systems.
|
7 |
Manipulation robotique à deux mains inspirée des aptitudes humaines / Dual-arm robotic manipulation inspired by human skills
Tomic, Marija 04 July 2018
The number of humanoid robots has increased in recent years, with the aim of having them collaborate with humans or replace them in tedious tasks. The objective of this thesis is to transfer human skills to humanoid robots, in particular for movements involving coordination between the two arms. In the first part of the thesis, a process for converting a human movement into a robot movement, with the aim of imitation, is proposed. Since humans have many more degrees of freedom than a humanoid robot, identical movements cannot be produced; the characteristics (body segment lengths) can also be different. Our conversion process takes into account the recorded locations of markers attached to the human body and of the joints to improve the imitation process. The second part of the thesis aims at analyzing the strategies used by humans to generate movement. Human movements are assumed to be optimal, and our goal is to find the criterion that is minimized during manipulation. We hypothesize that this criterion is a combination of criteria classically used in robotics, and we look for the weight of each criterion that best represents human movement. In this way, an optimal kinematic control approach can then be used to generate movements of the humanoid robot.
|
8 |
Experimental Analysis on Collaborative Human Behavior in a Physical Interaction Environment
January 2020
abstract: Daily collaborative tasks like pushing a table or a couch require haptic communication between the people doing the task. To design collaborative motion planning algorithms for such applications, it is important to understand human behavior. Collaborative tasks involve continuous adaptation and intent recognition between the people involved in the task. This thesis explores the coordination between human partners through a virtual setup involving continuous visual feedback. The interaction and coordination are modeled as a two-step process: 1) collecting data for a collaborative couch-pushing task, where both people doing the task have complete information about the goal but are unaware of each other's cost functions or intentions, and 2) processing the emergent behavior under complete information and fitting a model for this behavior to validate a mathematical model of agent behavior in multi-agent collaborative tasks. The baseline model is updated using different approaches so that the trajectories generated by these models resemble human trajectories. All these models are compared to each other. The action profiles of both agents and the position and velocity of the manipulated object during a goal-oriented task are recorded and used as expert demonstrations to fit models resembling human behavior. Analysis through hypothesis testing is also performed to identify the difference in behavior between complete information and information asymmetry among agents regarding the goal position. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2020
|
9 |
LEARNING OBJECTIVE FUNCTIONS FOR AUTONOMOUS SYSTEMS
Zihao Liang (18966976) 03 July 2024
In recent years, advancements in robotics and computing power have enabled robots to master complex tasks. Nevertheless, merely executing tasks isn't sufficient for robots. To achieve higher robot autonomy, learning the objective function is crucial. Autonomous systems can effectively eliminate the need for explicit programming by autonomously learning the control objective and deriving their control policy through the observation of task demonstrations. Hence, there's a need to develop a method for robots to learn the desired objective functions. In this thesis, we address several challenges in objective learning for autonomous systems, enhancing the applicability of our method in real-world scenarios. The ultimate objective of the thesis is to create a universal objective learning approach capable of addressing a range of existing challenges in the field while emphasizing data efficiency and robustness. Building upon this intuition, we present a framework for autonomous systems to address a variety of objective learning tasks in real time, even in the presence of noisy data. In addition to objective learning, this framework is capable of handling various other learning and control tasks.
The first part of this thesis concentrates on objective learning methods, specifically inverse optimal control (IOC). Within this domain, we have made three significant contributions aimed at addressing three existing challenges in IOC: 1) learning from minimal data, 2) learning without prior knowledge of system dynamics, and 3) learning with system outputs.
The second part of this thesis aims to develop a unified IOC framework to address all the challenges previously mentioned. It introduces a new paradigm for autonomous systems, referred to as Online Control-Informed Learning. This paradigm aims to tackle a variety of learning and control tasks online with data efficiency and robustness to noisy data. Integrating optimal control theory, online state estimation techniques, and machine learning methods, our proposed paradigm offers an online learning framework capable of tackling a diverse array of learning and control tasks. These include online imitation learning, online system identification, and policy tuning on the fly, all with efficient use of data and computation resources while ensuring robust performance.
|
10 |
Inverse optimal control for redundant systems of biological motion / Contrôle optimal inverse de systèmes de mouvements biologiques redondants
Panchea, Adina 10 December 2015
This thesis addresses inverse optimal control problems (IOCP) to find the cost functions for which human motions are optimal. Assuming that the human motion observations are perfect, while the human motor control process is imperfect, we propose an approximately optimal control algorithm. We applied our algorithm to human motion observations collected for three tasks (human arm trajectories during an industrial screwing task, postural coordination in a visual tracking task, and a gait initiation task) and performed an open-loop analysis. In all three cases, our algorithm returned the cost functions that best fit these data while approximately satisfying the Karush-Kuhn-Tucker (KKT) optimality conditions. Our algorithm has a short computation time in all cases, which opens the possibility of using it in online applications. For the visual tracking task, we investigated a closed-loop model with two PD feedback loops. With artificial data, we obtained consistent results in terms of the trends of the feedback gains and the criteria found by our algorithm for the visual tracking task. In the second part of our work, we proposed a new approach to solving the IOCP in a bounded-error framework. In this approach, we assume that the human motor control process is perfect while the observations are imperfect, being affected by errors and uncertainties. These errors are otherwise unknown but bounded, with known bounds. Our approach finds the convex hull of the set of feasible cost functions, with the certainty that it includes the true solution. We guarantee this numerically using interval analysis tools.
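The idea of returning cost functions that approximately satisfy the KKT conditions can be sketched on a toy problem. In the unconstrained case the KKT conditions reduce to stationarity: the weighted sum of the criteria gradients must vanish at the observed motion. The sketch below recovers the weight of a two-criterion combination by minimizing the norm of that stationarity residual. It is not the algorithm of the thesis; the function names, the two quadratic criteria, and the observed point are all hypothetical.

```python
def recover_weight(grad1, grad2):
    """Return (w, residual) with w in [0, 1] minimizing
    || w * grad1 + (1 - w) * grad2 ||, the stationarity residual."""
    diff = [g2 - g1 for g1, g2 in zip(grad1, grad2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("identical gradients: the weight is not identifiable")
    # Closed-form least-squares solution, then projection onto [0, 1].
    w = sum(g2 * d for g2, d in zip(grad2, diff)) / denom
    w = min(1.0, max(0.0, w))
    residual = sum((w * g1 + (1 - w) * g2) ** 2
                   for g1, g2 in zip(grad1, grad2)) ** 0.5
    return w, residual


def grad_sq_dist(x, target):
    """Gradient of the criterion ||x - target||^2 evaluated at x."""
    return [2.0 * (xi - ti) for xi, ti in zip(x, target)]


# Two hypothetical criteria pull the decision toward a and b respectively.
a, b = [0.0, 0.0], [1.0, 2.0]
observed = [0.7, 1.4]  # equals 0.3 * a + 0.7 * b, i.e. optimal for w = 0.3
w, residual = recover_weight(grad_sq_dist(observed, a),
                             grad_sq_dist(observed, b))
```

With noisy observations the residual no longer reaches zero, and its size gives a rough indication of how well the assumed family of criteria explains the data.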
|