Global ETD Search

1	Synthetic Data Generation and Training Pipeline for General Object Detection Using Domain Randomization Arnestrand, Hampus; Mark, Casper January 2024 The development of high-performing object detection models requires extensive and varied datasets with accurately annotated images, and producing them is traditionally labor-intensive and error-prone. To address these challenges, this report explores the generation of synthetic data using domain randomization techniques to train object detection models. We present a pipeline that integrates synthetic data creation in Unity with the training of YOLOv8 object detection models. Our approach uses the Unity Perception package to produce diverse and precisely annotated datasets, overcoming the domain gap typically associated with synthetic data. The pipeline was evaluated through a series of experiments analyzing the impact of parameters such as background textures and training arguments on model performance. The results demonstrate that models trained with our synthetic data can achieve high accuracy and generalize well to real-world scenarios, offering a scalable and efficient alternative to manual data annotation. object detection; synthetic data; domain randomization; machine learning; Computer Sciences
2	On learning and generalization in unstructured taskspaces Mehta, Bhairav 08 1900 Robotic learning holds incredible promise for embodied artificial intelligence, with reinforcement learning seemingly a strong candidate to be the software of robots of the future: learning from experience, adapting on the fly, and generalizing to unseen scenarios. However, our current reality requires vast amounts of data to train the simplest of robotic reinforcement learning policies, leading to a surge of interest in training entirely in efficient physics simulators. As the goal is embodied intelligence, policies trained in simulation are transferred onto real hardware for evaluation; yet, as no simulation is a perfect model of the real world, transferred policies run into the sim2real transfer gap: the errors accrued when shifting policies from simulators to the real world due to unmodeled effects in inaccurate, approximate physics models. Domain randomization, the idea of randomizing all physical parameters in a simulator and thereby forcing a policy to be robust to distributional shifts, has proven useful in transferring reinforcement learning policies onto real robots. In practice, however, the method involves a difficult trial-and-error process and shows high variance in both convergence and performance. We introduce Active Domain Randomization, an algorithm that involves curriculum learning in unstructured task spaces (task spaces where a notion of difficulty, that is, intuitively easy or hard tasks, is not readily available). Active Domain Randomization shows strong performance in zero-shot transfer to real robots.
The thesis also introduces other variants of the algorithm, including one that allows for the incorporation of a safety prior and one that is applicable to the field of Meta-Reinforcement Learning. We also analyze curriculum learning from an optimization perspective and attempt to justify the benefit of the algorithm by studying gradient interference. robotics; reinforcement learning; simulation; domain randomization
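As an illustration of the plain (uniform) domain randomization that entry 2 describes as its starting point, the following minimal Python sketch draws a fresh set of simulator parameters for each training episode. It is not the Active Domain Randomization algorithm from the thesis; the parameter names, ranges, and the `sample_environment` helper are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical physical parameters and ranges (assumptions, not from the thesis).
PARAM_RANGES = {
    "friction": (0.5, 1.5),
    "mass_scale": (0.8, 1.2),
    "motor_gain": (0.9, 1.1),
}


def sample_environment(rng, ranges=PARAM_RANGES):
    """Draw one simulator configuration, sampling each parameter uniformly."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


# One randomized configuration per episode; a policy trained across all of
# them cannot rely on any single setting of the physics.
rng = random.Random(0)
episodes = [sample_environment(rng) for _ in range(3)]
for params in episodes:
    print(params)
```

A curriculum-based method would replace the uniform draw with a learned sampling strategy; the sketch only shows the baseline being improved upon.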
3	Domain Adaptation to Meet the Reality-Gap from Simulation to Reality Forsberg, Fanny January 2022 Being able to train machine learning models on simulated data is of great interest in several applications, one of them being autonomous driving. The reason is that, in simulation, it is easier both to collect large labeled datasets and to perform reinforcement learning. However, transferring the learned models to the real-world environment can be hard due to differences between simulation and reality, for example in materials, textures, lighting, and content. One approach is domain adaptation, which makes the simulations as similar as possible to reality. The thesis's main focus is to investigate domain adaptation as a way to meet the reality gap and to compare it with an alternative method, domain randomization. Two domain adaptation methods, one adapting the simulated data to reality and the other adapting the test data to simulation, are compared with domain randomization. These are evaluated with a classifier making decisions for a robot car while driving in reality. The evaluation consists of a quantitative evaluation on real-world data and a qualitative evaluation that observes how well the robot drives and avoids obstacles. The results show that the reality gap is very large and that the examined methods reduce it, with the two domain adaptation methods giving the largest decrease. However, none of them led to satisfactory driving. domain adaptation; reality gap; domain randomization; deep learning; autonomous robot
4	Bayesian Off-policy Sim-to-Real Transfer for Antenna Tilt Optimization Larsson Forsberg, Albin January 2021 Choosing the correct angle of electrical tilt in a radio base station is essential when optimizing for coverage and capacity. A reinforcement learning agent can be trained to make this choice. If training the agent in the real world is restricted or even impossible, alternative methods can be used. Training in simulation combined with an approximation of the real world is one option, but it comes with a set of challenges associated with the reality gap. In this thesis, a method based on Bayesian optimization is implemented to tune the environment in which domain randomization is performed, in order to improve the quality of the simulation training. The results show that using Bayesian optimization to find a good subset of parameters works even when access to the real world is constrained. Two off-policy estimators, based on inverse propensity scoring and direct method evaluation, were tested in combination with an offline dataset of previously collected cell traces. The method manages to find an isolated subspace of the whole domain that optimizes the randomization while still giving good performance in the target domain. Simulation to Reality; Coverage and Capacity Optimization; Remote Electrical Tilt; Reinforcement Learning; Bayesian Optimization; Domain Randomization; Off-policy Estimation; Computer and Information Sciences
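Entry 4 mentions off-policy estimators based on inverse propensity scoring. As a rough sketch of that general technique (not the thesis's implementation), the Python snippet below reweights logged rewards by the ratio of the target policy's action probability to the logging policy's probability. The data layout, the `ips_estimate` name, and the toy two-action example are assumptions made for illustration.

```python
def ips_estimate(logged, target_policy):
    """Inverse propensity scoring estimate of a target policy's mean reward.

    logged: list of (context, action, reward, behavior_prob) tuples, where
            behavior_prob is the logging policy's probability of the action.
    target_policy: function (context, action) -> probability under the
            policy being evaluated.
    """
    total = 0.0
    for context, action, reward, behavior_prob in logged:
        # Importance weight: how much more (or less) likely the target
        # policy is to take the logged action than the logging policy was.
        weight = target_policy(context, action) / behavior_prob
        total += weight * reward
    return total / len(logged)


# Toy data (hypothetical): the logging policy chose between two tilt
# actions uniformly (probability 0.5 each); action 1 yields reward 1.0.
logged = [
    ("cell_a", 0, 0.0, 0.5),
    ("cell_a", 1, 1.0, 0.5),
    ("cell_a", 1, 1.0, 0.5),
    ("cell_a", 0, 0.0, 0.5),
]


def always_action_one(context, action):
    """Target policy that deterministically picks action 1."""
    return 1.0 if action == 1 else 0.0


value = ips_estimate(logged, always_action_one)
print(value)
```

The estimate relies on the logging policy assigning non-zero probability to every action the target policy might take; where that fails, the estimate is undefined or biased.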
