1 |
Synthetic Data Generation for 6D Object Pose and Grasping Estimation
Martínez González, Pablo, 16 March 2023
Teaching a robot to behave fully autonomously is not a simple task. Once robotic systems become truly intelligent, interacting with them will feel natural and easy, but today that is still far from reality. Making a robot understand its surroundings is a huge task that the computer vision field tries to address, and deep learning techniques are bringing us closer, but at the cost of data. Synthetic data generation is the process of generating artificial data that is used to train machine learning models. This data is generated using computer algorithms and simulations, and is designed to resemble real-world data as closely as possible. The use of synthetic data has become increasingly popular in recent years, particularly in the field of deep learning, due to the shortage of high-quality annotated real-world data and the high cost of collecting it. For that reason, this thesis addresses the task of facilitating the generation of synthetic data by creating a framework that leverages advances in modern rendering engines. In this context, the generated synthetic data can be used to train models for tasks such as 6D object pose estimation or grasp estimation. 6D object pose estimation refers to the problem of determining the position and orientation of an object in 3D space, while grasp estimation involves predicting the position and orientation of a robotic hand or gripper that can be used to pick up and manipulate the object. These are important tasks in robotics and computer vision, as they enable robots to perform complex manipulation and grasping tasks. In this work we propose a way of extracting grasping information from hand-object interactions in virtual reality, so that synthetic data can also boost research in that area. Finally, we use this synthetically generated data to test the proposal of applying 6D object pose estimation architectures to grasping region estimation.
This idea is based on the fact that both problems share several underlying concepts, such as object detection and orientation. / This thesis has been funded by the Spanish Ministry of Education [FPU17/00166]
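As an illustration of the "6D" in 6D object pose estimation mentioned in the abstract, the following minimal sketch represents a pose as three rotational plus three translational degrees of freedom and applies it to a point. It is not taken from the thesis: the `Pose6D` class, the `rotation_z` helper, and the numeric values are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass

Vec3 = tuple[float, float, float]


@dataclass(frozen=True)
class Pose6D:
    """Hypothetical container: rotation (3 DoF) plus translation (3 DoF)."""
    rotation: tuple[Vec3, Vec3, Vec3]  # 3x3 rotation matrix, stored row by row
    translation: Vec3                  # offset of the object origin

    def apply(self, point: Vec3) -> Vec3:
        """Map a point from object coordinates into the reference frame."""
        return tuple(
            sum(r * c for r, c in zip(row, point)) + t
            for row, t in zip(self.rotation, self.translation)
        )


def rotation_z(angle_rad: float) -> tuple[Vec3, Vec3, Vec3]:
    """Rotation matrix for a rotation about the z axis (illustrative helper)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


# Made-up pose: object rotated 90 degrees about z and shifted by (0.1, 0.0, 0.5).
pose = Pose6D(rotation_z(math.pi / 2), (0.1, 0.0, 0.5))
corner = pose.apply((1.0, 0.0, 0.0))
print(corner)
```

A grasp can be described with the same kind of structure, as the position and orientation of a gripper relative to the object, which is one way to read the abstract's remark that both problems share underlying concepts.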
|
2 |
Classifying Metal Scrap Piles Using Synthetic Data: Evaluating image classification models trained on synthetic data / Klassificering av metallskrothögar med hjälp av syntetiska data
Pedersen, Stian Lockhart, January 2024
Modern deep learning models require large amounts of training data, and acquiring that data can be challenging. Synthetic data provides an alternative to manually collecting real data, alleviating problems associated with real data acquisition. In recycling processes, it is important to classify metal scrap piles that contain hazardous objects, because such objects can cause damage and high costs if handled incorrectly. Automatically detecting hazardous objects in metal scrap piles with image classification models requires large amounts of data, and metal scrap piles vary widely in objects, textures, and lighting. Furthermore, data acquisition is difficult in the recycling domain, where positive examples can be scarce and setting up manual acquisition is demanding. In this thesis, synthetic images of metal scrap piles in a recycling process are created for training image classification models to detect metal scrap piles containing fire extinguishers or hydraulic cylinders. The synthetic images are created with physically based rendering and domain randomization, and are rendered with either rasterization or ray tracing engines. Ablation studies are conducted to investigate the effect of using domain randomization. The performance of models trained on purely synthetic datasets is compared to that of models trained on datasets containing only real images. Furthermore, photorealistic rendering with ray tracing is evaluated by comparing F1 scores between models trained on datasets created with rasterization and with ray tracing. The F1 scores show that models trained on purely synthetic data outperform those trained solely on real data when classifying images containing fire extinguishers or hydraulic cylinders. Ablation studies show that domain randomization of textures is beneficial both for the classification of fire extinguishers and for the classification of hydraulic cylinders in metal scrap piles. High dynamic range image lighting randomization does not provide benefits when classifying metal scrap piles containing fire extinguishers, suggesting that other lighting randomization techniques may be more effective. The F1 scores show that models trained on images rendered with rasterization perform better when classifying metal scrap piles containing fire extinguishers. However, when classifying metal scrap piles containing hydraulic cylinders, models trained on images created with ray tracing achieve higher F1 scores. This thesis highlights the potential of synthetic data as an alternative to manually acquiring real data, particularly in domains where data collection is challenging and time-consuming. The results show the effectiveness of domain randomization and physically based rendering techniques in creating realistic and diverse synthetic datasets.
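The abstract compares models by their F1 scores. As a reminder of what that metric measures, here is a minimal sketch of the standard binary F1 computation (the harmonic mean of precision and recall). The labels and the two sets of predictions are invented toy values, not results from the thesis, and `model_a` and `model_b` are hypothetical names.

```python
def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # true positives
    fp = sum(1 for t, p in pairs if not t and p)    # false positives
    fn = sum(1 for t, p in pairs if t and not p)    # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy ground truth: 1 = pile contains a hazardous object, 0 = it does not.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
model_a = [1, 1, 0, 0, 0, 1, 1, 0]  # hypothetical predictions
model_b = [1, 0, 0, 0, 0, 0, 1, 0]  # hypothetical predictions

f1_a = f1_score(y_true, model_a)
f1_b = f1_score(y_true, model_b)
print(f"model_a F1 = {f1_a:.3f}, model_b F1 = {f1_b:.3f}")
```

Because F1 balances precision against recall, it is a common choice when positive examples are scarce, as the abstract notes is the case for hazardous objects in scrap piles.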
|