Haimin Ku (12457464) 26 April 2022 (has links)
<p>With the exponential growth and diversity of Internet of Things (IoT) devices, computation-intensive and delay-sensitive applications, such as object detection, smart homes, and smart grids, are constantly emerging. We can adopt the cloud computing paradigm to offload computation-heavy tasks from IoT devices to a cloud server, whose more powerful resources overcome the limitations of the devices. However, a cloud computing architecture can introduce high latency, which makes it unsuitable for delay-sensitive applications on IoT devices with limited computing and storage capabilities. Edge computing has been introduced to improve this situation by deploying an edge device near the IoT devices, providing them with computing resources at lower latency than cloud computing. Nevertheless, the edge server may not be able to complete all the offloaded tasks in time when it is flooded with requests. In such cases, the edge server can offload some of the requested tasks to a cloud server to further speed up processing with more powerful cloud resources. In this paper, we aim to minimize the average completion time of tasks in an IoT edge-cloud environment by optimizing the task offloading ratio from edge to cloud, based on Deep Deterministic Policy Gradient (DDPG), a Reinforcement Learning (RL) approach. We propose a dynamic task offloading decision mechanism, deployed on the edge, that determines the amount of computation to be processed in the cloud server, considering multiple factors that affect task completion. Simulation results demonstrate that our dynamic task offloading decision mechanism improves the overall completion time of tasks compared with naïve approaches. </p>
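To make the optimization target concrete, the following is a minimal sketch of how completion time could depend on the edge-to-cloud offloading ratio. The cost model (parallel execution, a single uplink, the parameter names) is an assumption made for illustration and is not taken from the paper; the grid search stands in for a static baseline, not for the DDPG agent the abstract describes.

```python
def completion_time(ratio, workload_cycles, data_bits,
                    edge_hz, cloud_hz, uplink_bps):
    """Completion time (s) when `ratio` of a task is sent from edge to cloud.

    Hypothetical cost model: the edge and cloud shares run in parallel,
    and the cloud share must first be transferred over the uplink.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    edge_time = (1.0 - ratio) * workload_cycles / edge_hz
    # The cloud share pays a transfer cost before it can be computed.
    cloud_time = (ratio * data_bits / uplink_bps
                  + ratio * workload_cycles / cloud_hz)
    # The task is done when the slower of the two shares finishes.
    return max(edge_time, cloud_time)


def best_static_ratio(workload_cycles, data_bits, edge_hz, cloud_hz,
                      uplink_bps, steps=100):
    """Grid-search the ratio that minimizes completion time for one task."""
    ratios = [i / steps for i in range(steps + 1)]
    return min(ratios, key=lambda r: completion_time(
        r, workload_cycles, data_bits, edge_hz, cloud_hz, uplink_bps))


if __name__ == "__main__":
    # Illustrative numbers only: 2 Gcycles, 8 Mbit, 1 GHz edge, 4 GHz cloud, 10 Mbit/s.
    args = (2e9, 8e6, 1e9, 4e9, 1e7)
    r = best_static_ratio(*args)
    print(r, completion_time(r, *args))
```

Under this model neither extreme (all on the edge, all in the cloud) is optimal, which is the kind of trade-off a learned policy would have to track as load and link conditions change.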

Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks

McDowell, Journey 01 November 2019 (has links)
The performance of two controllers is assessed for generalization in the path-following task of autonomously backing up a tractor-trailer. Starting from random locations and orientations, paths are generated to loading docks with arbitrary pose using Dubins Curves. The combination vehicles can be varied in wheelbase, hitch length, weight distributions, and tire cornering stiffness. The closed-form calculation of the gains for the Linear Quadratic Regulator (LQR) relies heavily on having an accurate model of the plant. However, real-world applications cannot expect to have an updated model for each new trailer. Finding alternative robust controllers when the trailer model is changed was the motivation of this research. Reinforcement learning, with neural networks as its function approximators, can allow for generalized control from its learned experience that is characterized by a scalar reward value. The Linear Quadratic Regulator and the Deep Deterministic Policy Gradient (DDPG) are compared for robust control when the trailer is changed. This investigation quantifies the capabilities and limitations of both controllers in simulation using a kinematic model. The controllers are evaluated for generalization by altering the kinematic model trailer wheelbase, hitch length, and velocity from the nominal case. In order to close the gap between simulation and reality, the control methods are also assessed with sensor noise and various controller frequencies. The root mean squared and maximum errors from the path are used as metrics, along with the number of times the controllers cause the vehicle to jackknife or reach the goal. Considering the runs where the LQR did not cause the trailer to jackknife, the LQR tended to have slightly better precision. DDPG, however, controlled the trailer successfully on the paths where the LQR jackknifed. 
Reinforcement learning was found to sacrifice a short term reward, such as precision, to maximize the future expected reward like reaching the loading dock. The reinforcement learning agent learned a policy that imposed nonlinear constraints such that it never jackknifed, even when it wasn't the trailer it trained on.
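The two path-tracking metrics named in the abstract, root mean squared error and maximum error, can be sketched as follows. The function name and the assumption that errors are signed lateral offsets sampled along one run are illustrative and not taken from the thesis.

```python
import math


def path_error_metrics(errors):
    """Return (rms_error, max_abs_error) for one run.

    `errors` is assumed to be a sequence of signed distances from the
    reference path, one per control step (a hypothetical data layout).
    """
    if not errors:
        raise ValueError("at least one error sample is required")
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    max_abs = max(abs(e) for e in errors)
    return rms, max_abs


if __name__ == "__main__":
    print(path_error_metrics([0.10, -0.05, 0.20, -0.15]))
```

Reporting both values is useful because a controller can have a low RMS error while still producing a single large excursion, which the maximum error exposes.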

Domain Transfer for End-to-end Reinforcement Learning

Olsson, Anton, Rosberg, Felix January 2020 (has links)
In this master's thesis project, LiDAR-based, depth image-based, and semantic segmentation image-based reinforcement learning agents are investigated and compared for learning in simulation and performing in real time. The project utilizes the Deep Deterministic Policy Gradient architecture for learning continuous actions and was designed to control an RC car. It is one of the first projects to deploy an agent in a real scenario after training in a similar simulation. The project demonstrated that, with a proper reward function, tuned driving parameters (restricted steering, maximum velocity, and minimum velocity), and input data scaling, a LiDAR-based agent could drive indefinitely on a simple but completely unseen track in real time.
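The tuning steps mentioned above (input scaling, restricted steering, and velocity bounds) could look like the sketch below. The function names, the [-1, 1] actor output range, and the linear mappings are assumptions for illustration; the thesis abstract does not specify how the scaling was done.

```python
def scale_lidar(ranges_m, max_range_m):
    """Clip raw LiDAR ranges to [0, max_range_m] and map them to [0, 1]."""
    return [min(max(r, 0.0), max_range_m) / max_range_m for r in ranges_m]


def to_car_command(steer_raw, speed_raw, max_steer_rad, v_min, v_max):
    """Map actor outputs, assumed to lie in [-1, 1], to bounded commands.

    Steering is limited to +/- max_steer_rad and speed to [v_min, v_max].
    """
    steer_raw = min(max(steer_raw, -1.0), 1.0)
    speed_raw = min(max(speed_raw, -1.0), 1.0)
    steering = steer_raw * max_steer_rad
    speed = v_min + (speed_raw + 1.0) / 2.0 * (v_max - v_min)
    return steering, speed


if __name__ == "__main__":
    print(scale_lidar([0.2, 3.5, 12.0], max_range_m=10.0))
    print(to_car_command(0.4, 0.0, max_steer_rad=0.3, v_min=0.5, v_max=2.0))
```

Keeping the bounds outside the network means they can be tightened for the real car without retraining, at the cost of a mismatch between the actions seen in simulation and those applied on hardware.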

Robust longitudinal velocity control for advanced vehicles: A deep reinforcement learning approach

Islam, Fahmida 13 August 2024 (has links) (PDF)
Longitudinal velocity control, or adaptive cruise control (ACC), is a common advanced driving feature aimed at assisting the driver and reducing fatigue. It maintains the velocity of a vehicle and ensures a safe distance from the preceding vehicle. Many models for ACC are available, such as Proportional, Integral, and Derivative (PID) and Model Predictive Control (MPC). However, conventional models have some limitations as they are designed for simplified driving scenarios. Artificial intelligence (AI) and machine learning (ML) have made robust navigation and decision-making possible in complex environments. Recent approaches, such as reinforcement learning (RL), have demonstrated remarkable performance in terms of faster processing and effective navigation through unknown environments. This dissertation explores an RL approach, deep deterministic policy gradient (DDPG), for longitudinal velocity control. The baseline DDPG model has been modified in two different ways. In the first method, an attention mechanism has been applied to the neural network (NN) of the DDPG model. Integrating the attention mechanism into the DDPG model helps in decreasing focus on less important features and enhances overall model effectiveness. In the second method, the inputs of the actor and critic networks of DDPG are replaced with outputs of the self-supervised network. The self-supervised learning process allows the model to accurately predict future states from current states and actions. A custom reward function has been designed for the RL algorithm considering overall safety, efficiency, and comfort. The proposed models have been trained with human car-following data, and evaluated on multiple datasets, including publicly available data, simulated data, and sensor data collected from real-world environments. The analyses demonstrate that the new architectures can maintain strong robustness across various datasets and outperform the current state-of-the-art models.

Unleashing Technological Collaboration: AI, 5G, and Mobile Robotics for Industry 4.0 Advancements

Palacios Morocho, Maritza Elizabeth 02 November 2024 (has links)
Industry 4.0 faces significant challenges in pursuing digital transformation and operational efficiency. The increasing complexity of modern industrial environments leads to the need to deploy digital technologies and, above all, industrial automation. However, this path to innovation is accompanied by numerous obstacles, as the environment constantly changes. Therefore, to adapt to this evolution, it is necessary to employ more flexible approaches. These approaches are closely linked to the use of Artificial Intelligence (AI) and Reinforcement Learning (RL), as they emerge as pivotal solutions to address the crucial challenges of cooperative agent navigation within dynamic environments. Meanwhile, RL algorithms face the complexities involved in transmitting and processing large amounts of data. To address this challenge, Fifth Generation (5G) technology emerges as a key enabler for solutions to evolving problem scenarios. Among the main advantages of 5G is that it offers fast and secure transmission of large volumes of data with minimal latency. 
As the only technology so far capable of delivering these capabilities, 5G becomes an essential component for deploying real-time services such as cooperative navigation. Furthermore, another advantage is that it provides the necessary infrastructure for robust data exchanges and contributes to system efficiency and data security in dynamic industrial environments. In view of the above, it is clear that the complexity of industrial environments leads to the need to propose systems based on new technologies such as AI and 5G networks, as their combination provides a powerful synergy. Moreover, aside from tackling the challenges identified in cooperative navigation, it also opens the door to the implementation of smart factories, leading to higher levels of automation, safety, and productivity in industrial operations. It is important to note that the application of AI techniques entails the need to use simulation software to test the proposed algorithms in virtual environments. This makes it possible to address essential questions about the validity of the algorithms, reduce the risks of damage to the hardware, and, above all, optimize the proposed solutions. In order to provide a solution to the fundamental challenges in factory automation, this Thesis focuses on integrating mobile robotics in the cloud, especially in the context of Industry 4.0. It also covers the investigation of the capabilities of 5G networks, the evaluation of the feasibility of simulators such as Robot Operating System (ROS) and Gazebo, and the fusion of sensor data and the design of path planning algorithms based on RL. In other words, this Thesis not only identifies and addresses the key challenges of Industry 4.0 but also presents innovative solutions and concrete hypotheses for research. Furthermore, it promotes the combination of AI and 5G to deploy real-time services, such as cooperative navigation. 
Thus, it addresses critical challenges and demonstrates that technological collaboration redefines efficiency and adaptability in modern industry. / This research was funded by the Research and Development Grants Program (PAID-01-19) of the Universitat Politècnica de València. The research stay of the author at Technische Universität Darmstadt (Germany) was funded by the Program of Grants for Student Mobility of doctoral students at the Universitat Politècnica de València in 2022 from Spain and by Erasmus+ Student Mobility for Traineeship 2022 / Palacios Morocho, ME. (2024). Unleashing Technological Collaboration: AI, 5G, and Mobile Robotics for Industry 4.0 Advancements [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/204748


RENATA GARCIA OLIVEIRA 01 February 2022 (has links)
This work seeks to use ensembles of deep reinforcement learning algorithms from a new perspective. 
In the literature, the ensemble technique is used to improve performance, but, for the first time, this research aims to use ensembles to minimize the dependence of deep reinforcement learning performance on hyperparameter fine-tuning, in addition to making learning more precise and robust. Two approaches are researched; one considers pure action aggregation, while the other also takes the value functions into account. In the first approach, an online learning framework based on the ensemble's continuous action choice history is created, aiming to flexibly integrate different weighting and aggregation methods for the agents' actions. In essence, the framework uses past performance to combine only the actions of the best policies. In the second approach, the policies are evaluated using their expected performance as estimated by their value functions. Specifically, we weight the ensemble's value functions by their expected accuracy as calculated by the temporal difference error. Value functions with lower error have higher weight. To measure the influence of the hyperparameter tuning effort, groups consisting of a mix of different numbers of well and poorly parameterized algorithms were created. To evaluate the methods, classic environments such as the inverted pendulum, cart pole, and double cart pole are used as benchmarks. In validation, the Half Cheetah v2, a biped robot, and Swimmer v2 simulation environments showed superior and consistent results, demonstrating the ability of the ensemble technique to minimize the effort needed to tune the algorithms' hyperparameters.
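One possible reading of the second approach is sketched below: each critic receives a weight that decreases with its temporal difference error, and the candidate action with the highest weighted value estimate is selected. The inverse-error weighting, the function names, and the toy critics are assumptions for illustration; the abstract states only that lower error means higher weight.

```python
def td_error_weights(td_errors, eps=1e-8):
    """Normalized weights that grow as the absolute TD error shrinks.

    Inverse-error weighting is an assumed choice, not the thesis's formula.
    """
    inverse = [1.0 / (abs(e) + eps) for e in td_errors]
    total = sum(inverse)
    return [w / total for w in inverse]


def select_action(candidate_actions, critics, td_errors, state):
    """Pick the candidate with the highest TD-weighted value estimate.

    `critics` are callables q(state, action) -> float, one per ensemble member.
    """
    weights = td_error_weights(td_errors)

    def weighted_value(action):
        return sum(w * q(state, action) for w, q in zip(weights, critics))

    return max(candidate_actions, key=weighted_value)


if __name__ == "__main__":
    # Two toy critics that prefer actions near +1 and -1 respectively.
    critics = [lambda s, a: -(a - 1.0) ** 2, lambda s, a: -(a + 1.0) ** 2]
    print(select_action([1.0, -1.0], critics, td_errors=[0.1, 0.4], state=None))
```

In the toy example the first critic has the smaller TD error, so its preference dominates the decision.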

Generation and Detection of Adversarial Attacks for Reinforcement Learning Policies

Drotz, Axel, Hector, Markus January 2021 (has links)
In this project we investigate the susceptibility of reinforcement learning (RL) algorithms to adversarial attacks. Adversarial attacks have been proven to be very effective at reducing the performance of deep learning classifiers and, recently, have also been shown to reduce the performance of RL agents. The goal of this project is to evaluate adversarial attacks on agents trained using deep reinforcement learning (DRL), as well as to investigate how to detect these types of attacks. We first use DRL to solve two environments from OpenAI's gym module, namely CartPole and LunarLander, using the DRL techniques DQN and DDPG. We then evaluate the performance of attacks, and finally we train neural networks to detect attacks. The attacks were successful at reducing performance in both the LunarLander and CartPole environments. The attack detector was very successful at detecting attacks on the CartPole environment, but did not perform quite as well on LunarLander. We hypothesize that continuous action space environments may make it harder for attack detectors to identify potential adversarial attacks. / Bachelor's thesis project in electrical engineering 2021, KTH, Stockholm

Uncontrolled intersection coordination of the autonomous vehicle based on multi-agent reinforcement learning.

McSey, Isaac Arnold January 2023 (has links)
This study explores the application of multi-agent reinforcement learning (MARL) to enhance the decision-making, safety, and passenger comfort of Autonomous Vehicles (AVs) at uncontrolled intersections. The research aims to assess the potential of MARL in modeling multiple agents interacting within a shared environment, reflecting real-world situations where AVs interact with multiple actors. The findings suggest that AVs trained using a MARL approach with global experiences can better navigate intersection scenarios than AVs trained on local (individual) experiences. This capability is a critical precursor to achieving Level 5 autonomy, where vehicles are expected to manage all aspects of the driving task under all conditions. The research contributes to the ongoing discourse on enhancing autonomous vehicle technology through multi-agent reinforcement learning and informs the development of sophisticated training methodologies for autonomous driving.

