  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Cooperative Perception for Connected Vehicles

Mehr, Goodarz 31 May 2024 (has links)
Doctor of Philosophy / Self-driving cars promise a future with safer roads and reduced traffic incidents and fatalities. This future hinges on a car's accurate understanding of its surrounding environment; however, the reliability of the algorithms that form this perception is not always guaranteed, and adverse traffic and environmental conditions can significantly diminish their performance. To address this problem, this research builds on the idea that enabling cars to share and exchange information via communication allows them to extend the range and quality of their perception beyond their individual capability. To that end, this research formulates a robust and flexible framework for cooperative perception, explores how connected vehicles can learn to collaborate to improve their perception, and introduces an affordable, experimental vehicle platform for connected autonomy research.
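To make the core idea concrete, here is a minimal sketch (an editorial illustration, not code from the dissertation) of how an ego vehicle might fuse object detections shared by a connected neighbor: overlapping boxes are treated as the same object, while non-overlapping ones extend the ego view beyond its own sensor range. The IoU threshold and box format are assumptions for illustration.

```python
def iou(a, b):
    # boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def fuse_detections(local, received, iou_threshold=0.5):
    """Merge detections shared by another vehicle into the local set.

    Detections that overlap an existing local box are treated as the
    same object; the rest extend the local view (e.g. occluded objects
    the ego vehicle cannot see)."""
    fused = list(local)
    for box in received:
        if all(iou(box, own) < iou_threshold for own in fused):
            fused.append(box)
    return fused

ego = [(0, 0, 2, 2)]                                # object the ego vehicle sees itself
shared = [(0.1, 0.1, 2.1, 2.1), (10, 10, 12, 12)]   # one duplicate, one occluded object
print(len(fuse_detections(ego, shared)))            # fused view covers both objects
```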
2

Random Access Control In Massive Cellular Internet of Things: A Multi-Agent Reinforcement Learning Approach

Bai, Jianan 14 January 2021 (has links)
Internet of things (IoT) is envisioned as a promising paradigm for interconnecting enormous numbers of wireless devices. However, the success of IoT is challenged by the difficulty of managing access for massive volumes of sporadic and unpredictable user traffic. This thesis focuses on contention-based random access in massive cellular IoT systems and introduces two novel frameworks that provide enhanced scalability, real-time quality-of-service management, and resource efficiency. First, a local-communication-based congestion control framework is introduced to distribute random access attempts evenly over time under bursty traffic. Second, a multi-agent reinforcement learning based preamble selection framework is designed to increase the access capacity under a fixed number of preambles. Combining the two mechanisms yields superior performance under various 3GPP-specified machine-type communication evaluation scenarios, achieving much lower access latency and fewer access failures. / Master of Science / In the age of the internet of things (IoT), a massive number of devices is expected to connect to wireless networks in a sporadic and unpredictable manner. The wireless connection is usually established through contention-based random access, a four-step handshaking process initiated by a device sending a randomly selected preamble sequence to the base station. While different preambles are orthogonal, a preamble collision occurs when two or more devices send the same preamble to a base station simultaneously, and a device experiences access failure if its transmitted preamble cannot be successfully received and decoded. A failed device must wait for another random access opportunity to restart this process, increasing access delay and resource consumption. Random access control in massive IoT systems is challenged by the increased access intensity, which results in a higher collision probability.
In this work, we aim to provide better scalability, real-time quality of service management, and resource efficiency in random access control for such systems. Towards this end, we introduce 1) a local communication based congestion control framework by enabling a device to cooperate with neighboring devices and 2) a multi-agent reinforcement learning (MARL) based preamble selection framework by leveraging the ability of MARL in forming the decision-making policy through the collected experience. The introduced frameworks are evaluated under the 3GPP-specified scenarios and shown to outperform the existing standard solutions in terms of achieving lower access delays with fewer access failures.
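The collision mechanism described above can be sketched in a few lines (a toy model, not the thesis's framework): each device picks a preamble uniformly at random, and a preamble succeeds only when exactly one device chose it. The pool of 54 preambles follows the usual LTE contention-based configuration; treat that number as an assumption here.

```python
import random

def random_access_round(num_devices, num_preambles, rng):
    """One contention round: each device picks a preamble uniformly at
    random; a preamble succeeds only if exactly one device chose it."""
    counts = [0] * num_preambles
    for _ in range(num_devices):
        counts[rng.randrange(num_preambles)] += 1
    return sum(1 for c in counts if c == 1)

rng = random.Random(0)
rounds = 1000
for load in (10, 54, 200):   # devices contending for 54 preambles per round
    avg = sum(random_access_round(load, 54, rng) for _ in range(rounds)) / rounds
    print(f"{load} devices -> {avg:.1f} successful accesses on average")
```

Running this shows why access intensity matters: successes grow with load only up to a point, then collisions dominate and throughput collapses, which is exactly the regime the congestion control and learned preamble selection target.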
3

Towards a Deep Reinforcement Learning based approach for real-time decision making and resource allocation for Prognostics and Health Management applications

Ludeke, Ricardo Pedro João January 2020 (has links)
Industrial operational environments are stochastic and can have complex system dynamics that introduce multiple levels of uncertainty. This uncertainty leads to sub-optimal decision making and resource allocation. Digitalisation and automation of production equipment and the maintenance environment enable predictive maintenance, meaning that equipment can be stopped for maintenance at the optimal time. Resource constraints in maintenance capacity could, however, result in further undesired downtime if maintenance cannot be performed when scheduled. In this dissertation, a multi-agent deep reinforcement learning based approach to decision making is investigated to determine the optimal maintenance scheduling policy for a fleet of assets under maintenance resource constraints. By considering the underlying system dynamics of maintenance capacity, as well as the health state of individual assets, a near-optimal decision-making policy is found that increases equipment availability while also maximising maintenance capacity. The implemented solution is compared to a run-to-failure corrective maintenance strategy, a constant-interval preventive maintenance strategy and a condition-based predictive maintenance strategy. The proposed approach outperformed traditional maintenance strategies across several asset and operational maintenance performance metrics. It is concluded that deep reinforcement learning based decision making for asset health management and resource allocation is more effective than human-based decision making. / Dissertation (MEng (Mechanical Engineering))--University of Pretoria, 2020. / Mechanical and Aeronautical Engineering / MEng (Mechanical Engineering) / Unrestricted
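The three baseline strategies the dissertation compares against can be illustrated with a toy degradation model (an editorial sketch; the degradation rates and costs are invented for illustration, not taken from the dissertation):

```python
import random

def simulate(policy, horizon=1000, seed=0):
    """Toy degradation model: health drops by a random step each period.
    Failure (health <= 0) costs far more than planned maintenance.
    `policy(t, health)` returns True to perform maintenance now."""
    rng = random.Random(seed)
    health, cost = 100.0, 0.0
    for t in range(horizon):
        health -= rng.uniform(0, 3)
        if health <= 0:
            cost += 50.0        # corrective repair after failure
            health = 100.0
        elif policy(t, health):
            cost += 10.0        # cheaper planned maintenance
            health = 100.0
    return cost

run_to_failure  = lambda t, h: False               # corrective only
fixed_interval  = lambda t, h: t > 0 and t % 50 == 0  # constant-interval preventive
condition_based = lambda t, h: h < 20.0            # predictive, health-triggered

for name, policy in [("run-to-failure", run_to_failure),
                     ("fixed-interval", fixed_interval),
                     ("condition-based", condition_based)]:
    print(f"{name}: total cost {simulate(policy):.0f}")
```

An RL scheduler would replace the hand-written `policy` with a learned one that also accounts for shared, constrained maintenance capacity across the fleet, which none of these per-asset rules can do.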
4

Adaptive manufacturing: dynamic resource allocation using multi-agent reinforcement learning

Heik, David, Bahrpeyma, Fouad, Reichelt, Dirk 13 February 2024 (has links)
The global value creation networks have experienced increased volatility and dynamic behavior in recent years, accelerating a trend already evident in the shortening of product and technology cycles. In addition, the manufacturing industry increasingly allows customers to make specific adjustments to their products at the time of ordering. These changes require a high level of flexibility and adaptability not only from cyber-physical systems but also from employees and supervisory production planning. As a result, the development of control and monitoring mechanisms becomes more complex. The production process must also be adjusted dynamically in response to unforeseen events (disrupted supply chains, machine breakdowns, or staff absences) in order to make the most effective and efficient use of the available production resources. In recent years, reinforcement learning (RL) research has gained increasing popularity in strategic planning thanks to its ability to handle uncertainty in dynamic environments in real time. RL has been extended to multiple agents cooperating on complex tasks as a solution to complex problems. Despite its potential, the real-world application of multi-agent reinforcement learning (MARL) to manufacturing problems, such as flexible job-shop scheduling, has been approached less frequently. The main reason is that most applications in this field are subject to specific requirements as well as confidentiality obligations. This makes it difficult for the research community to obtain access to them, presenting substantial challenges for the implementation of these tools. ...
5

Decentralized Integration of Distributed Energy Resources into Energy Markets with Physical Constraints

Chen Feng (18556528) 29 May 2024 (has links)
<p dir="ltr">With the growing installation of distributed energy resources (DERs) at homes, more residential households are able to reduce their overall energy cost by storing unused energy in a storage battery when renewable energy generation is abundant, and using the stored energy when renewable generation is insufficient and demand is high. It could be even more economical for a household if energy could be traded and shared among neighboring households. Despite the great economic benefit of DERs, they can also make it more challenging to ensure the stability of the grid, due to the decentralization of agents' activities.</p><p><br></p><p dir="ltr">This thesis presents two approaches that combine market and control mechanisms to address these challenges. In the first work, we focus on the integration of DERs into local energy markets. We introduce a peer-to-peer (P2P) local energy market and propose a consensus multi-agent reinforcement learning (MARL) framework, which allows agents to develop strategies for trading and decentralized voltage control within the P2P market. It is compared to both the fully decentralized framework and the centralized training and decentralized execution (CTDE) framework. Numerical results reveal that under each framework the system converges to a dynamic balance with guaranteed system stability as each agent gradually learns an approximately optimal strategy. Theoretical results also prove the convergence of the consensus MARL algorithm under certain conditions.</p><p dir="ltr">In the second work, we introduce a mean-field game framework for the integration of DERs into wholesale energy markets. This framework helps DER owners automatically learn optimal decision policies in response to market price fluctuations and their own variable renewable energy outputs. 
We prove the existence of a mean-field equilibrium (MFE) for the wholesale energy market, and we develop a heuristic decentralized mean-field learning algorithm to converge to an MFE, taking into consideration the demand/supply shock and flexible demand. Our numerical experiments point to convergence to an MFE and show that our framework effectively reduces peak load and price fluctuations, especially during exogenous demand or supply shocks.</p>
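The kind of price-responsive storage behavior the DER owners learn can be sketched with a simple threshold policy (an editorial illustration with invented prices and battery parameters, not the thesis's learned policy): charge when energy is cheap, discharge when it is expensive.

```python
def battery_policy(price, soc, capacity=10.0, rate=2.0,
                   buy_below=0.10, sell_above=0.30):
    """Threshold policy: charge when energy is cheap, discharge when
    expensive, stay idle otherwise. Returns the energy bought (+) or
    sold (-) this period, in kWh."""
    if price < buy_below and soc < capacity:
        return min(rate, capacity - soc)
    if price > sell_above and soc > 0:
        return -min(rate, soc)
    return 0.0

prices = [0.05, 0.08, 0.20, 0.35, 0.40, 0.15]   # $/kWh over six periods
soc, profit = 0.0, 0.0
for p in prices:
    action = battery_policy(p, soc)
    soc += action
    profit -= action * p        # pay to charge, earn when discharging
print(f"final state of charge: {soc} kWh, net profit: ${profit:.2f}")
```

In the mean-field setting, every household runs a policy like this, and the aggregate of their actions feeds back into the price, which is exactly the fixed-point structure an MFE captures.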
6

Multi-Task Reinforcement Learning: From Single-Agent to Multi-Agent Systems

Trang, Matthew Luu 06 January 2023 (has links)
Generalized collaborative drones are a technology with many potential benefits. General-purpose drones that can handle exploration, navigation, manipulation, and more without having to be reprogrammed would be an immense breakthrough for the usability and adoption of the technology. The ability to develop these multi-task, multi-agent drone systems is limited by the lack of available training environments, as well as by deficiencies of multi-task learning due to a phenomenon known as catastrophic forgetting. In this thesis, we present a set of simulation environments for exploring the abilities of multi-task drone systems and provide a platform for testing agents in incremental single-agent and multi-agent learning scenarios. The multi-task platform extends an existing drone simulation environment written in Python using the PyBullet physics simulation engine, incorporating these environments. Using this platform, we present an analysis of incremental learning and detail its beneficial impacts on multi-task learning, with respect to learning speed and catastrophic forgetting. Finally, we introduce a novel algorithm, Incremental Learning with Second-Order Approximation Regularization (IL-SOAR), to mitigate some of the effects of catastrophic forgetting in multi-task learning. We show the impact of this method and contrast its performance with a multi-agent multi-task approach using a centralized policy-sharing algorithm. / Master of Science / Machine learning techniques allow drones to be trained to achieve tasks that are otherwise time-consuming or difficult. The goal of this thesis is to facilitate the work of creating these complex drone machine learning systems by exploring reinforcement learning (RL), a field of machine learning that involves learning the correct actions to take through experience. Currently, RL methods are effective in the design of drones that are able to solve one particular task. 
The next step in this technology is to develop RL systems which are able to handle generalization and perform well across multiple tasks. In this thesis, simulation environments for drones to learn complex tasks are created, and algorithms which are able to train drones in multiple hard tasks are developed and tested. We explore the benefits of using a specific multi-task training technique known as Incremental Learning. Additionally, we consider one of the prohibitive factors of multi-task machine learning-based solutions, the degradation problem of agent performance on previously learned tasks, known as catastrophic forgetting. We create an algorithm that aims to prevent the impact of forgetting when training drones sequentially on new tasks. We contrast this approach with a multi-agent solution, where multiple drones learn simultaneously across the tasks.
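Second-order approximation regularization in the style the abstract names is commonly realized as a quadratic penalty anchoring weights to their previous-task values, weighted by a diagonal curvature (Fisher-information) estimate. The sketch below is a generic illustration of that idea, not the IL-SOAR implementation; all numbers are invented.

```python
def importance_penalty(params, anchor, fisher, strength=1.0):
    """Quadratic penalty anchoring parameters to values learned on a
    previous task, weighted by each parameter's estimated importance
    (a diagonal second-order / Fisher-information approximation)."""
    return strength * sum(f * (p - a) ** 2
                          for p, a, f in zip(params, anchor, fisher))

def regularized_loss(task_loss, params, anchor, fisher, strength=1.0):
    """New-task loss plus the penalty: important weights resist change,
    unimportant ones remain free to adapt, mitigating forgetting."""
    return task_loss + importance_penalty(params, anchor, fisher, strength)

old = [1.0, -0.5]     # weights after training on task A
fisher = [5.0, 0.1]   # weight 0 mattered for task A, weight 1 barely did
new = [1.2, 2.0]      # candidate weights while learning task B
print(regularized_loss(0.3, new, old, fisher))
```

Note how the large drift in the unimportant weight (index 1) is penalized far less than the small drift in the important one would be at equal magnitude; that asymmetry is what preserves old-task performance.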
7

Non-Reciprocating Sharing Methods in Cooperative Q-Learning Environments

Cunningham, Bryan 28 August 2012 (has links)
Past research on multi-agent simulation with cooperative reinforcement learning (RL) for homogeneous agents focuses on developing sharing strategies that are adopted and used by all agents in the environment. These sharing strategies are considered reciprocating because all participating agents have a predefined agreement regarding what type of information is shared, when it is shared, and how the participating agents' policies are subsequently updated. The sharing strategies are specifically designed around manipulating this shared information to improve learning performance. This thesis targets situations where the assumption of a single sharing strategy employed by all agents is not valid. This work seeks to address how agents with no predetermined sharing partners can exploit groups of cooperatively learning agents to improve learning performance compared to independent learning. Specifically, several intra-agent methods are proposed that do not assume a reciprocating sharing relationship and that leverage the pre-existing agent interface associated with Q-learning to expedite learning. The other agents' functions and their sharing strategies are unknown and inaccessible from the point of view of the agent(s) using the proposed methods. The proposed methods are evaluated in simulation on physically embodied agents from the multi-agent cooperative robotics field learning a navigation task. The experiments conducted focus on the effects of the following factors on the performance of the proposed non-reciprocating methods: scaling the number of agents in the environment, limiting the communication range of the agents, and scaling the size of the environment. / Master of Science
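The non-reciprocating idea can be sketched as follows (an editorial illustration, not the thesis's methods): an agent consults peers through the ordinary tabular Q-learning interface, borrowing the most confident peer's greedy action, while the peers are never updated or informed in return.

```python
class QAgent:
    """Minimal tabular Q-learner exposing only the standard interface:
    other agents can query its action values but learn nothing back."""
    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.9):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma = alpha, gamma

    def update(self, s, a, reward, s_next):
        target = reward + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

    def best_action(self, s):
        return max(range(len(self.q[s])), key=lambda a: self.q[s][a])

def borrow_policy(agent, peers, s):
    """Non-reciprocating sharing: consult the most confident agent's
    greedy action for this state; peers are never updated in return."""
    candidates = [agent] + peers
    best = max(candidates, key=lambda ag: max(ag.q[s]))
    return best.best_action(s)

novice, expert = QAgent(2, 2), QAgent(2, 2)
expert.update(0, 1, reward=1.0, s_next=1)   # expert has learned action 1 pays off
print(borrow_policy(novice, [expert], 0))   # novice adopts the expert's action
```

The key property is that `borrow_policy` needs nothing beyond read access to peer Q-values, so it works regardless of what sharing strategy (if any) the peers themselves use.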
8

Communication approaches in Multi-Agent Reinforcement Learning

Nechai, Vladyslav 22 October 2024 (has links)
In decentralised multi-agent reinforcement learning, communication can be used as a measure to increase coordination among the agents. At the same time, the essence of message exchange and its contribution to successful goal achievement can only be established with domain knowledge of a given environment. This thesis focuses on understanding the impact of communication on a decentralised multi-agent system. To achieve this, communication is employed and studied in the context of Urban Air Mobility, in particular the vertiport terminal area control problem. The experimental framework proposed in this work, which supports different information exchange protocols, allows us to investigate whether and how the agents leverage their communication capabilities. The acquired simulation results show that, in the terminal area of a vertiport, aircraft controlled in a decentralised way are capable of proper self-organisation, similar to the structured technique formulated in [Bertram and Wei(2020)]. A study of their communication mechanisms indicates that, through different protocols, the agents learn to signal their intent to enter a vertiport regardless of environment settings.
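A minimal sketch of intent signalling in a terminal area (an editorial illustration; the message format and priority rule are invented, not the thesis's protocol): each aircraft broadcasts whether it wants to enter and an urgency value, and every agent applies the same rule to the shared messages, so they agree on who enters without a central controller.

```python
def resolve_entries(intents, capacity=1):
    """Decentralised intent signalling: each agent broadcasts whether it
    wants to enter the terminal area and an urgency value (here, lower
    remaining fuel means higher urgency). All agents apply the same
    deterministic rule to the shared messages, so they reach the same
    conclusion about who enters."""
    requesting = [(priority, agent_id)
                  for agent_id, (wants_entry, priority) in intents.items()
                  if wants_entry]
    requesting.sort()                    # lowest fuel first
    return {agent_id for _, agent_id in requesting[:capacity]}

# messages: agent -> (wants to enter?, remaining fuel)
messages = {"A1": (True, 12.0), "A2": (True, 4.5), "A3": (False, 30.0)}
print(resolve_entries(messages))         # the most urgent requester enters
```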
9

Beyond Monte Carlo: leveraging temporal difference learning for superior performance in dynamic resource allocation

Heik, David, Bahrpeyma, Fouad, Reichelt, Dirk 19 February 2025 (has links)
The application of reinforcement learning to dynamic industrial scheduling has gained increasing attention due to its capability to optimize complex manufacturing processes. With the advent of Industry 4.0 and the rise of smart manufacturing, new challenges arise that require innovative approaches, particularly in environments with a high degree of variability and uncertainty. Previous research has demonstrated that reinforcement learning, in particular Monte Carlo methods, is highly effective in optimizing resource allocation in job-shop scheduling scenarios. Although Monte Carlo methods are effective where reward functions are clear and evaluated retrospectively, real-world manufacturing systems often require more dynamic decision-making capabilities in real time, for which temporal-difference methods are more appropriate. Despite the effectiveness of reinforcement learning in this area, there is a gap in understanding how different reward functions affect the learning process. In this study, we systematically examined multiple reward functions within a temporal-difference system, applying a sensitivity analysis to assess their effects during the training and evaluation phases. Our results demonstrated that the overall performance of the production line improved despite the inherent complexity and challenges posed by temporal-difference methods. Our findings demonstrate the effectiveness of multi-agent reinforcement learning for improving manufacturing efficiency and provide implications for future research on scalable, real-time industrial scheduling.
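The distinction the abstract draws can be made concrete with the two textbook value-update rules (a generic sketch, not the paper's system): Monte Carlo must wait for the whole episode before updating, whereas TD(0) bootstraps from the very next state and can therefore react mid-schedule.

```python
def monte_carlo_update(v, episode, alpha=0.1, gamma=1.0):
    """Monte Carlo: wait for the full episode, then move each visited
    state's value toward the observed return from that point onward."""
    g = 0.0
    for state, reward in reversed(episode):
        g = reward + gamma * g
        v[state] += alpha * (g - v[state])

def td_update(v, state, reward, next_state, alpha=0.1, gamma=1.0):
    """TD(0): update immediately from the one-step bootstrapped target;
    no need to wait for the episode (or production schedule) to finish."""
    v[state] += alpha * (reward + gamma * v[next_state] - v[state])

v_mc = {"s0": 0.0, "s1": 0.0, "end": 0.0}
v_td = {"s0": 0.0, "s1": 0.0, "end": 0.0}
episode = [("s0", 0.0), ("s1", 1.0)]   # s0 -> s1 (r=0), then s1 -> end (r=1)

monte_carlo_update(v_mc, episode)      # only possible after the episode ends
td_update(v_td, "s0", 0.0, "s1")       # available after every single step
td_update(v_td, "s1", 1.0, "end")
print(v_mc, v_td)
```

After one pass, Monte Carlo has already credited `s0` with the delayed reward, while TD(0) has not yet propagated it back; TD trades that slower credit assignment for per-step, real-time applicability, which is the property the study exploits.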
10

YoloRL: simplifying dynamic scheduling through efficient action selection based on multi-agent reinforcement learning

Heik, David, Bahrpeyma, Fouad, Reichelt, Dirk 19 February 2025 (has links)
In modern manufacturing environments, it is essential to be able to react autonomously and dynamically to unpredictable events in an automated manner in order to schedule production cost-effectively. One prerequisite for the development of this technology is the progressive integration of cyber-physical systems into industrial sectors. Data generated by industry constitute the basis for operative and strategic decision-making in this context. Collecting these data in real time, transforming them if necessary, and analyzing them to support time-critical decision-making is a major challenge. This paper presents a novel approach that simplifies dynamic scheduling through efficient action selection. YoloRL, the reinforcement learning based method presented in this paper, substantially reduces the complexity of the training process. To identify promising action sequences, YoloRL does not consider all of the state information of an episode; it takes into account only the initial state. As a result, training complexity is significantly reduced while robust and adaptive control is achieved. This study improves the manufacturing system's performance by minimizing the overall completion time (for any given order). Experimental results indicate that the proposed method leads to faster generalization of the domain knowledge and yields a powerful policy that is both efficient and reliable in dynamic environments. With YoloRL, the overall completion time is reduced by a moderate but quantifiable amount compared with the traditional approach. In accordance with our experimental results, the proposed methodology accelerates and stabilizes the training process. 
Thus, a reliable and generalizable policy network is established, which can nevertheless respond dynamically to unforeseen events and changing environmental conditions due to its adaptability. The policy ...
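The initial-state-only idea can be illustrated with a toy scheduling sketch (an editorial illustration under stated assumptions, not the YoloRL algorithm): instead of deciding action by action as states unfold, the controller commits to a full job sequence from the initial order alone, scoring sampled candidate sequences against the completion-time objective.

```python
import random

def run_schedule(order, sequence):
    """Toy line: processing each job costs its length; a good sequence
    (shortest job first) minimises the total completion time."""
    t, total = 0, 0
    for job in sequence:
        t += order[job]
        total += t
    return total

def sequence_from_initial_state(order, rng, trials=200):
    """Initial-state-only selection in the spirit described above:
    commit to a complete job sequence given only the initial order,
    scoring sampled candidates instead of deciding step by step."""
    jobs = list(order)
    best_seq, best_cost = None, float("inf")
    for _ in range(trials):
        candidate = rng.sample(jobs, len(jobs))
        cost = run_schedule(order, candidate)
        if cost < best_cost:
            best_seq, best_cost = candidate, cost
    return best_seq, best_cost

order = {"j1": 5, "j2": 2, "j3": 8, "j4": 1}   # job -> processing time
rng = random.Random(0)
seq, cost = sequence_from_initial_state(order, rng)
print(seq, cost)
```

Because candidates are evaluated against the initial state only, the search never touches intermediate state information, which is what makes the training problem so much smaller; handling mid-episode disruptions then falls to the learned policy's adaptability rather than to the selection step.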
