41

Information Freshness: How To Achieve It and Its Impact On Low-Latency Autonomous Systems

Choudhury, Biplav 03 June 2022 (has links)
In the context of wireless communications, low-latency autonomous systems continue to grow in importance. Some applications of autonomous systems where low-latency communication is essential are (i) vehicular networks, whose safety performance depends on how recently each vehicle is updated on its neighboring vehicles' locations, (ii) IoT networks, where updates from IoT devices need to be aggregated appropriately at the monitoring station before the information gets stale, so that temporal and spatial information can be extracted from them, and (iii) smart grids, where sensors and controllers need to track the most recent state of the system to tune system parameters dynamically. Each of these applications differs in the connectivity between the source and the destination. First, vehicular networks involve a broadcast network where each vehicle broadcasts its packets to all other vehicles. Second, in the case of UAV-assisted IoT networks, packets generated at multiple IoT devices are transmitted to a final destination via relays. Finally, for the smart grid, and for distributed systems in general, each source can have varying and unique destinations. In terms of connectivity, they can therefore be categorized into one-to-all, all-to-one, and a variable relationship between the number of sources and destinations. Other major differences between the applications include the impact of mobility, the importance of a reduced AoI, and whether AoI is measured in a centralized or distributed manner. This wide variety of application requirements makes it challenging to develop scheduling schemes that universally address AoI minimization. All these applications involve generating time-stamped status updates at a source, which are then transmitted to their destination over a wireless medium. The timely reception of these updates at the destination decides the operating state of the system, because the fresher the information at the destination, the better its awareness of the system state and the better its control decisions. This freshness of information is not the same as maximizing the throughput or minimizing the delay. While throughput can ideally be maximized by sending data as fast as possible, this may saturate the receiver, resulting in queuing, contention, and other delays. On the other hand, these delays can be minimized by sending updates slowly, but this may cause high inter-arrival times. Therefore, a new metric called the Age of Information (AoI) has been proposed to measure the freshness of information, accounting for many facets that influence data availability. In simple terms, AoI is measured at the destination as the time elapsed since the generation time of the most recently received update. AoI thus incorporates both the delay and the inter-packet arrival time, which makes it a much better metric for measuring end-to-end latency and hence for characterizing the performance of such time-sensitive systems. These basic characteristics of AoI are explained in detail in Chapter 1. Overall, the main contribution of this dissertation is developing scheduling and resource allocation schemes targeted at improving the AoI of autonomous systems with different types of connectivity, namely vehicular networks, UAV-assisted IoT networks, and smart grids, and then characterizing and quantifying the benefits of a reduced AoI from the application perspective.
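As a compact restatement of the definition just given (the notation below is ours, not the dissertation's), the instantaneous AoI at the destination can be written as:

```latex
% Instantaneous Age of Information at the destination at time t,
% where u(t) is the generation timestamp of the freshest update received so far:
\Delta(t) \;=\; t - u(t)
% Between receptions \Delta(t) grows linearly with slope 1; when an update
% generated at u_i arrives at time t_i, the age drops to t_i - u_i.
% The metric therefore captures both the delivery delay and the
% inter-arrival time of updates, as described above.
```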
In the first contribution, we look into minimizing AoI for the case of broadcast networks with one-to-all connectivity between the source and destination devices by considering vehicular networks. While vehicular networks have been studied in terms of AoI minimization, the impact of mobility and the benefit of a reduced AoI from the application perspective have not been investigated. The mobility of the vehicles is realistically modeled using the Simulation of Urban Mobility (SUMO) software to account for overtaking, lane changes, etc. We propose a safety metric that indicates the collision risk of a vehicle and carry out a simulation-based study on the ns-3 simulator to study its relation to AoI. We see that the broadcast rate in a Dedicated Short Range Communications (DSRC) network that minimizes the system AoI also has the least collision risk, signifying that reducing AoI improves the on-road safety of the vehicles. However, we also show that this relationship is not universally true and that the mobility of the vehicles becomes a crucial aspect. Therefore, we propose a new metric called the Trackability-aware AoI (TAoI), which ensures that vehicles with unpredictable mobility broadcast at a faster rate while vehicles that are predictable broadcast at a reduced rate. The results obtained show that minimizing TAoI provides much better on-road safety than plain AoI minimization, which points to the importance of mobility in such applications. In the second contribution, we focus on networks with all-to-one connectivity, where packets from multiple sources are transmitted to a single destination, taking IoT networks as an example. Here, multiple IoT devices measure a physical phenomenon and transmit these measurements to a central base station (BS). However, under certain scenarios, the BS and IoT devices are unable to communicate directly, and this necessitates the use of UAVs as relays. This creates a two-hop scenario that has not been studied for AoI minimization in UAV networks. In the first hop, packets have to be sampled from the IoT devices by the UAV, and in the second hop the updates are forwarded from the UAVs to the BS. Such networks are called UAV-assisted IoT networks. We show that under ideal conditions, with a generate-at-will traffic generation model and lossless wireless channels, the Maximal Age Difference (MAD) scheduler is the optimal AoI-minimizing scheduler. When the ideal conditions do not apply and more practical conditions are considered, a reinforcement learning (RL)-based scheduler that can account for packet generation patterns and channel qualities is desirable. Therefore, we propose a Deep Q-Network (DQN)-based scheduler, which outperforms MAD and all other schedulers under general conditions. However, the DQN-based scheduler suffers from scalability issues in large networks. Therefore, another type of RL algorithm, Proximal Policy Optimization (PPO), is proposed for larger networks. Additionally, the PPO-based scheduler can account for changes in the network conditions, which the DQN-based scheduler was not able to do. This ensures the trained model can be deployed in environments that may differ from the training environment. In the final contribution, AoI is studied in networks with varying connectivity between the source and destination devices. A typical example of such a distributed network is the smart grid, where multiple devices exchange state information to ensure the grid operates in a stable state.
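To make the age-difference idea concrete, the following is a minimal sketch of how a MAD-style scheduler could pick the next IoT device to serve in the two-hop setting, assuming the age of each device's information is tracked at both the UAV and the BS; the variable names are illustrative and not taken from the dissertation:

```python
def mad_schedule(age_at_bs, age_at_uav):
    """Pick the IoT source whose update would reduce the destination AoI the most.

    A minimal sketch of the Maximal Age Difference (MAD) idea for the two-hop
    UAV-assisted setting: the age difference (age at the BS minus age at the UAV)
    is the AoI reduction obtained if that source's freshest packet were delivered
    now. Both inputs are per-device age lists (hypothetical bookkeeping).
    """
    diffs = [bs - uav for bs, uav in zip(age_at_bs, age_at_uav)]
    return max(range(len(diffs)), key=lambda i: diffs[i])


# Example: device 1 offers the largest age reduction, so it is scheduled next.
print(mad_schedule(age_at_bs=[5.0, 9.0, 4.0], age_at_uav=[2.0, 1.0, 3.0]))  # -> 1
```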
To investigate AoI minimization and its impact on the smart grid, a co-simulation platform is designed where the 5G network is modeled in Python and the smart grid is modeled in PSCAD/MATLAB. In the first part of the study, the suitability of 5G for supporting smart grid operations is investigated. Based on the encouraging results that 5G can support a smart grid, we focus on the schedulers at the 5G RAN to minimize the AoI. It is seen that the AoI-based schedulers provide much better stability than traditional 5G schedulers like proportional fair and round-robin. However, the MAD scheduler, which has been shown to be optimal for a variety of scenarios, is no longer optimal, as it cannot account for the connectivity among the devices. Additionally, distributed networks with heterogeneous sources will, in addition to the varying connectivity, have differently sized packets requiring different numbers of resource blocks (RBs) to transmit, different packet generation patterns, channel conditions, etc. This motivates an RL-based approach. Hence, we propose a DQN-based scheduler that can take these factors into account, and results show that the DQN-based scheduler outperforms all other schedulers in all considered conditions. / Doctor of Philosophy / Age of Information (AoI) is an exciting new metric, as it is able to characterize the freshness of information, where freshness means how representative the information is of the current system state. Therefore, it is being actively investigated for a variety of autonomous systems that rely on having the most up-to-date information on the current state. Some examples are vehicular networks, UAV networks, and smart grids. Vehicular networks need the real-time locations of neighboring vehicles to make maneuver decisions, UAVs have to collect the most recent information from IoT devices for monitoring purposes, and devices in a smart grid need to ensure that they have the most recent information on the desired system state. From a communication point of view, each of these scenarios presents a different type of connectivity between the source and the destination. First, the vehicular network is a broadcast network where each vehicle broadcasts its packets to every other vehicle. Second, in the UAV network, multiple devices transmit their packets to a single destination via a relay. Finally, in the smart grid, and in distributed networks generally, every source can have different and unique destinations. In these applications, AoI becomes a natural choice to measure the system performance, as the fresher the information at the destination, the better its awareness of the system state, which allows it to make better control decisions to reach the desired objective. Therefore, in this dissertation, we use mathematical analysis and simulation-based approaches to investigate different scheduling and resource allocation policies to improve the AoI for the above-mentioned scenarios. We also show that the reduced AoI improves the system performance, i.e., better on-road safety for vehicular networks and better stability for smart grid applications. The results obtained in this dissertation show that communication and networking protocols for time-sensitive applications requiring low latency have to be optimized to improve AoI. This is in contrast to most modern-day communication protocols, which are targeted at improving the throughput or minimizing the delay.
42

Deep Reinforcement Learning for Building Control : A comparative study for applying Deep Reinforcement Learning to Building Energy Management / Djup förstärkningsinlärning för byggnadskontroll : En jämförande studie för att tillämpa djup förstärkningsinlärning på byggnadsenergihushållning

Zheng, Wanfu January 2022 (has links)
Energy and the environment have become hot topics around the world. The building sector accounts for a high proportion of energy consumption, with over one-third of energy use globally. A variety of optimization methods have been proposed for building energy management, which are mainly divided into two types: model-based and model-free. Model Predictive Control is a model-based method but is not widely adopted by the building industry, as it requires too much expertise and time to develop a model. Model-free Deep Reinforcement Learning (DRL) has seen successful applications in game playing and robotics control. Therefore, we explored the effectiveness of DRL algorithms applied to building control and investigated which DRL algorithm performs best. Three DRL algorithms were implemented, namely Deep Deterministic Policy Gradient (DDPG), Double Deep Q-learning (DDQN) and Soft Actor-Critic (SAC). We used the Building Optimization Testing (BOPTEST) framework, a standardized virtual testbed, to test the DRL algorithms. Performance is evaluated by two Key Performance Indicators (KPIs): thermal discomfort and operational cost. The results show that the DDPG agent performs best and outperforms the baseline, reducing thermal discomfort by 91.5% and 18.3%, and operational cost by 11.0% and 14.6%, during the peak and typical heating periods, respectively. The DDQN and SAC agents do not show a clear performance advantage over the baseline. This research highlights the excellent control performance of the DDPG agent, suggesting that applying DRL to building control can achieve better performance than conventional control methods. / Energi och miljö blir heta ämnen i världen. Byggsektorn står för en hög andel av energiförbrukningen, med över en tredjedel av energianvändningen globalt. En mängd olika optimeringsmetoder har föreslagits för Building Energy Management, vilka huvudsakligen är uppdelade i två typer: modellbaserade och modellfria. Model Predictive Control är en modellbaserad metod men är inte allmänt antagen av byggbranschen eftersom det kräver för mycket expertis och tid för att utveckla en modell. Modellfri Deep Reinforcement Learning (DRL) har framgångsrika tillämpningar inom spel och robotstyrning. Därför undersökte vi effektiviteten av DRL-algoritmerna som tillämpas på byggnadskontroll och undersökte vilken DRL-algoritm som presterar bäst. Tre DRL-algoritmer implementerades, nämligen Deep Deterministic Policy Gradient (DDPG), Double Deep Q Learning (DDQN) och Soft Actor Critic (SAC). Vi använde ramverket Building Optimization Testing (BOPTEST), en standardiserad virtuell testbädd, för att testa DRL-algoritmerna. Prestandan utvärderas av två Key Performance Indicators (KPIs): termiskt obehag och driftskostnad. Resultaten visar att DDPG-medlet presterar bäst och överträffar baslinjen med besparingen av termiskt obehag med 91.5% och 18.3%, och besparingen av driftskostnaden med 11.0% och 14.6% under topp och typisk uppvärmning perioder, respektive. DDQN- och SAC-agenter visar inte en klar fördel i prestanda jämfört med baslinjen. Denna forskning belyser DDPG-medlets utmärkta prestanda, vilket tyder på att tillämpningen av DRL i byggnadskontroll kan uppnå bättre prestanda än den konventionella metoden.
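As a rough illustration of the two KPIs used in this comparison, the sketch below accumulates thermal discomfort and operational cost from a simulated trace; the comfort band, time step, and variable names are assumptions for illustration, not BOPTEST's actual interface:

```python
def evaluate_kpis(temps, powers_kw, prices, dt_hours=0.25,
                  lower=21.0, upper=24.0):
    """Accumulate the two KPIs used to compare controllers.

    A minimal sketch, not the BOPTEST implementation: discomfort is integrated
    as Kelvin-hours outside an assumed comfort band [lower, upper] degC, and
    operational cost is energy (kWh) times a time-varying price (currency/kWh).
    """
    discomfort_kh = 0.0
    cost = 0.0
    for temp, power_kw, price in zip(temps, powers_kw, prices):
        deviation = max(lower - temp, temp - upper, 0.0)  # K outside the band
        discomfort_kh += deviation * dt_hours
        cost += power_kw * dt_hours * price
    return discomfort_kh, cost


# Example: three 15-minute steps at constant price.
print(evaluate_kpis([20.0, 22.0, 25.0], [5.0, 4.0, 6.0], [0.2, 0.2, 0.2]))
```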
43

Reinforcement learning for EV charging optimization : A holistic perspective for commercial vehicle fleets

Cording, Enzo Alexander January 2023 (has links)
Recent years have seen an unprecedented uptake in electric vehicles, driven by the global push to reduce carbon emissions. At the same time, intermittent renewables are being deployed increasingly. These developments are putting flexibility measures such as dynamic load management in the spotlight of the energy transition. Flexibility measures must consider EV charging, as it has the ability to introduce grid constraints: in Germany, the cumulative power of all EV onboard chargers amounts to ca. 120 GW, while the German peak load only amounts to 80 GW. Commercial operations have strong incentives to optimize charging and flatten peak loads in real time, given that the highest quarter-hour can determine the power-related energy bill, and that a blown fuse due to overloading can halt operations. Increasing research efforts have therefore gone into real-time-capable optimization methods. Reinforcement Learning (RL) has particularly gained attention due to its versatility, performance and real-time capabilities. This thesis implements such an approach and introduces FleetRL as a realistic RL environment for EV charging, with a focus on commercial vehicle fleets. Through its implementation, it was found that RL saved up to 83% compared to static benchmarks, and that grid overloading was entirely avoided in some scenarios by sacrificing small portions of SOC, or by delaying the charging process. Linear optimization with one year of perfect knowledge outperformed RL, but reached its practical limits in one use case, where a feasible solution could not be found by the solver. Overall, this thesis makes a strong case for RL-based EV charging. It further provides a foundation which can be built upon: a modular, open-source software framework that integrates an MDP model, schedule generation, and non-linear battery degradation. / Elektrifieringen av transportsektorn är en nödvändig men utmanande uppgift. I kombination med ökande solcellsproduktion och förnybara energikällor skapar det ett dilemma för elnätet som kräver omfattande flexibilitetsåtgärder. Dessa åtgärder måste inkludera laddning av elbilar, ett fenomen som har lett till aldrig tidigare skådade belastningstoppar. Ur ett kommersiellt perspektiv är incitamentet att optimera laddningsprocessen och säkerställa drifttid. Forskningen har fokuserat på realtidsoptimeringsmetoder som Deep Reinforcement Learning (DRL). Denna avhandling introducerar FleetRL som en ny RL-miljö för EV-laddning av kommersiella flottor. Genom att tillämpa ramverket visade det sig att RL sparade upp till 83% jämfört med statiska riktmärken, och att överbelastning av nätet helt kunde undvikas i de flesta scenarier. Linjär optimering överträffade RL men nådde sina gränser i snävt begränsade användningsfall. Efter att ha funnit ett positivt business case för varje kommersiellt användningsområde, ger denna avhandling ett starkt argument för RL-baserad laddning och en grund för framtida arbete via praktiska insikter och ett modulärt mjukvaruramverk med öppen källkod.
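To illustrate the billing mechanism mentioned above, the sketch below computes the highest quarter-hour average load from a site's load profile; the one-minute sampling assumption and the function name are illustrative and not taken from FleetRL:

```python
def peak_quarter_hour_kw(load_kw, samples_per_quarter=15):
    """Return the highest quarter-hour average load, which can set the
    power-related part of a commercial energy bill.

    A minimal sketch assuming one load sample per minute; real tariffs and
    metering intervals vary by utility.
    """
    peaks = []
    for start in range(0, len(load_kw) - samples_per_quarter + 1, samples_per_quarter):
        window = load_kw[start:start + samples_per_quarter]
        peaks.append(sum(window) / len(window))
    return max(peaks)


# Example: 30 one-minute samples, i.e. two quarter-hour windows.
print(peak_quarter_hour_kw([40.0] * 15 + [90.0] * 15))  # -> 90.0
```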
44

Intelligent autoscaling in Kubernetes : the impact of container performance indicators in model-free DRL methods / Intelligent autoscaling in Kubernetes : påverkan av containerprestanda-indikatorer i modellfria DRL-metoder

Praturlon, Tommaso January 2023 (has links)
A key challenge in the field of cloud computing is to automatically scale software containers in a way that accurately matches the demand for the services they run. To manage such components, container orchestrator tools such as Kubernetes are employed, and in the past few years, researchers have attempted to optimise its autoscaling mechanism with different approaches. Recent studies have showcased the potential of Actor-Critic Deep Reinforcement Learning (DRL) methods in container orchestration, demonstrating their effectiveness in various use cases. However, despite the availability of solutions that integrate multiple container performance metrics to evaluate autoscaling decisions, a critical gap exists in understanding how model-free DRL algorithms interact with a state space based on those metrics. Thus, the primary objective of this thesis is to investigate the impact of the state space definition on the performance of model-free DRL methods in the context of horizontal autoscaling within Kubernetes clusters. In particular, our findings reveal distinct behaviours associated with various sets of metrics. Notably, those sets that exclusively incorporate parameters present in the reward function demonstrate superior effectiveness. Furthermore, our results provide valuable insights when compared to related works, as our experiments demonstrate that a careful metric selection can lead to remarkable Service Level Agreement (SLA) compliance, with as low as 0.55% violations and even surpassing baseline performance in certain scenarios. / En viktig utmaning inom området molnberäkning är att automatiskt skala programvarubehållare på ett sätt som exakt matchar efterfrågan för de tjänster de driver. För att hantera sådana komponenter, container orkestratorverktyg som Kubernetes används, och i det förflutna några år har forskare försökt optimera dess autoskalning mekanism med olika tillvägagångssätt. Nyligen genomförda studier har visat potentialen hos Actor-Critic Deep Reinforcement Learning (DRL) metoder i containerorkestrering, som visar deras effektivitet i olika användningsfall. Men trots tillgången på lösningar som integrerar flera behållarprestandamått att utvärdera autoskalningsbeslut finns det ett kritiskt gap när det gäller att förstå hur modellfria DRLalgoritmer interagerar med ett tillståndsutrymme baserat på dessa mätvärden. Det primära syftet med denna avhandling är alltså att undersöka vilken inverkan statens rymddefinition har på prestandan av modellfria DRL-metoder i samband med horisontell autoskalning inom Kubernetes-kluster. I synnerhet visar våra resultat distinkta beteenden associerade med olika uppsättningar mätvärden. Särskilt de set som uteslutande innehåller parametrar som finns i belöningen funktion visar överlägsen effektivitet. Dessutom våra resultat ge värdefulla insikter jämfört med relaterade verk, som vår experiment visar att ett noggrant urval av mätvärden kan leda till anmärkningsvärt Service Level Agreement (SLA) efterlevnad, med så låg som 0, 55% överträdelser och till och med överträffande baslinjeprestanda i vissa scenarier.
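As a hypothetical illustration of the state-space comparison described above (the metric names and their grouping are assumptions, not the thesis's exact feature sets), a reward-aligned observation versus an extended one could be assembled as follows:

```python
def build_state(metrics, reward_terms_only=True):
    """Assemble the DRL agent's observation from container performance metrics.

    A hypothetical sketch of the comparison in the thesis: one state definition
    keeps only quantities that also appear in the reward (here, response latency
    and replica count), while the other mixes in additional indicators such as
    CPU and memory utilisation.
    """
    core = [metrics["latency_ms"], metrics["replicas"]]
    if reward_terms_only:
        return core
    return core + [metrics["cpu_util"], metrics["mem_util"]]


sample = {"latency_ms": 120.0, "replicas": 3, "cpu_util": 0.7, "mem_util": 0.5}
print(build_state(sample))                      # reward-aligned state
print(build_state(sample, reward_terms_only=False))  # extended state
```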
45

Robust Deep Reinforcement Learning for Portfolio Management

Masoudi, Mohammad Amin 27 September 2021 (has links)
In finance, the use of Automated Trading Systems (ATS) on markets is growing every year, and the trades generated by algorithms now account for most of the orders that arrive at stock exchanges (Kissell, 2020). Historically, these systems were based on advanced statistical methods and signal processing designed to extract trading signals from financial data. The recent success of machine learning has attracted the interest of the financial community. Reinforcement Learning is a subcategory of machine learning and has been broadly applied by investors and researchers in building trading systems (Kissell, 2020). In this thesis, we address the issue that deep reinforcement learning may be susceptible to sampling errors and over-fitting, and propose a robust deep reinforcement learning method that integrates techniques from reinforcement learning and robust optimization. We back-test and compare the performance of the developed algorithm, Robust DDPG, with the UBAH (Uniform Buy and Hold) benchmark and other RL algorithms, and show that the robust algorithm of this research can reduce the downside risk of an investment strategy significantly and can ensure a safer path for the investor’s portfolio value.
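For reference, the UBAH benchmark mentioned above can be back-tested in a few lines; the sketch below assumes a simple price-matrix input format and is not the thesis's implementation:

```python
import numpy as np

def ubah_value(price_paths, initial_capital=1.0):
    """Uniform Buy and Hold (UBAH) benchmark: split capital equally across all
    assets at the start and never rebalance.

    A minimal sketch for back-test comparison; `price_paths` is assumed to be a
    (T, N) array of asset prices over T periods for N assets.
    """
    prices = np.asarray(price_paths, dtype=float)
    shares = (initial_capital / prices.shape[1]) / prices[0]  # equal cash per asset
    return prices @ shares  # portfolio value at each period


# Example: two assets over three periods, starting from unit capital.
print(ubah_value([[10.0, 20.0], [11.0, 18.0], [12.0, 22.0]]))  # -> [1.0, 1.0, 1.15]
```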
46

Nuclear Renewable Integrated Energy System Power Dispatch Optimization for Tightly Coupled Co-Simulation Environment using Deep Reinforcement Learning

Sah, Suba January 2021 (has links)
No description available.
47

Slice-Aware Radio Resource Management for Future Mobile Networks

Khodapanah, Behnam 05 June 2023 (has links)
The concept of network slicing has been introduced to enable mobile networks to accommodate the multiple heterogeneous use cases that are anticipated to be served within a single physical infrastructure. The slices are end-to-end virtual networks that share the resources of a physical network, spanning the core network (CN) and the radio access network (RAN). RAN slicing can be more challenging than CN slicing, as the former deals with the distribution of radio resources, where the capacity is not constant over time and is hard to extend. The main challenge in RAN slicing is to improve multiplexing gains while assuring enough isolation between slices, meaning that no slice can negatively influence the others. In this work, a flexible and configurable framework for RAN slicing is provided, where diverse requirements of slices are taken into account, and slice management algorithms adjust the control parameters of different radio resource management (RRM) mechanisms to satisfy the slices' service level agreements (SLAs). A new entity that translates the key performance indicator (KPI) targets of the SLAs into control parameters, called the RAN slice orchestrator, is introduced. Diverse algorithms governing this entity are introduced, ranging from heuristics-based to model-free methods. In addition, a protection mechanism is constructed to prevent slices from negatively influencing each other's performance. The simulation-based analysis demonstrates the feasibility of slicing the RAN with multiplexing gains and slice isolation.
48

Adversarial Reinforcement Learning for Control System Design: A Deep Reinforcement Learning Approach

Yang, Zhaoyuan 15 August 2018 (has links)
No description available.
49

Deep Reinforcement Learning for Open Multiagent System

Zhu, Tianxing 20 September 2022 (has links)
No description available.
50

Towards provably safe and robust learning-enabled systems

Fan, Jiameng 26 August 2022 (has links)
Machine learning (ML) has demonstrated great success in numerous complicated tasks. Fueled by these advances, many real-world systems like autonomous vehicles and aircraft are adopting ML techniques by adding learning-enabled components. Unfortunately, ML models widely used today, like neural networks, lack the necessary mathematical framework to provide formal guarantees on safety, causing growing concerns over these learning-enabled systems in safety-critical settings. In this dissertation, we tackle this problem by combining formal methods and machine learning to bring provable safety and robustness to learning-enabled systems. We first study the robustness verification problem of neural networks on classification tasks. We focus on providing provable safety guarantees on the absence of failures under arbitrarily strong adversaries. We propose an efficient neural network verifier LayR to compute a guaranteed and overapproximated range for the output of a neural network given an input set which contains all possible adversarially perturbed inputs. LayR relaxes nonlinear units in neural networks using linear bounds and refines such relaxations with mixed integer linear programming (MILP) to iteratively improve the approximation precision, which achieves tighter output range estimations compared to prior neural network verifiers. However, the neural network verifier focuses more on analyzing a trained neural network but less on learning provably safe neural networks. To tackle this problem, we study verifiable training that incorporates verification into training procedures to train provably safe neural networks and scale to larger models and datasets. We propose a novel framework, AdvIBP, for combining adversarial training and provable robustness verification. We show that the proposed framework can learn provable robust neural networks at a sublinear convergence rate. In the second part of the dissertation, we study the verification of system-level properties in neural-network controlled systems (NNCS). We focus on proving bounded-time safety properties by computing reachable sets. We first introduce two efficient NNCS verifiers ReachNN* and POLAR that construct polynomial-based overapproximations of neural-network controllers. We transfer NNCSs to tractable closed-loop systems with approximated polynomial controllers for computing reachable sets using existing reachability analysis tools of dynamical systems. The combination of polynomial overapproximations and reachability analysis tools opens promising directions for NNCS verification. We also include a survey and experimental study of existing NNCS verification methods, including combining state-of-the-art neural network verifiers with reachability analysis tools, to discuss what overapproximation is suitable for NNCS reachability analysis. While these verifiers enable proving safety properties of NNCS, the nonlinearity of neural-network controllers is the main bottleneck that limits their efficiency and scalability. We propose a novel framework of knowledge distillation to control “the degree of nonlinearity” of NN controllers to ease NNCS verification which improves provable safety of NNCSs especially when they are safe but cannot be verified due to their complexity. For the verification community, this opens up the possibility of reducing verification complexity by influencing how a system is trained. 
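As a minimal illustration of the bound-propagation idea behind such verifiers (a generic interval sketch, not the LayR or AdvIBP code), one affine layer followed by a ReLU can be overapproximated as follows:

```python
import numpy as np

def interval_bounds_affine_relu(W, b, x_low, x_high):
    """Propagate an input interval through one affine layer followed by ReLU.

    A minimal interval-bound sketch in the spirit of the verifiers discussed
    above: the returned box is a guaranteed overapproximation of the layer's
    outputs for any input in [x_low, x_high], e.g. all adversarially
    perturbed inputs within a given perturbation budget.
    """
    W = np.asarray(W, dtype=float)
    b = np.asarray(b, dtype=float)
    center = (np.asarray(x_high, dtype=float) + np.asarray(x_low, dtype=float)) / 2.0
    radius = (np.asarray(x_high, dtype=float) - np.asarray(x_low, dtype=float)) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius            # worst-case spread of the affine map
    low, high = out_center - out_radius, out_center + out_radius
    return np.maximum(low, 0.0), np.maximum(high, 0.0)  # ReLU is monotone


# Example: a 2x2 layer with inputs perturbed by +/- 0.1 around (1, 1).
print(interval_bounds_affine_relu([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.1],
                                  x_low=[0.9, 0.9], x_high=[1.1, 1.1]))
```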
Though NNCS verification can prove safety when system models are known, modern deep learning, e.g., deep reinforcement learning (DRL), often targets tasks with unknown system models, also known as the model-free setting. To tackle this issue, we first focus on safe exploration of DRL and propose a novel Lyapunov-inspired method. Our method uses Gaussian Process models to provide probabilistic guarantees on the policies, and guide the exploration of the unknown environment in a safe fashion. Then, we study learning robust visual control policies in DRL to enhance the robustness against visual changes that were not seen during training. We propose a method DRIBO, which can learn robust state representations for RL via a novel contrastive version of the Multi-View Information Bottleneck (MIB). This approach enables us to train high-performance visual policies that are robust to visual distractions, and can generalize well to unseen environments.
