31

Towards Anatomically Plausible Streamline Tractography with Deep Reinforcement Learning / Mot anatomiskt plausibel strömlinje-traktografi med djup förstärkningsinlärning

Bengtsdotter, Erika January 2022 (has links)
Tractography is a tool that is often used to study structural brain connectivity from diffusion magnetic resonance imaging data. Despite its ability to visualize fibers in the brain's white matter, it produces a high number of invalid streamlines. For both research and clinical applications, it is of great interest to reduce this number and thereby improve the quality of tractography. Over the years, many solutions have been proposed, often requiring ground truth data. Because such data is very difficult to generate for tractography, even with expert input, it is worthwhile to instead use methods such as reinforcement learning, which do not have this requirement. In 2021 a deep reinforcement learning tractography framework, Track-To-Learn, was published. There is, however, still room for improvement in the framework's reward function, and that is the focus of this thesis. First, we successfully reproduced some of the published Track-To-Learn results and observed that almost 20% of the streamlines were anatomically plausible. We then modified the reward function by giving a reward boost to streamlines that started or terminated within a specified mask; this addition resulted in a small increase in plausible streamlines on a more realistic dataset. Finally, we attempted to include anatomical filtering in the reward function, but the results were not sufficient to draw valid conclusions about the influence of this modification. Nonetheless, the work in this thesis showed that including further fiber-specific anatomical constraints in the Track-To-Learn reward function could improve the quality of the generated tractograms, which would be of interest in both research and clinical settings.
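The reward modification described above adds a bonus when a streamline starts or terminates inside a specified mask. The sketch below illustrates that idea under stated assumptions (NumPy arrays, a world-to-voxel affine, a simple additive boost); it is not the actual Track-To-Learn code, and all names are hypothetical.

```python
# Hypothetical sketch of a terminal reward boost for streamlines whose
# endpoints fall inside a binary anatomical mask. Illustration only,
# not the Track-To-Learn implementation.
import numpy as np

def boosted_terminal_reward(base_reward, streamline, mask, affine_inv, boost=1.0):
    """Add `boost` to the base reward if either endpoint lies inside `mask`.

    base_reward : float, reward accumulated along the streamline
    streamline  : (N, 3) array of points in world (mm) coordinates
    mask        : 3-D binary array in voxel space
    affine_inv  : (4, 4) inverse image affine, world -> voxel
    """
    endpoints = np.vstack([streamline[0], streamline[-1]])          # (2, 3)
    homog = np.hstack([endpoints, np.ones((2, 1))])                 # (2, 4)
    voxels = (homog @ affine_inv.T)[:, :3]
    idx = np.clip(np.round(voxels).astype(int), 0, np.array(mask.shape) - 1)
    in_mask = mask[idx[:, 0], idx[:, 1], idx[:, 2]].astype(bool)
    return base_reward + boost * in_mask.any()
```

In a Track-To-Learn-style setup, such a bonus would be applied on top of the per-step rewards accumulated while tracking each streamline.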
32

Training Multi-Agent Collaboration using Deep Reinforcement Learning in Game Environment / Träning av sambarbete mellan flera agenter i spelmiljö med hjälp av djup förstärkningsinlärning

Deng, Jie January 2018 (has links)
Deep Reinforcement Learning (DRL) is a new research area that integrates deep neural networks into reinforcement learning algorithms. It is revolutionizing the field of AI, achieving high performance on traditional challenges such as natural language processing and computer vision. Current deep reinforcement learning algorithms enable end-to-end learning that uses deep neural networks to produce effective actions in complex environments from high-dimensional sensory observations, such as raw images. The applications of deep reinforcement learning algorithms are remarkable; for example, the performance of a trained agent playing Atari video games is comparable, or even superior, to that of a human player. Current studies mostly focus on training a single agent and its interaction with dynamic environments. However, in order to cope with complex real-world scenarios, it is necessary to look into multiple interacting agents and their collaboration on certain tasks. This thesis studies state-of-the-art deep reinforcement learning algorithms and techniques. Through experiments conducted in several 2D and 3D game scenarios, we investigate how DRL models can be adapted to train multiple agents that cooperate with one another, through communication and physical navigation, and achieve their individual goals on complex tasks. / Deep reinforcement learning (DRL) is a new research domain that integrates deep neural networks into learning algorithms. It has revolutionized the AI field and raised high expectations for solving the traditional problems of AI research. In this thesis, a thorough study of the state of the art in DRL algorithms and techniques is carried out. Through experiments in several 2D and 3D game scenarios, we investigate how agents can cooperate with one another and reach their goals through communication and physical navigation.
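To make the cooperative-training idea concrete, the toy sketch below trains two independent tabular Q-learners that share a team reward for meeting on a small 1-D grid. It is a deliberately simplified stand-in for the deep multi-agent setups studied in the thesis (which use neural networks, communication, and richer 2D/3D game environments); all parameters are illustrative.

```python
# Toy stand-in for cooperative multi-agent training: two tabular Q-learners
# on a 1-D line receive a shared reward when they meet. Simplified
# illustration of shared-reward cooperation, not the thesis's DRL setup.
import numpy as np

SIZE, ACTIONS = 7, (-1, 0, +1)          # positions 0..6; move left/stay/right
rng = np.random.default_rng(0)
Q = [np.zeros((SIZE, SIZE, len(ACTIONS))) for _ in range(2)]  # per-agent tables

def step(pos, acts):
    new = [int(np.clip(p + ACTIONS[a], 0, SIZE - 1)) for p, a in zip(pos, acts)]
    reward = 1.0 if new[0] == new[1] else -0.01     # shared (team) reward
    return new, reward, new[0] == new[1]

for episode in range(2000):
    pos, eps = [0, SIZE - 1], max(0.05, 1.0 - episode / 1500)
    for t in range(30):
        acts = [rng.integers(len(ACTIONS)) if rng.random() < eps
                else int(np.argmax(Q[i][pos[0], pos[1]])) for i in range(2)]
        new, r, done = step(pos, acts)
        for i in range(2):                           # independent Q-updates
            target = r + 0.95 * np.max(Q[i][new[0], new[1]]) * (not done)
            Q[i][pos[0], pos[1], acts[i]] += 0.1 * (target - Q[i][pos[0], pos[1], acts[i]])
        pos = new
        if done:
            break
```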
33

Power Dispatch and Storage Configuration Optimization of an Integrated Energy System using Deep Reinforcement Learning and Hyperparameter Tuning

Katikaneni, Sravya January 2022 (has links)
No description available.
34

The Development of Real-Time Fouling Monitoring and Control Systems for Reverse Osmosis Membrane Cleaning using Deep Reinforcement Learning

Titus Glover, Kyle Ian Kwartei 11 July 2023 (has links)
This dissertation investigates potential applications for Machine Learning (ML) and real-time fouling monitors in Reverse Osmosis (RO) desalination. The main objective was to develop a framework that minimizes the cost of membrane fouling by deploying AI-generated cleaning patterns and real-time fouling monitoring. Membrane manufacturers and researchers typically recommend cleaning (standard operating procedure – SOP) when normalized permeate flow, a performance metric tracking the decline of permeate flow/output from its initial baseline with respect to operating pressure, reaches 0.85-0.90 of baseline values. This study used estimates of production cost, internal profitability metrics, and permeate volume output to evaluate and compare the impact of the time selected for cleaning intervention. Cleanings initiated when the normalized permeate flow reached 0.85 served as the control for cleaning intervention times. To decide optimal times for cleaning intervention, a Deep Reinforcement Learning (RL) agent was trained, with a cost-based reward system, to signal cleaning when the normalized permeate flow was between 0.85 and 0.90. A laboratory-scale RO flat-membrane desalination system platform was developed as a model plant, and data from the platform was used to train the model and to examine both simulated and actual control of when to trigger membrane cleaning, replacing the control operator's 0.85 cleaning threshold. Compared to SOP, the intelligent operator showed consistent savings in production costs at the expense of total permeate volume output. The simulated operation using RL-initiated cleaning yielded 9% less permeate water but reduced the cost per unit volume ($/m³) by 12.3%. When the RL agent was used to initiate cleaning on the laboratory-scale RO desalination system platform, the system produced 21% less permeate water but reduced production cost ($/m³) by 16.0%. These results are consistent with an RL agent that prioritizes production cost savings over product volume output. / Doctor of Philosophy / The decreasing supply of freshwater sources has made desalination technology an attractive solution. Desalination, the removal of salt from water, provides an opportunity to produce more freshwater by treating saline sources and recycled water. One prominent form of desalination is Reverse Osmosis (RO), an energy-intensive process in which freshwater is forced from a pressurized feed through a semipermeable membrane. A significant limiting cost factor for RO desalination is the maintenance and replacement of semipermeable RO membranes. Over time, unwanted particles accumulate on the membrane surface in a process known as membrane fouling. Significant levels of fouling can drive up costs, negatively affect product quality (permeate water), and decrease the useful lifetime of the membrane. As a result, operators employ various fouling control techniques, such as membrane cleaning, to mitigate its effects on production and minimize damage to the membrane. This dissertation investigates potential applications for ML and real-time fouling monitors in RO desalination, with the main objective of developing a framework that minimizes the cost of membrane fouling by deploying AI-generated cleaning patterns and real-time fouling monitoring.
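A rough sketch of the decision setting described above: the RL agent may only trigger cleaning while normalized permeate flow sits in the 0.85-0.90 window, and its reward trades permeate value against operating and cleaning costs. The cost figures and the linear reward form below are assumptions for illustration, not the dissertation's actual model.

```python
# Illustrative cost-based reward and SOP-window constraint for RL-triggered
# membrane cleaning. All prices and penalties are made-up placeholders.
def cleaning_reward(norm_flow, clean, permeate_m3, price_per_m3=0.5,
                    energy_cost=0.2, cleaning_cost=5.0):
    """Reward = value of permeate produced - operating cost - cleaning cost."""
    reward = permeate_m3 * price_per_m3 - energy_cost
    if clean:
        reward -= cleaning_cost          # cleaning downtime and chemicals
    if norm_flow < 0.85:                 # dropped below the SOP threshold
        reward -= 2.0
    return reward

def allowed_to_clean(norm_flow):
    """The agent only chooses the cleaning time inside the 0.85-0.90 window."""
    return 0.85 <= norm_flow <= 0.90
```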
35

Deep Reinforcement Learning of IoT System Dynamics  for Optimal Orchestration and Boosted Efficiency

Haowei Shi (16636062) 30 August 2023 (has links)
This thesis targets the orchestration challenge of wearable Internet of Things (IoT) systems, seeking optimal configurations of the system in terms of energy efficiency, computing, and data transmission activities. We first investigated reinforcement learning in simulated IoT environments to demonstrate its effectiveness, and afterwards studied the algorithm on real-world wearable motion data to show its practical promise. More specifically, the first challenge arises in complex massive-device orchestration: it is essential to configure and manage both the massive devices and the gateway/server. On the side of the massive wearable IoT devices, the complexity lies in their diverse energy budgets, computing efficiency, and so on; on the phone or server side, it lies in how the global diversity can be analyzed and how the system configuration can be optimized. We therefore propose a new reinforcement learning architecture, called boosted deep deterministic policy gradient, with enhanced actor-critic co-learning and multi-view state transformation. The proposed actor-critic co-learning allows for enhanced dynamics abstraction through a shared neural network component. Evaluated on a simulated massive-device task, the proposed deep reinforcement learning framework achieved much more efficient system configurations, with enhanced computing capabilities and improved energy efficiency. Secondly, we leveraged real-world motion data to demonstrate the potential of using reinforcement learning to optimally configure the motion sensors. We used sequential data estimation paradigms to obtain estimated data for some sensors, allowing energy savings since these sensors no longer need to be activated to collect data during estimation intervals. We then introduced the Deep Deterministic Policy Gradient algorithm to learn to control the estimation timing. This study provides a real-world demonstration of maximizing energy efficiency in wearable IoT applications while maintaining data accuracy. Overall, this thesis advances wearable IoT system orchestration towards optimal system configurations.
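The "actor-critic co-learning through a shared neural network component" can be pictured as an actor head and a critic head that reuse one state-encoding trunk. The PyTorch sketch below shows that structure only; the layer sizes, the deterministic Tanh actor, and the overall wiring are assumptions, not the boosted DDPG architecture from the thesis.

```python
# Minimal PyTorch sketch of a shared-trunk actor-critic, illustrating the
# co-learning idea. Architecture details are assumptions for illustration.
import torch
import torch.nn as nn

class SharedTrunkActorCritic(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.actor = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                   nn.Linear(hidden, action_dim), nn.Tanh())
        self.critic = nn.Sequential(nn.Linear(hidden + action_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 1))

    def act(self, state):
        return self.actor(self.trunk(state))           # deterministic action

    def q_value(self, state, action):
        z = self.trunk(state)                           # shared state features
        return self.critic(torch.cat([z, action], dim=-1))

net = SharedTrunkActorCritic(state_dim=10, action_dim=4)
s = torch.randn(32, 10)
a = net.act(s)
q = net.q_value(s, a)                                   # (32, 1) critic estimates
```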
36

MMF-DRL: Multimodal Fusion-Deep Reinforcement Learning Approach with Domain-Specific Features for Classifying Time Series Data

Sharma, Asmita 01 June 2023 (has links) (PDF)
This research addresses two pertinent problems in machine learning (ML): (a) the supervised classification of time series and (b) the need for large amounts of labeled images for training supervised classifiers. The novel contributions are two-fold. The first problem, time series classification, is addressed by transforming time series into domain-specific 2D features such as scalograms and recurrence plot (RP) images. The second problem, the need for large amounts of labeled image data, is tackled by proposing a new way of using a reinforcement learning (RL) technique as a supervised classifier operating on multimodal (joint-representation) scalogram and RP images. The motivation for such domain-specific features is that they provide additional information to the ML models by capturing domain-specific patterns, and they make it possible to leverage state-of-the-art image classifiers for learning the patterns in these textured images. Thus, this research proposes a multimodal fusion-deep reinforcement learning (MMF-DRL) approach as an alternative to traditional supervised image classifiers for time series classification. The proposed MMF-DRL approach improves accuracy over state-of-the-art supervised learning models while needing less training data. Results show the merit of using multiple modalities and RL, achieving better performance than training on a single modality. Moreover, the proposed approach yields accuracies of 90.20% and 89.63% on two physiological time series datasets with less training data, compared with the state-of-the-art supervised learning model ChronoNet, which achieved 87.62% and 88.02% on the same datasets with more training data.
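One of the domain-specific 2D features mentioned above, the recurrence plot, can be computed directly from a time series. The NumPy sketch below uses a standard time-delay embedding with an illustrative threshold; the embedding dimension, delay, and threshold used in the thesis are not specified here, and these defaults are assumptions.

```python
# Sketch of turning a 1-D time series into a recurrence plot (RP) image.
# Embedding and threshold choices are illustrative defaults only.
import numpy as np

def recurrence_plot(series, dim=3, delay=1, eps=None):
    """Binary RP: R[i, j] = 1 if embedded states i and j are within eps."""
    n = len(series) - (dim - 1) * delay
    states = np.stack([series[i * delay : i * delay + n] for i in range(dim)], axis=1)
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    if eps is None:
        eps = 0.1 * dists.max()          # simple data-dependent threshold
    return (dists <= eps).astype(np.uint8)

t = np.linspace(0, 8 * np.pi, 400)
rp = recurrence_plot(np.sin(t))          # (398, 398) image for a sine wave
```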
37

Autonomous Navigation with Deep Reinforcement Learning in Carla Simulator

Wang, Peilin 08 December 2023 (has links)
With the rapid development of autonomous driving and artificial intelligence technology, end-to-end autonomous driving has become a research hotspot. This thesis explores the application of deep reinforcement learning to realizing end-to-end autonomous driving. We built a deep reinforcement learning virtual environment in the Carla simulator and, based on it, trained a policy model to control a vehicle along a preplanned route. For the deep reinforcement learning algorithm, we used Proximal Policy Optimization due to its stable performance. Considering the complexity of end-to-end autonomous driving, we also carefully designed a comprehensive reward function to train the policy model more efficiently. The model inputs in this study are of two types: first, real-time road information and vehicle state data obtained from the Carla simulator, and second, real-time images captured by the vehicle's front camera. To understand the influence of different input information on the training effect and model performance, we conducted a detailed comparative analysis. The test results showed that the accuracy and relevance of the input information have a significant impact on the agent's learning, which in turn directly affects the performance of the model. Through this study, we have not only confirmed the potential of deep reinforcement learning in the field of end-to-end autonomous driving, but also provided an important reference for future research and development of related technologies.
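The "comprehensive reward function" is not specified in the abstract; the sketch below only illustrates the kind of composite reward commonly used for route following (lane keeping, speed tracking, and collision penalties). All terms, weights, and signal names are assumptions, not the thesis's actual reward.

```python
# Hypothetical composite driving reward for route following in a simulator.
# Weights and signal names are illustrative placeholders.
def driving_reward(lateral_dev_m, heading_err_rad, speed_mps,
                   target_speed_mps=8.0, collided=False, off_route=False):
    if collided or off_route:
        return -100.0                                   # terminal penalty
    r_track = -1.0 * abs(lateral_dev_m) - 0.5 * abs(heading_err_rad)
    r_speed = -0.2 * abs(speed_mps - target_speed_mps)  # keep near target speed
    return 1.0 + r_track + r_speed                      # small living bonus
```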
38

Cost-Effective Large-Scale Digital Twins Notification System with Prioritization Consideration

Vrbaski, Mira 19 December 2023 (has links)
The Large-Scale Digital Twins Notification System (LSDTNS) monitors a Digital Twin (DT) cluster for a predefined critical state and, once it detects such a state, sends a Notification Event (NE) to a predefined recipient. Additionally, the time from producing the DT's Complex Event (CE) to sending an alarm has to be less than a predefined deadline. Addressing scalability and multiple objectives, such as deployment cost, resource utilization, and meeting the deadline, on top of process scheduling, presents a complex challenge. This thesis therefore presents a methodology consisting of three contributions that address system scalability, multi-objectivity, and the scheduling of CE processes using Reinforcement Learning (RL). The first contribution proposes an IoT Notification System Architecture based on a micro-service notification methodology that allows running, and seamlessly switching between, various CE reasoning algorithms. The proposed IoT Notification System architecture addresses the scalability issue in state-of-the-art CE recognition systems. The second contribution proposes a novel methodology for multi-objective optimization for cloud provisioning (MOOP). MOOP is the first work dealing with multi-objective optimization for microservice notification applications, where the notification load is variable and depends on the results of previous microservice subtasks. MOOP provides a multi-objective mathematical cloud resource deployment model and demonstrates its effectiveness through a case study. Finally, the thesis presents SCN-DRL, a deep reinforcement learning scheduling approach for large-scale critical notification applications in LSDTNS. SCN-DRL is the first work dealing with multi-objective optimization for critical microservice notification applications using RL. In the performance evaluation, SCN-DRL demonstrates better performance than state-of-the-art heuristics and shows steady performance when the notification workload increases from 10% to 90%. In addition, SCN-DRL, tested with three neural networks, is resilient to a sudden 10% drop in container resources. Such resilience to resource container failures is an important attribute of a distributed system.
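The multiple objectives named above (deployment cost, resource utilization, deadline satisfaction) have to be folded into a single scalar reward for RL training. The sketch below shows one simple weighted scalarization as an illustration; the weights, the linear form, and the hard deadline term are assumptions, not the SCN-DRL reward design.

```python
# Illustrative scalarization of multi-objective scheduling criteria into one
# RL reward. Weights and the linear form are assumptions, not SCN-DRL's.
def scheduler_reward(deploy_cost, utilization, latency_s, deadline_s,
                     w_cost=0.4, w_util=0.3, w_deadline=0.3):
    cost_term = -w_cost * deploy_cost                 # cheaper deployments preferred
    util_term = w_util * utilization                  # utilization in [0, 1]
    met = 1.0 if latency_s <= deadline_s else -1.0    # notification deadline check
    return cost_term + util_term + w_deadline * met
```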
39

End-to-End Autonomous Driving with Deep Reinforcement Learning in Simulation Environments

Wang, Bingyu 10 April 2024 (has links)
In the rapidly evolving field of autonomous driving, the integration of Deep Reinforcement Learning (DRL) promises significant advancements towards achieving reliable and efficient vehicular systems. This study presents a comprehensive examination of DRL's application within a simulated autonomous driving context, with a focus on the nuanced impact of representation learning parameters on the performance of end-to-end models. An overview of the theoretical underpinnings of machine learning, deep learning, and reinforcement learning is provided, laying the groundwork for their application in autonomous driving scenarios. The methodology outlines a detailed framework for training autonomous vehicles in the Duckietown simulation environment, employing both non-end-to-end and end-to-end models to investigate the effectiveness of various reinforcement learning algorithms and representation learning techniques. At the heart of this research are extensive simulation experiments designed to evaluate the Proximal Policy Optimization (PPO) algorithm's effectiveness within the established framework. The study delves into reward structures and the impact of representation learning parameters on the performance of end-to-end models. A critical comparison of the models in the validation chapter highlights the significant role of representation learning parameters in the outcomes of DRL-based autonomous driving systems. The findings reveal that meticulous adjustment of representation learning parameters markedly influences the end-to-end training process. Notably, image segmentation techniques significantly enhance feature recognizability and model performance.
Contents (chapters): 1 Introduction; 2 Research Background; 3 Methodology (simulation platform, observation and action spaces, reward shaping with speed penalty and position reward, variational autoencoder structure, PPO-based reinforcement learning framework); 4 Simulation Experiments; 5 Result; 6 Validation and Evaluation; 7 Conclusion and Future Work
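The representation-learning stage outlined above compresses camera frames into a compact latent vector that the PPO policy consumes. The PyTorch sketch below shows a minimal convolutional VAE encoder with reparameterization to make that pipeline concrete; the layer sizes, latent dimension, and input resolution are assumptions, not the thesis configuration.

```python
# Minimal convolutional VAE encoder: camera frame -> latent observation.
# Hyperparameters are illustrative assumptions, not the thesis settings.
import torch
import torch.nn as nn

class ConvVAEEncoder(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2), nn.ReLU(),
            nn.Flatten())
        self.mu = nn.LazyLinear(latent_dim)
        self.logvar = nn.LazyLinear(latent_dim)

    def forward(self, x):                       # x: (B, 3, H, W) in [0, 1]
        h = self.conv(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return z, mu, logvar

enc = ConvVAEEncoder()
z, mu, logvar = enc(torch.rand(4, 3, 64, 64))   # latent observation for the policy
```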
40

Real-Time Resource Optimization for Wireless Networks

Huang, Yan 11 January 2021 (has links)
Resource allocation in modern wireless networks is constrained by increasingly stringent real-time requirements. Such real-time requirements typically come from, among others, the short coherence time on a wireless channel, the small time resolution for resource allocation in OFDM-based radio frame structure, or the low-latency requirements from delay-sensitive applications. An optimal resource allocation solution is useful only if it can be determined and applied to the network entities within its expected time. For today's wireless networks such as 5G NR, such expected time (or real-time requirement) can be as low as 1 ms or even 100 μs. Most of the existing resource optimization solutions to wireless networks do not explicitly take real-time requirement as a constraint when developing solutions. In fact, the mainstream of research works relies on the asymptotic complexity analysis for designing solution algorithms. Asymptotic complexity analysis is only concerned with the growth of its computational complexity as the input size increases (as in the big-O notation). It cannot capture the real-time requirement that is measured in wall-clock time. As a result, existing approaches such as exact or approximate optimization techniques from operations research are usually not useful in wireless networks in the field. Similarly, many problem-specific heuristic solutions with polynomial-time asymptotic complexities may suffer from a similar fate, if their complexities are not tested in actual wall-clock time. To address the limitations of existing approaches, this dissertation presents novel real-time solution designs to two types of optimization problems in wireless networks: i) problems that have closed-form mathematical models, and ii) problems that cannot be modeled in closed-form. For the first type of problems, we propose a novel approach that consists of (i) problem decomposition, which breaks an original optimization problem into a large number of small and independent sub-problems, (ii) search intensification, which identifies the most promising problem sub-space and selects a small set of sub-problems to match the available GPU processing cores, and (iii) GPU-based large-scale parallel processing, which solves the selected sub-problems in parallel and finds a near-optimal solution to the original problem. The efficacy of this approach has been illustrated by our solutions to the following two problems.
• Real-Time Scheduling to Achieve Fair LTE/Wi-Fi Coexistence: We investigate a resource optimization problem for the fair coexistence between LTE and Wi-Fi in the unlicensed spectrum. The real-time requirement for finding the optimal channel division and LTE resource allocation solution is on 1 ms time scale. This problem involves the optimal division of transmission time for LTE and Wi-Fi across multiple unlicensed bands, and the resource allocation among LTE users within the LTE's "ON" periods. We formulate this optimization problem as a mixed-integer linear program and prove its NP-hardness. Then by exploiting the unique problem structure, we propose a real-time solution design that is based on problem decomposition and GPU-based parallel processing techniques. Results from an implementation on the NVIDIA GPU/CUDA platform demonstrate that the proposed solution can achieve near-optimal objective and meet the 1 ms timing requirement in 4G LTE.
• An Ultrafast GPU-based Proportional Fair Scheduler for 5G NR: We study the popular proportional-fair (PF) scheduling problem in a 5G NR environment. The real-time requirement for determining the optimal (with respect to the PF objective) resource allocation and MCS selection solution is 125 μs (under 5G numerology 3). In this problem, we need to allocate frequency-time resource blocks on an operating channel and assign modulation and coding scheme (MCS) for each active user in the cell. We present GPF+, a GPU-based real-time PF scheduler. With GPF+, the original PF optimization problem is decomposed into a large number of small and independent sub-problems. We then employ a cross-entropy based search intensification technique to identify the most promising problem sub-space and select a small set of sub-problems to fit into a GPU. After solving the selected sub-problems in parallel using GPU cores, we find the best sub-problem solution and use it as the final scheduling solution. Evaluation results show that GPF+ is able to provide near-optimal PF performance in a 5G cell while meeting the 125 μs real-time requirement.
For the second type of problems, where there is no closed-form mathematical formulation, we propose to employ model-free deep learning (DL) or deep reinforcement learning (DRL) techniques along with judicious consideration of timing requirement throughout the design. Under DL/DRL, we employ deep function approximators (neural networks) to learn the unknown objective function of an optimization problem, approximate an optimal algorithm to find resource allocation solutions, or discover important mapping functions related to the resource optimization. To meet the real-time requirement, we propose to augment DL or DRL methods with optimization techniques at the input or output of the deep function approximators to reduce their complexities and computational time. Under this approach, we study the following two problems:
• A DRL-based Approach to Dynamic eMBB/URLLC Multiplexing in 5G NR: We study the problem of dynamic multiplexing of eMBB and URLLC on the same channel through preemptive resource puncturing. The real-time requirement for determining the optimal URLLC puncturing solution is 1 ms (under 5G numerology 0). A major challenge in solving this problem is that it cannot be modeled using closed-form mathematical expressions. To address this issue, we develop a model-free DRL approach which employs a deep neural network to learn an optimal algorithm to allocate the URLLC puncturing over the operating channel, with the objective of minimizing the adverse impact from URLLC traffic on eMBB. Our contributions include a novel learning method that exploits the intrinsic properties of the URLLC puncturing optimization problem to achieve a fast and stable learning convergence, and a mechanism to ensure feasibility of the deep neural network's output puncturing solution. Experimental results demonstrate that our DRL-based solution significantly outperforms state-of-the-art algorithms proposed in the literature and meets the 1 ms real-time requirement for dynamic multiplexing.
• A DL-based Link Adaptation for eMBB/URLLC Multiplexing in 5G NR: We investigate MCS selection for eMBB traffic under the impact of URLLC preemptive puncturing. The real-time requirement for determining the optimal MCSs for all eMBB transmissions scheduled in a transmission interval is 125 μs (under 5G numerology 3).
The objective is to have eMBB meet a given block-error rate (BLER) target under the adverse impact of URLLC puncturing. Since this problem cannot be mathematically modeled in closed-form, we proposed a DL-based solution design that uses a deep neural network to learn and predict the BLERs of a transmission under each MCS level. Then based on the BLER predictions, an optimal MCS can be found for each transmission that can achieve the BLER target. To meet the 5G real-time requirement, we implement this design through a hybrid CPU and GPU architecture to minimize the execution time. Extensive experimental results show that our design can select optimal MCS under the impact of preemptive puncturing and meet the 125 μs timing requirement. / Doctor of Philosophy / In modern wireless networks such as 4G LTE and 5G NR, the optimal allocation of radio resources must be performed within a real-time requirement of 1 ms or even 100 μs time scale. Such a real-time requirement comes from the physical properties of wireless channels, the short time resolution for resource allocation defined in the wireless communication standards, and the low-latency requirement from delay-sensitive applications. Real-time requirement, although necessary for wireless networks in the field, has hardly been considered as a key constraint for solution design in the research community. Existing solutions in the literature mostly consider theoretical computational complexities, rather than actual computation time as measured by wall clock. To address the limitations of existing approaches, this dissertation presents real-time solution designs to two types of optimization problems in wireless networks: i) problems that have mathematical models, and ii) problems that cannot be modeled mathematically. For the first type of problems, we propose a novel approach that consists of (i) problem decomposition, (ii) search intensification, and (iii) GPU-based large-scale parallel processing techniques. The efficacy of this approach has been illustrated by our solutions to the following two problems.
• Real-Time Scheduling to Achieve Fair LTE/Wi-Fi Coexistence: We investigate a resource optimization problem for the fair coexistence between LTE and Wi-Fi users in the same (unlicensed) spectrum. The real-time requirement for finding the optimal LTE resource allocation solution is on 1 ms time scale.
• An Ultrafast GPU-based Proportional Fair Scheduler for 5G NR: We study the popular proportional-fair (PF) scheduling problem in a 5G NR environment. The real-time requirement for determining the optimal resource allocation and modulation and coding scheme (MCS) for each user is 125 μs.
For the second type of problems, where there is no mathematical formulation, we propose to employ model-free deep learning (DL) or deep reinforcement learning (DRL) techniques along with judicious consideration of timing requirement throughout the design. Under this approach, we study the following two problems:
• A DRL-based Approach to Dynamic eMBB/URLLC Multiplexing in 5G NR: We study the problem of dynamic multiplexing of eMBB and URLLC on the same channel through preemptive resource puncturing. The real-time requirement for determining the optimal URLLC puncturing solution is 1 ms.
• A DL-based Link Adaptation for eMBB/URLLC Multiplexing in 5G NR: We investigate MCS selection for eMBB traffic under the impact of URLLC preemptive puncturing.
The real-time requirement for determining the optimal MCSs for all eMBB transmissions scheduled in a transmission interval is 125 μs.
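The decompose / intensify / solve-in-parallel pattern used for the first problem type can be illustrated generically. In the sketch below, NumPy vectorization stands in for GPU parallelism and a toy quadratic stands in for a scheduling sub-problem; the cross-entropy-style update over a continuous parameterization is an illustrative simplification of the search intensification step, not the GPF+ implementation.

```python
# Generic sketch of "decompose -> intensify search -> solve in parallel",
# with a toy objective. Illustration only, not the dissertation's solver.
import numpy as np

rng = np.random.default_rng(1)

def solve_subproblem_batch(params):
    """Pretend each row of `params` defines one small sub-problem; return its
    objective value. A toy quadratic replaces the real scheduling objective."""
    target = np.array([0.3, 0.7, 0.5])
    return -np.sum((params - target) ** 2, axis=1)

# Cross-entropy-style search intensification over the sub-problem space.
mean, std = np.full(3, 0.5), np.full(3, 0.3)
for it in range(20):
    cand = np.clip(rng.normal(mean, std, size=(256, 3)), 0.0, 1.0)  # "GPU batch"
    scores = solve_subproblem_batch(cand)                           # parallel evaluation
    elite = cand[np.argsort(scores)[-25:]]       # keep the most promising region
    mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3

best = cand[np.argmax(scores)]                   # near-optimal sub-problem choice
```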
