Global ETD Search

241	Hierarchical Bayesian Dataset Selection Zhou, Xiaona 05 1900 (has links) Despite the profound impact of deep learning across various domains, supervised model training critically depends on access to large, high-quality datasets, which are often challenging to identify. To address this, we introduce <b>H</b>ierarchical <b>B</b>ayesian <b>D</b>ataset <b>S</b>election (<b>HBDS</b>), the first dataset selection algorithm that utilizes hierarchical Bayesian modeling, designed for collaborative data-sharing ecosystems. The proposed method efficiently decomposes the contributions of dataset groups and individual datasets to local model performance using Bayesian updates with small data samples. Our experiments on two benchmark datasets demonstrate that HBDS not only offers a computationally lightweight solution but also enhances interpretability compared to existing data selection methods, by revealing deep insights into dataset interrelationships through learned posterior distributions. HBDS outperforms traditional non-hierarchical methods by correctly identifying all relevant datasets, achieving optimal accuracy with fewer computational steps, even when initial model accuracy is low. Specifically, HBDS surpasses its non-hierarchical counterpart by 1.8% on DIGIT-FIVE and 0.7% on DOMAINNET, on average. In settings with limited resources, HBDS achieves a 6.9% higher accuracy than its non-hierarchical counterpart. These results confirm HBDS's effectiveness in identifying datasets that improve the accuracy and efficiency of deep learning models when collaborative data utilization is essential. / Master of Science / Deep learning technologies have revolutionized many domains and applications, from voice recognition in smartphones to automated recommendations on streaming services. However, the success of these technologies heavily relies on having access to large and high-quality datasets. In many cases, selecting the right datasets can be a daunting challenge. To tackle this, we have developed a new method that can quickly figure out which datasets or groups of datasets contribute most to improving the performance of a model with only a small amount of data needed. Our tests prove that this method is not only effective and light on computation but also helps us understand better how different datasets relate to each other. Hierarchical Bayesian Data-Sharing Reinforcement Learning Dataset Selection
242	Use of Reinforcement Learning for Interference Avoidance or Efficient Jamming in Wireless Communications Schutz, Zachary Alexander 05 June 2024 (has links) We implement reinforcement learning in the context of wireless communications in two very different settings. In the first setting, we study the use of reinforcement learning in an underwater acoustic communications network to adapt its transmission frequencies to avoid interference and potential malicious jammers. To that effect, we implement a reinforcement learning algorithm called contextual bandits. The harsh environment of an underwater channel provides a challenging problem. The channel may induce multipath and time delays which lead to time-varying, frequency-selective attenuation. These factors are also influenced by the distance between the transmitter and receiver, the subbands the interference is located within, and the power of the transmitter. We show that the agent is effectively able to avoid frequency bands that have degraded channel quality or that contain interference, both of which are dynamic or time-varying . In the second setting, we study the use of reinforcement learning to adapt the modulation and power scheme of a jammer seeking to disrupt a wireless communications system. To achieve this, we make use of a linear contextual bandit to learn to jam the victim system. Prior work has shown that with the use of linear bandits, improved convergence is achieved to jam a single-carrier system using time-domain jamming schemes. However, communications systems today typically employ orthogonal frequency division multiplexing (OFDM) to transmit data, particularly in 4G/5G networks. This work explores the use of linear Thompson Sampling (TS) to jam OFDM-modulated signals. The jammer may select from both time-domain and frequency-domain jamming schemes. We demonstrate that the linear TS algorithm is able to perform better than a traditional reinforcement learning algorithm, upper confidence bound-1 (UCB-1), in terms of maximizing the victim's symbol error rate. We also draw novel insights by observing the action states, to which the reinforcement learning algorithm converges. We then investigate the design and modification of the context vector in the hope of in- creasing overall performance of the bandit, such as decreased learning period and increased symbol error rate caused to the victim. This includes running experiments on particular features and examining how the bandit weights the importance of the features in the context vector. Lastly, we study how to jam an OFDM-modulated signal which employs forward error correction coding. We extend this to leverage reinforcement learning to jam a 5G-based system implementing some aspects of the 5G protocol. This model is then modified to introduce unreliable reward feedback in the form of ACK/NACK observations to the jammer to understand the effect of how imperfect observations of errors can affect the jammer's ability to learn. We gain insights into the convergence time of the jammer and its ability to jam the victim, as well as improvements to the algorithm, and insights into the vulnerabilities of wireless communications for reinforcement learning based jamming. / Master of Science / In this thesis we implement a class of reinforcement learning known as contextual bandits in two different applications of communications systems and jamming. In the first setting, we study the use of reinforcement learning in an underwater acoustic communications network to adapt its transmission frequencies to avoid interference and potential malicious jammers. We show that the agent is effectively able to avoid frequency bands that have degraded channel quality or that contain interference, both of which are dynamic or time-varying. In the second setting, we study the use of reinforcement learning to adapt the jamming type, such as using additive white Gaussian noise, and power scheme of a jammer seeking to disrupt a wireless communications system. To achieve this, we make use of a linear contextual bandit which implies that the contexts that the jammer is able to observe and the sampled probability of each arm has a linear relationship with the reward function. We demonstrate that the linear algorithm is able to outperform a traditional reinforcement learning algorithm in terms of maximizing the victim's symbol error rate. We extend this work by examining the impact of the context feature vector design, LTE/5G-based protocol specifics (such as error correction coding), and imperfect reward feedback information. We gain insights into the convergence time of the jammer and its ability to jam the victim, as well as improvements to the algorithm, and insights into the vulnerabilities of wireless communications for reinforcement learning based jamming. Reinforcement Learning OFDM Interference Avoidance Jamming Underwater Channel 5G
243	Derivative-Free Meta-Blackbox Optimization on Manifold Sel, Bilgehan 06 1900 (has links) Solving a sequence of high-dimensional, nonconvex, but potentially similar optimization problems poses a significant computational challenge in various engineering applications. This thesis presents the first meta-learning framework that leverages the shared structure among sequential tasks to improve the computational efficiency and sample complexity of derivative-free optimization. Based on the observation that most practical high-dimensional functions lie on a latent low-dimensional manifold, which can be further shared among problem instances, the proposed method jointly learns the meta-initialization of a search point and a meta-manifold. This novel approach enables the efficient adaptation of the optimization process to new tasks by exploiting the learned meta-knowledge. Theoretically, the benefit of meta-learning in this challenging setting is established by proving that the proposed method achieves improved convergence rates and reduced sample complexity compared to traditional derivative-free optimization techniques. Empirically, the effectiveness of the proposed algorithm is demonstrated in two high-dimensional reinforcement learning tasks, showcasing its ability to accelerate learning and improve performance across multiple domains. Furthermore, the robustness and generalization capabilities of the meta-learning framework are explored through extensive ablation studies and sensitivity analyses. The thesis highlights the potential of meta-learning in tackling complex optimization problems and opens up new avenues for future research in this area. / Master of Science / Optimization problems are ubiquitous in various fields, from engineering to finance, where the goal is to find the best solution among a vast number of possibilities. However, solving these problems can be computationally challenging, especially when the search space is high-dimensional and the problem is nonconvex, meaning that there may be multiple locally optimal solutions. This thesis introduces a novel approach to tackle these challenges by leveraging the power of meta-learning, a technique that allows algorithms to learn from previous experiences and adapt to new tasks more efficiently. The proposed framework is based on the observation that many real-world optimization problems share similar underlying structures, even though they may appear different on the surface. By exploiting this shared structure, the meta-learning algorithm can learn a low-dimensional representation of the problem space, which serves as a guide for efficiently searching for optimal solutions in new, unseen problems. This approach is particularly useful when dealing with a sequence of related optimization tasks, as it allows the algorithm to transfer knowledge from one task to another, thereby reducing the computational burden and improving the overall performance. The effectiveness of the proposed meta-learning framework is demonstrated through rigorous theoretical analysis and empirical evaluations on challenging reinforcement learning tasks. These tasks involve high-dimensional search spaces and require the algorithm to adapt to changing environments. The results show that the meta-learning approach can significantly accelerate the learning process and improve the quality of the solutions compared to traditional optimization methods. Meta Optimization Zeroth Order Search Meta Reinforcement Learning
244	Reliable Low Latency Machine Learning for Resource Management in Wireless Networks Taleb Zadeh Kasgari, Ali 30 March 2022 (has links) Next-generation wireless networks must support a plethora of new applications ranging from the Internet of Things to virtual reality. Each one of these emerging applications have unique rate, reliability, and latency requirements that substantially differ from traditional services such as video streaming. Hence, there is a need for designing an efficient resource management framework that is taking into account different components that can affect the resource usage, including less obvious factors such as human behavior that contribute to the resource usage of the system. The use of machine learning for modeling mentioned components in a resource management system is a promising solution. This is because many hidden factors might contribute to the resource usage pattern of users or machine-type devices that can only be modeled using an end-to-end machine learning solution. Therefore, machine learning algorithms can be used either for modeling a complex factor such as the human brain's delay perception or for designing an end-to-end resource management system. The overarching goal of this dissertation is to develop and deploy machine learning frameworks that are suitable to model the various components of a wireless resource management system that must provide reliable and low latency service to the users. First, by explicitly modeling the limitations of the human brain, a concrete measure for the delay perception of human users in a wireless network is introduced. Then, a new probabilistic model for this delay perception is learned based on the brain features of a human user. Given the learned model for the delay perception of the human brain, a brain-aware resource management algorithm is proposed for allocating radio resources to human users while minimizing the transmit power and taking into account the reliability of both machine type devices and human users. Next, a novel experienced deep reinforcement learning (deep-RL) framework is proposed to provide model-free resource allocation for ultra reliable low latency communication (URLLC) in the downlink of a wireless network. The proposed, experienced deep-RL framework can guarantee high end-to-end reliability and low end-to-end latency, under explicit data rate constraints, for each wireless user without any models of or assumptions on the users' traffic. In particular, in order to enable the deep-RL framework to account for extreme network conditions and operate in highly reliable systems, a new approach based on generative adversarial networks (GANs) is proposed. After that, the problem of network slicing is studied in the context of a wireless system having a time-varying number of users that require two types of slices: reliable low latency (RLL) and self-managed (capacity limited) slices. To address this problem, a novel control framework for stochastic optimization is proposed based on the Lyapunov drift-plus-penalty method. This new framework enables the system to minimize power, maintain slice isolation, and provide reliable and low latency end-to-end communication for RLL slices. Then, a novel concept of three-dimensional (3D) cellular networks, that integrate drone base stations (drone-BS) and cellular-connected drone users (drone-UEs), is introduced. For this new 3D cellular architecture, a novel framework for network planning for drone-BSs as well as latency-minimal cell association for drone-UEs is proposed. For network planning, a tractable method for drone-BSs' deployment based on the notion of truncated octahedron shapes is proposed that ensures full coverage for a given space with minimum number of drone-BSs. In addition, to characterize frequency planning in such 3D wireless networks, an analytical expression for the feasible integer frequency reuse factors is derived. Subsequently, an optimal 3D cell association scheme is developed for which the drone-UEs' latency, considering transmission, computation, and backhaul delays, is minimized. Finally, the concept of super environments is introduced. After formulating this concept mathematically, it is shown that any two markov decision process (MDP) can be a member of a super environment if sufficient additional state space is added. Then the effect of this additional state space on model-free and model-based deep-RL algorithms is investigated. Next, the tradeoff caused by adding the extra state space on the speed of convergence and the optimality of the solution is discussed. In summary, this dissertation led to the development of machine learning algorithms for statistically modeling complex parts in the resource management system. Also, it developed a model-free controller that can control the resource management system reliably, with low latency, and optimally. / Doctor of Philosophy / Next-generation wireless networks must support a plethora of new applications ranging from the Internet of Things to virtual reality. Each one of these emerging applications have unique requirements that substantially differ from traditional services such as video streaming. Hence, there is a need for designing a new and efficient resource management framework that is taking into account different components that can affect the resource usage, including less obvious factors such as human behavior that contributes to the resource usage of the system. The use of machine learning for modeling mentioned components in a resource management system is a promising solution. This is because of the data-driven nature of machine learning algorithms that can help us to model many hidden factors that might contribute to the resource usage pattern of users or devices. These hidden factors can only be modeled using an end-to-end machine learning solution. By end-to-end, we mean the system only relies on its observation of the quality of service (QoS) for users. Therefore, machine learning algorithms can be used either for modeling a complex factor such as the human brain's delay perception or for designing an end-to-end resource management system. The overarching goal of this dissertation is to develop and deploy machine learning frameworks that are suitable to model the various components of a wireless resource management system that must provide reliable and low latency service to the users. Machine Learning Deep Reinforcement Learning Wireless Resource Management
245	Deep Reinforcement Learning for Next Generation Wireless Networks with Echo State Networks Chang, Hao-Hsuan 26 August 2021 (has links) This dissertation considers a deep reinforcement learning (DRL) setting under the practical challenges of real-world wireless communication systems. The non-stationary and partially observable wireless environments make the learning and the convergence of the DRL agent challenging. One way to facilitate learning in partially observable environments is to combine recurrent neural network (RNN) and DRL to capture temporal information inherent in the system, which is referred to as deep recurrent Q-network (DRQN). However, training DRQN is known to be challenging requiring a large amount of training data to achieve convergence. In many targeted wireless applications in the 5G and future 6G wireless networks, the available training data is very limited. Therefore, it is important to develop DRL strategies that are capable of capturing the temporal correlation of the dynamic environment that only requires limited training overhead. In this dissertation, we design efficient DRL frameworks by utilizing echo state network (ESN), which is a special type of RNNs where only the output weights are trained. To be specific, we first introduce the deep echo state Q-network (DEQN) by adopting ESN as the kernel of deep Q-networks. Next, we introduce federated ESN-based policy gradient (Fed-EPG) approach that enables multiple agents collaboratively learn a shared policy to achieve the system goal. We designed computationally efficient training algorithms by utilizing the special structure of ESNs, which have the advantage of learning a good policy in a short time with few training data. Theoretical analyses are conducted for DEQN and Fed-EPG approaches to show the convergence properties and to provide a guide to hyperparameter tuning. Furthermore, we evaluate the performance under the dynamic spectrum sharing (DSS) scenario, which is a key enabling technology that aims to utilize the precious spectrum resources more efficiently. Compared to a conventional spectrum management policy that usually grants a fixed spectrum band to a single system for exclusive access, DSS allows the secondary system to dynamically share the spectrum with the primary system. Our work sheds light on the real deployments of DRL techniques in next generation wireless systems. / Doctor of Philosophy / Model-free reinforcement learning (RL) algorithms such as Q-learning are widely used because it can learn the policy directly through interactions with the environment without estimating a model of the environment, which is useful when the underlying system model is complex. Q-learning performs poorly for large-scale models because the training has to updates every element in a large Q-table, which makes training difficult or even impossible. Therefore, deep reinforcement learning (DRL) exploits the powerful deep neural network to approximate the Q-table. Furthermore, a deep recurrent Q-network (DRQN) is introduced to facilitate learning in partially observable environments. However, DRQN training requires a large amount of training data and a long training time to achieve convergence, which is impractical in wireless systems with non-stationary environments and limited training data. Therefore, in this dissertation, we introduce two efficient DRL approaches: deep echo state Q-network (DEQN) and federated ESN-based policy gradient (Fed-EPG) approaches. Theoretical analyses of DEQN and Fed-EPG are conducted to provide the convergence properties and the guideline for designing hyperparameters. We evaluate and demonstrate the performance benefits of the DEQN and Fed-EPG under the dynamic spectrum sharing (DSS) scenario, which is a critical technology to efficiently utilize the precious spectrum resources in 5G and future 6G wireless networks. Deep Reinforcement Learning Echo State Network Dynamic Spectrum Sharing
246	Understanding social function in psychiatric illnesses through computational modeling and multiplayer games Cui, Zhuoya 26 May 2021 (has links) Impaired social functioning conferred by mental illnesses has been constantly implicated in previous literatures. However, studies of social abnormalities in psychiatric conditions are often challenged by the difficulties of formalizing dynamic social exchanges and quantifying their neurocognitive underpinnings. Recently, the rapid growth of computational psychiatry as a new field along with the development of multiplayer economic paradigms provide powerful tools to parameterize complex interpersonal processes and identify quantitative indicators of social impairments. By utilizing these methodologies, the current set of studies aimed to examine social decision making during multiplayer economic games in participants diagnosed with depression (study 1) and combat-related post-traumatic stress disorder (PTSD, study 2), as well as an online population with elevated symptoms of borderline personality disorder (BPD, study 3). We then quantified and disentangled the impacts of multiple latent decision-making components, mainly social valuation and social learning, on maladaptive social behavior via explanatory modeling. Different underlying alterations were revealed across diagnoses. Atypical social exchange in depression and BPD were found attributed to altered social valuation and social learning respectively, whereas both social valuation and social learning contributed to interpersonal dysfunction in PTSD. Additionally, model-derived indices of social abnormalities positively correlated with levels of symptom severity (study 1 and 2) and exhibited a longitudinal association with symptom change (study 1). Our findings provided mechanistic insights into interpersonal difficulties in psychiatric illnesses, and highlighted the importance of a computational understanding of social function which holds potential clinical implications in differential diagnosis and precise treatment. / Doctor of Philosophy / People with psychiatric conditions often suffer from impaired social relationships due to an inability to engage in everyday social interactions. As different illnesses can sometimes produce the same symptoms, social impairment can also have different causes. For example, individuals who constantly avoid social activities may find them less interesting or attempt to avoid potential negative experiences. While those who display elevated aggression may have a strong desire for social dominance or falsely believe that others are also aggressive. However, it is hard to infer what drives these alterations by just observing the behavior. To address this question, we enrolled people with three different kinds of psychopathology to play an interactive game together with another player and mathematically modeled their latent decision-making processes. By comparing their model parameters to those of the control population, we were able to infer how people with psychopathology made the decisions and which part of the decision-making processes went wrong that led to disrupted social interactions. We found altered model parameters differed among people with major depression, post-traumatic stress disorder and borderline personality disorder, suggesting different causes underlying impaired social behavior observed in the game, the extent of which also positively correlated with their psychiatric symptom severity. Understanding the reasons behind social dysfunctions associated with psychiatric illnesses can help us better differentiate people with different diagnoses and design more effective treatments to restore interpersonal relationships. computational psychiatry behavioral economics reinforcement learning social decision making
247	ACADIA: Efficient and Robust Adversarial Attacks Against Deep Reinforcement Learning Ali, Haider 05 January 2023 (has links) Existing adversarial algorithms for Deep Reinforcement Learning (DRL) have largely focused on identifying an optimal time to attack a DRL agent. However, little work has been explored in injecting efficient adversarial perturbations in DRL environments. We propose a suite of novel DRL adversarial attacks, called ACADIA, representing AttaCks Against Deep reInforcement leArning. ACADIA provides a set of efficient and robust perturbation-based adversarial attacks to disturb the DRL agent's decision-making based on novel combinations of techniques utilizing momentum, ADAM optimizer (i.e., Root Mean Square Propagation or RMSProp), and initial randomization. These kinds of DRL attacks with novel integration of such techniques have not been studied in the existing Deep Neural Networks (DNNs) and DRL research. We consider two well-known DRL algorithms, Deep-Q Learning Network (DQN) and Proximal Policy Optimization (PPO), under Atari games and MuJoCo where both targeted and non-targeted attacks are considered with or without the state-of-the-art defenses in DRL (i.e., RADIAL and ATLA). Our results demonstrate that the proposed ACADIA outperforms existing gradient-based counterparts under a wide range of experimental settings. ACADIA is nine times faster than the state-of-the-art Carlini and Wagner (CW) method with better performance under defenses of DRL. / Master of Science / Artificial Intelligence (AI) techniques such as Deep Neural Networks (DNN) and Deep Reinforcement Learning (DRL) are prone to adversarial attacks. For example, a perturbed stop sign can force a self-driving car's AI algorithm to increase the speed rather than stop the vehicle. There has been little work developing attacks and defenses against DRL. In DRL, a DNN-based policy decides to take an action based on the observation of the environment and gets the reward in feedback for its improvements. We perturb that observation to attack the DRL agent. There are two main aspects to developing an attack on DRL. One aspect is to identify an optimal time to attack (when-to-attack?). The second aspect is to identify an eﬀicient method to attack (how-to-attack?). To answer the second aspect, we propose a suite of novel DRL adversarial attacks, called ACADIA, representing AttaCks Against Deep reInforcement leArning. We consider two well-known DRL algorithms, Deep-Q Learning Network (DQN) and Proximal Policy Optimization (PPO), under DRL environments of Atari games and MuJoCo where both targeted and non-targeted attacks are considered with or without state-of-the-art defenses. Our results demonstrate that the proposed ACADIA outperforms state-of-the-art perturbation methods under a wide range of experimental settings. ACADIA is nine times faster than the state-of-the-art Carlini and Wagner (CW) method with better performance under the defenses of DRL. Secure AI Deep Reinforcement Learning Adversarial Learning Adversarial Attacks
248	A Reinforcement Learning-based Scheduler for Minimizing Casualties of a Military Drone Swarm Jin, Heng 14 July 2022 (has links) In this thesis, we consider a swarm of military drones flying over an unfriendly territory, where a drone can be shot down by an enemy with an age-based risk probability. We study the problem of scheduling surveillance image transmissions among the drones with the objective of minimizing the overall casualty. We present Hector, a reinforcement learning-based scheduling algorithm. Specifically, Hector only uses the age of each detected target, a piece of locally available information at each drone, as an input to a neural network to make scheduling decisions. Extensive simulations show that Hector significantly reduces casualties than a baseline round-robin algorithm. Further, Hector can offer comparable performance to a high-performing greedy scheduler, which assumes complete knowledge of global information. / Master of Science / Drones have been successfully deployed by the military. The advancement of machine learning further empowers drones to automatically identify, recognize, and even eliminate adversary targets on the battlefield. However, to minimize unnecessary casualties to civilians, it is important to introduce additional checks and control from the control center before lethal force is authorized. Thus, the communication between drones and the control center becomes critical. In this thesis, we study the problem of communication between a military drone swarm and the control center when drones are flying over unfriendly territory where drones can be shot down by enemies. We present Hector, an algorithm based on machine learning, to minimize the overall casualty of drones by scheduling data transmission. Extensive simulations show that Hector significantly reduces casualties than traditional algorithms. Drone swarm Casualty Scheduling Age of Information Reinforcement learning
249	Considerations of Reinforcement Learning within Real-Time Wireless Communication Systems Jones, Alyse M. 15 June 2022 (has links) Afflicted heavily by spectrum congestion, the unpredictable, dynamic conditions of the radio frequency (RF) spectrum has increasingly become a major obstacle for devices today. More specifically, a significant threat existing within this kind of environment is interference caused by collisions, which is increasingly unavoidable in an overcrowded spectrum. Thus, these devices require a way to avoid such events. Cognitive radios (CR) were proposed as a solution through its transmission adaptability and decision-making capabilities within a radio. Through spectrum sensing, CRs are able to capture the current condition of the RF spectrum and based on its decision-making strategy, interpret these results to make an informed decision on what to do next to optimize its own communication. With the emergence of artificial intelligence, one such decision-making strategy CRs can utilize is Reinforcement Learning (RL). Unlike standard adaptive radios, CRs equipped with RL can predict the conditions of the RF spectrum, and using these predictions, understand what it must do in the future to operate optimally. Recognizing the usefulness of RL in hard-to-predict environments, such as the RF spectrum, research of RL within CRs have become more popular over the past decade, especially for interference mitigation. However, the existing literature neglects to confront the possible limitations that pose a threat to the proper implementation of RL in RF systems. Therefore, this thesis is motivated to investigate what limitations in real-time communication systems can hinder the performance of RL, and as a result of these limitations, emphasize the considerations that should be a focus in the design and implementation of radio frequency reinforcement learning (RFRL) systems. The effects of latency, power, wireless channel impairments, different transmission protocols, and different spectrum sensing detectors are among the possible limitations simulated and analyzed within this work that are not typically considered within simulation-based prior art. To perform this investigation, a representative real-time OFDM transmit/receive chain is implemented within the GNU Radio framework. The system, operating over-the-air through USRPs, leverages reinforcement learning, e.g. Q-Learning, in order to avoid interference with other spectrum users. Performance analysis of this representative system provides a systematic approach for helping to predict limiting factors within an implemented real-time system and thus, aim to provide guidance on how to design these systems with these practical limitations in mind. / M.S. / Because the space in which communication signals travel is congested with activity, collisions among signals, called interference, is becoming more of a problem in wireless communications. Therefore, to avoid such an occurrence, intelligent radios are used to adapt communication devices to operate optimally within this congested space. With the emergence of artificial intelligence, where devices can learn on their own how to adapt, one such way an intelligent radio can dynamically adapt to the congestion is through Reinforcement Learning (RL), which enables prediction of signal activity within the communication space over time. Intelligent radios equipped with RL learn through trial-and-error how to operate optimally within the communication space to avoid places within the communication space that are busy and congested. Recognizing the usefulness of RL in hard-to-predict environments, research of RL within intelligent radios have become more popular over the past decade, especially for navigating a communication space that is congested and where collisions are common. However, existing literature neglects to confront the possible limitations that pose a threat to the proper implementation of RL in communication systems. Therefore, this thesis is motivated to investigate what limitations in real-world communication systems can hinder the performance of RL, and as a result of these limitations, emphasize the considerations that should be a focus in the design and implementation of communication systems equipped with RL. Effects, such as delays in the system, differences in how the signal operates, and how the signal is affected while it is traveling, are among the possible limitations simulated and analyzed within this work that are not typically considered within prior art. To perform this investigation, a modern representative communication system was implemented within software-enabled radios. The system leverages reinforcement learning in order to avoid collisions with other signals in the communication space. Performance analysis of this representative system provides a systematic approach for helping to predict limiting factors within an implemented real-world communication system with RL and thus, aim to provide guidance on how to design these systems with these practical limitations in mind. Wireless Communications Reinforcement Learning Intelligent Radio Spectrum Avoidance
250	Directional Airflow for HVAC Systems Abedi, Milad January 2019 (has links) Directional airflow has been utilized to enable targeted air conditioning in cars and airplanes for many years, where the occupants could adjust the direction of flow. In the building sector however, HVAC systems are usually equipped with stationary diffusors that can only supply the air either in the form of diffusion or with fixed direction to the room in which they have been installed. In the present thesis, the possibility of adopting directional airflow in lieu of the conventional uniform diffusors has been investigated. The potential benefits of such a modification in control capabilities of the HVAC system in terms of improvements in the overall occupant thermal comfort and energy consumption of the HVAC system have been investigated via a simulation study and an experimental study. In the simulation study, an average of 59% per cycle reduction was achieved in the energy consumption. The reduction in the required duration of airflow (proportional to energy consumption) in the experimental study was 64% per cycle. The feasibility of autonomous control of the directional airflow, has been studied in a simulation experiment by utilizing the Reinforcement Learning algorithm which is an artificial intelligence approach that facilitates autonomous control in unknown environments. In order to demonstrate the feasibility of enabling the existing HVAC systems to control the direction of airflow, a device (called active diffusor) was designed and prototyped. The active diffusor successfully replaced the existing uniform diffusor and was able to effectively target the occupant positions by accurately directing the airflow jet to the desired positions. / M.S. / The notion of adjustable direction of airflow has been used in the car industry and airplanes for decades, enabling the users to manually adjust the direction of airflow to their satisfaction. However, in the building the introduction of the incoming airflow to the environment of the room is achieved either by non-adjustable uniform diffusors, aiming to condition the air in the environment in a homogeneous manner. In the present thesis, the possibility of adopting directional airflow in place of the conventional uniform diffusors has been investigated. The potential benefits of such a modification in control capabilities of the HVAC system in terms of improvements in the overall occupant thermal comfort and energy consumption of the HVAC system have been investigated via a simulation study and an experimental study. In the simulation study, an average of 59% per cycle reduction was achieved in the energy consumption. The reduction in the required duration of airflow (proportional to energy consumption) in the experimental study was 64% per cycle on average. The feasibility of autonomous control of the directional airflow, has been studied in a simulation experiment by utilizing the Reinforcement Learning algorithm which is an artificial intelligence approach that facilitates autonomous control in unknown environments. In order to demonstrate the feasibility of enabling the existing HVAC systems to control the direction of airflow, a device (called active diffusor) was designed and prototyped. The active diffusor successfully replaced the existing uniform diffusor and was able to effectively target the occupant positions by accurately directing the airflow jet to the desired positions. HVAC Thermal Comfort Reinforcement Learning Building Energy Consumption

Search results