About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
211

Reinforcement Learning of Repetitive Tasks for Autonomous Heavy-Duty Vehicles

Lindesvik Warma, Simon January 2020 (has links)
Many industrial applications of heavy-duty autonomous vehicles involve repetitive manoeuvres, such as vehicle parking and hub-to-hub transportation. This thesis explores the possibility of using information from previous executions of specific manoeuvres, via reinforcement learning, to improve performance in future iterations. The manoeuvres are a straight-line path and a constantly curved path. A proportional-integral control strategy is designed to control the vehicle, and the controller is updated between iterations using a policy gradient method. A rejection sampling procedure is introduced to enforce the stability of the control system; this is necessary since the general reinforcement learning and policy gradient frameworks do not consider stability. The performance of the rejection sampling procedure is improved using ideas from simulated annealing. The performance improvement of the vehicle is evaluated through simulations: linear and nonlinear vehicle models are evaluated on a straight-line path and a constantly curved path. The simulations show that the vehicle improves its ability to track the reference path for all evaluation models and scenarios. Finally, the simulations also show that the controlled system remains stable throughout the learning process. / Autonomous vehicles are an important piece of the puzzle in future transport solutions and industrial environments, both in terms of climate and safety. Many manoeuvres that industrial vehicles perform are repetitive, for example parking. This work explores the possibility of learning from previous attempts at a manoeuvre in order to improve the vehicle's ability to perform it. A proportional-integral control structure is used to steer the vehicle; it is a state feedback in which the controller consists of two proportional-integral controllers. The control system is initialized to be stable, and the vehicle is allowed to perform one iteration of the manoeuvre. The controller is updated between each iteration using reinforcement learning, meaning that information from previous attempts at the manoeuvre is used to improve the vehicle's ability to follow the reference path; the reinforcement learning thus provides instructions on how the controller should be updated based on how the vehicle performed during the previous iteration. A sampling procedure is implemented to ensure the stability of the control system, since the reinforcement learning does not take this into account; the sampling procedure is also designed to minimize negative effects on the learning process. The algorithm is analysed by simulating the vehicle with both linear and nonlinear evaluation models in two scenarios: a straight path and a path with constant curvature. The simulations show that the vehicle improves its ability to follow the reference paths for all evaluation models, and that the control system remains stable throughout the learning process.
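To make the approach concrete, here is a minimal sketch of the kind of iteration-wise update the abstract describes: PI gains adjusted by a policy-gradient step, with a rejection step that discards candidates failing a closed-loop stability check. The scalar error model, gain values, and learning rates below are illustrative assumptions, not the thesis's vehicle model or exact algorithm.

```python
import numpy as np

# Toy scalar lateral-error model (an assumption, not the thesis's model):
# e[k+1] = a*e[k] + b*u[k], with an integrator state z for the PI controller.
a, b, dt = 0.98, 0.1, 0.05

def is_stable(kp, ki):
    # Spectral radius of the augmented (error, integrator) closed loop.
    M = np.array([[a - b * kp, -b * ki],
                  [dt,          1.0  ]])
    return np.max(np.abs(np.linalg.eigvals(M))) < 1.0

def episode_cost(kp, ki, steps=200):
    e, z, cost = 1.0, 0.0, 0.0
    for _ in range(steps):
        u = -kp * e - ki * z          # PI feedback on the tracking error
        e = a * e + b * u
        z += dt * e
        cost += e * e
    return cost

theta = np.array([0.5, 0.1])          # initial stabilizing gains [kp, ki]
sigma, lr = 0.05, 1e-3                # exploration noise, gradient step size
rng = np.random.default_rng(0)
for it in range(50):
    # Rejection sampling: resample perturbed gains until a stable candidate
    # is found. The thesis additionally anneals this step (simulated
    # annealing); that refinement is omitted here.
    while True:
        cand = theta + sigma * rng.standard_normal(2)
        if is_stable(*cand):
            break
    # Finite-difference / score-function style gradient estimate.
    grad = (episode_cost(*cand) - episode_cost(*theta)) * (cand - theta) / sigma**2
    new_theta = theta - lr * grad
    if is_stable(*new_theta):         # accept the update only if it stays stable
        theta = new_theta
print("learned gains:", theta)
```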
212

Intelligent Device Selection in Federated Edge Learning with Energy Efficiency

Peng, Cheng 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Due to the increasing demand from mobile devices for real-time response from cloud computing services, federated edge learning (FEL) has emerged as a new computing paradigm that utilizes edge devices to achieve efficient machine learning while protecting their data privacy. Implementing efficient FEL is challenged by devices' limited computing and communication resources, as well as unevenly distributed datasets, which has inspired several studies focusing on device selection to optimize time consumption and data diversity. However, these studies fail to consider the energy consumption of edge devices given their limited power supply, which can seriously affect the cost-efficiency of FEL through unexpected device dropouts. To fill this gap, we propose a device selection model capturing both energy consumption and data diversity optimization, under constraints on time consumption and training data amount. We then solve the optimization problem by reformulating the original model and designing a novel algorithm, named E2DS, which greatly reduces the time complexity. Comparing against two classical FEL schemes, we validate the superiority of our proposed device selection mechanism with extensive experimental results. Furthermore, in a real FEL environment, multiple tasks occupy each device's CPU at the same time, so the CPU frequency available for training fluctuates constantly, which may lead to large errors in estimating energy consumption. To solve this problem, we deploy reinforcement learning to learn the frequency so that the estimate approaches its real value. And rather than increasing data diversity, we consider a more direct way to improve convergence speed, using loss values. We then formulate an optimization problem that minimizes energy consumption and maximizes loss values to select an appropriate set of devices. After reformulating the problem, we design a new algorithm, FCE2DS, which achieves better convergence speed and accuracy. Finally, we compare the proposed scheme with the previous and traditional schemes to verify its improvement in multiple aspects.
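The abstract does not spell out E2DS itself, so the following is only a toy sketch of energy- and diversity-aware device selection under a synchronous round deadline: devices slower than the deadline are excluded, and the rest are added greedily by marginal label-coverage gain per joule until an energy budget is exhausted. All names, scores, and budgets are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 20
energy = rng.uniform(1.0, 5.0, N)      # estimated training energy per device (toy)
latency = rng.uniform(0.5, 2.0, N)     # estimated local round time per device (toy)
labels = [set(rng.choice(10, size=rng.integers(2, 6), replace=False))
          for _ in range(N)]           # label classes held by each device

def coverage(chosen):
    # Toy data-diversity proxy: number of distinct label classes covered.
    return len(set().union(*(labels[i] for i in chosen))) if chosen else 0

def select_devices(deadline=1.5, energy_budget=12.0):
    chosen, spent = [], 0.0
    candidates = [i for i in range(N) if latency[i] <= deadline]
    while True:
        best, best_score = None, 0.0
        for i in candidates:
            if i in chosen or spent + energy[i] > energy_budget:
                continue
            gain = coverage(chosen + [i]) - coverage(chosen)
            score = (gain + 1e-9) / energy[i]   # diversity gain per joule
            if score > best_score:
                best, best_score = i, score
        if best is None:
            return chosen
        chosen.append(best)
        spent += energy[best]

print("selected devices:", select_devices())
```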
213

Multi-criteria decision making using reinforcement learning and its application to food, energy, and water systems (FEWS) problem

Deshpande, Aishwarya 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Multi-criteria decision making (MCDM) methods have evolved over the past several decades. In today's world of rapidly growing industries, MCDM has proven significant in many application areas. In this study, a decision-making model is devised using reinforcement learning to solve multi-criteria optimization problems. A learning automata algorithm is used to identify an optimal solution in the presence of single and multiple environments (criteria) using Pareto optimality. An application of this model is also discussed, in which it provides an optimal solution to the food, energy, and water systems (FEWS) problem.
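As a concrete illustration of the learning automata component (single environment only; the multi-environment Pareto layer is beyond a short sketch), here is the classical linear reward-inaction (L_RI) update on a toy stochastic environment. The reward probabilities and step size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_actions = 4
p = np.full(n_actions, 1.0 / n_actions)   # action-probability vector
step_size = 0.05

# Toy environment: each action is rewarded with an unknown fixed probability.
reward_prob = np.array([0.2, 0.8, 0.5, 0.4])

for _ in range(5000):
    a = rng.choice(n_actions, p=p)
    if rng.random() < reward_prob[a]:
        # Linear reward-inaction: shift probability mass toward the
        # rewarded action; do nothing on penalty.
        p = (1 - step_size) * p
        p[a] += step_size

print("final action probabilities:", np.round(p, 3))  # mass concentrates on action 1
```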
214

Optimal Control and Reinforcement Learning of Switched Systems

Chen, Hua January 2018 (has links)
No description available.
215

Reinforcement Learning For Multiple Time Series

Singh, Isha January 2019 (has links)
No description available.
216

A Classification and Visualization System for Lower-Limb Activities Analysis With Musculoskeletal Modeling

Zheng, Jianian 01 June 2020 (has links)
No description available.
217

Market Timing strategy through Reinforcement Learning

HE, Xuezhong January 2021 (has links)
This dissertation implements an optimal trading strategy based on machine learning and extreme value theory (EVT) to obtain an excess return on investments in the capital market. The trading strategy outperforms the benchmark S&P 500 index with higher returns and lower volatility through effective market timing. The dissertation begins by modeling market tail risk using EVT and reinforcement learning, distinguishing the approach from the traditional value-at-risk method. EVT is used to extract the characteristics of the tail risk, which serve as inputs for reinforcement learning. This process proves effective for market timing: the trading strategy avoids market crashes and achieves a long-term excess return. In sum, this study makes several contributions. First, it takes a new approach to analyzing stock prices (the S&P 500 index serves as the stock throughout), combining EVT and reinforcement learning to study price tail risk and predict stock crashes efficiently, which is a new method for tail risk research; the model can predict a crash or provide the probability of risk, and a trading strategy can then be built on it. Second, the dissertation provides a dynamic market-timing trading strategy that significantly outperforms the market index with lower volatility and a higher Sharpe ratio; moreover, the dynamic trading process gives investors an intuitive sense of the stock market and helps in decision-making. Third, the success of the strategy shows that the combination of EVT and reinforcement learning can predict stock crashes well, which is a substantial improvement in extreme-event research and deserves further study. / Business Administration/Finance
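A hedged sketch of the EVT step the abstract alludes to: fitting a generalized Pareto distribution to losses beyond a high threshold (peaks-over-threshold) and deriving tail features that an RL agent could consume as state inputs. The synthetic returns, threshold choice, and feature set are assumptions, not the dissertation's actual data or features.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(2)
returns = 0.01 * rng.standard_t(df=4, size=2000)   # synthetic heavy-tailed returns

# Peaks-over-threshold: model losses beyond a high quantile with a
# generalized Pareto distribution (standard EVT practice).
losses = -returns
u = np.quantile(losses, 0.95)                      # threshold (an assumption)
exceedances = losses[losses > u] - u
xi, _, beta = genpareto.fit(exceedances, floc=0)   # tail shape and scale

# Tail features of the kind an RL agent's state could include:
zeta = float((losses > u).mean())                  # exceedance frequency
var99 = u + beta / xi * ((0.01 / zeta) ** (-xi) - 1)  # 99% VaR via the POT formula
print(f"shape xi={xi:.3f}, scale beta={beta:.4f}, VaR(99%)={var99:.4f}")
```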
218

Enhancing Traffic Efficiency of Mixed Traffic Using Control-based and Learning-based Connected and Autonomous Systems

Young Joun Ha (8065802) 15 August 2023 (has links)
Inefficient traffic operations have far-reaching consequences not just for travel mobility but also for public health, social equity, and economic prosperity. Congestion, a key symptom of inefficient operations, can lead to increased emissions, accidents, and productivity loss. Therefore, advancements and policies in transportation engineering require careful scrutiny, not only to prevent unintended adverse consequences but also to capitalize on opportunities for improvement. In spite of significant efforts to enhance traffic mobility and safety, human control of vehicles remains prone to errors such as inattention, impatience, and intoxication. Connected and autonomous vehicles (CAVs) are seen as a great opportunity to address human-related inefficiencies. This dissertation focuses on connectivity and automation and investigates the synergies between these technologies. First, a deep reinforcement learning (DRL) based strategy is proposed to enable agents to address the dynamic nature of inputs in traffic environments, to capture proximal and distant information, and to facilitate learning in rapidly changing traffic. The strategy is applied to alleviate congestion at highway bottlenecks by training a small number of CAVs to cooperatively reduce congestion through DRL. Second, to address congestion at intersections, the dissertation introduces a fog-based graphic RL (FG-RL) framework. This approach allows traffic signals across multiple intersections to form a cooperative coalition, sharing information for signal-timing prescriptions. Large-scale traffic signal optimization is computationally inefficient, so the proposed FG-RL approach breaks networks down into smaller fog nodes that function as local centralization points within a decentralized system. Doing so allows a bottom-up solution approach for decomposing large traffic networks. Furthermore, the dissertation pioneers the notion of using a small CAV fleet, selected from any existing shared autonomous mobility service (SAMS), to assist city road agencies in achieving string-stable driving in locally congested urban traffic. These vehicles are dispersed throughout the network to perform their primary function of providing ride-share services. However, when they encounter severe congestion, they cooperate to be rerouted and to undertake traffic-stabilizing maneuvers that smooth the traffic and reduce congestion. The smoothing behavior is learned through DRL, while the rerouting is facilitated through the proposed constrained entropy-based dynamic AV routing algorithm (CEDAR).
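The FG-RL decomposition can be pictured with a toy sketch: a grid of signalized intersections partitioned into "fog nodes", each acting as a local centralization point for one RL agent. The grid layout and group size below are assumptions, not the dissertation's network.

```python
# Toy bottom-up decomposition of a grid of signalized intersections into
# fog nodes (local coordination groups), in the spirit of FG-RL. The 6x4
# grid and 2x2 grouping are illustrative assumptions.
GRID_W, GRID_H, BLOCK = 6, 4, 2

def fog_node_of(x, y):
    # Each fog node centralizes a BLOCK x BLOCK patch of intersections.
    return (x // BLOCK, y // BLOCK)

fog_members = {}
for x in range(GRID_W):
    for y in range(GRID_H):
        fog_members.setdefault(fog_node_of(x, y), []).append((x, y))

# One RL agent per fog node would prescribe signal timings for its members
# and share boundary information with neighboring fog nodes.
for node, members in sorted(fog_members.items()):
    print(f"fog node {node}: {members}")
```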
219

Reinforcement Learning assisted Adaptive difficulty of Proof of Work (PoW) in Blockchain-enabled Federated Learning

Sethi, Prateek 10 August 2023 (has links)
This work addresses the challenge of heterogeneity in blockchain mining, particularly in the context of consortium and private blockchains. The motivation stems from the need to ensure fairness and efficiency in the Proof of Work (PoW) consensus mechanism of blockchain technology. Existing consensus algorithms, such as PoW, PoS, and PoB, have succeeded in public blockchains but face challenges when miners are heterogeneous. This thesis highlights the significance of considering miners' computing power and resources in PoW consensus mechanisms to enhance efficiency and fairness. It explores the implications of heterogeneity in blockchain mining in various applications, such as Federated Learning (FL), which aims to train machine learning models collaboratively across distributed devices. The research objectives of this work involve developing novel RL-based techniques to address the heterogeneity problem in consortium blockchains. Two proposed approaches, RL-based Miner Selection (RL-MS) and RL-based Miner and Difficulty Selection (RL-MDS), focus on selecting miners and dynamically adapting the difficulty of PoW based on the computing power of the chosen miners. The contributions of this research include the proposed RL-based techniques, modifications to the Ethereum code for dynamic adaptation of Proof of Work difficulty (PoW-D), integration of the Commonwealth Cyber Initiative (CCI) xG testbed with an AI/ML framework, implementation of a simulator for experimentation, and evaluation of different RL algorithms. The research also includes additional contributions in Open Radio Access Network (O-RAN) and smart cities. The proposed research has significant implications for achieving fairness and efficiency in blockchain mining in consortium and private blockchains. By leveraging reinforcement learning techniques and accounting for the heterogeneity of miners, this work contributes to improving the consensus mechanisms and performance of blockchain-based systems. / Master of Science / Technological advancement has led to devices with powerful yet heterogeneous computational resources. Due to heterogeneity in the computing power of miner nodes in a blockchain, the PoW consensus mechanism is unfair: more powerful devices have a higher chance of mining a block and gaining from the mining process. Additionally, PoW consensus introduces delay due to mining time and block propagation time. This work uses reinforcement learning to address the challenge of heterogeneity in a private Ethereum blockchain. It also introduces a time constraint to ensure efficient blockchain performance for time-critical applications.
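To make the difficulty-adaptation idea concrete, here is a contextual-bandit-style sketch: given the selected miner's compute class, an agent learns which difficulty level keeps block times near a target, one plausible fairness criterion. The mining-time model, reward, and all constants are assumptions, not the thesis's RL-MDS design.

```python
import numpy as np

rng = np.random.default_rng(3)
hash_rates = np.array([1.0, 2.0, 4.0, 8.0])    # miner compute classes (toy units)
difficulties = np.array([1.0, 2.0, 4.0, 8.0])  # selectable PoW difficulty levels
TARGET = 1.0                                    # desired block time (toy units)

Q = np.zeros((len(hash_rates), len(difficulties)))
eps, lr = 0.1, 0.2

for _ in range(20000):
    s = rng.integers(len(hash_rates))           # compute class of the chosen miner
    a = rng.integers(len(difficulties)) if rng.random() < eps else int(np.argmax(Q[s]))
    # Toy model: mining time is exponential with mean difficulty / hash rate.
    block_time = rng.exponential(difficulties[a] / hash_rates[s])
    reward = -abs(block_time - TARGET)          # fairness: equalize block times
    Q[s, a] += lr * (reward - Q[s, a])          # contextual-bandit Q update

print("learned difficulty per compute class:", difficulties[np.argmax(Q, axis=1)])
```

Under this toy model, the learned mapping assigns higher difficulty to more powerful miners, which is exactly the equalizing behavior the thesis aims for.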
220

Online Model-Free Distributed Reinforcement Learning Approach for Networked Systems of Self-organizing Agents

Chen, Yiqing 22 December 2021 (has links)
Control of large groups of robotic agents is driven by applications in the military, aeronautics and astronautics, transportation networks, and environmental monitoring. Cooperative control of networked multi-agent systems aims at driving the behavior of the group via feedback control inputs that encode the group's dynamics based on information sharing, with inter-agent communications that can be time-varying and spatially non-uniform. Notably, local interaction rules can induce coordinated behaviour, provided the network topology is suitable. Distributed learning paradigms are often necessary for this class of systems to operate autonomously and robustly, without the need for external units providing centralized information. In contrast to model-based protocols, which can be computationally prohibitive due to their mathematical complexity and feedback-information requirements, we present an online model-free algorithm for a class of nonlinear tracking problems with unknown system dynamics. This method prescribes the actuation forces of agents so that they follow the time-varying trajectory of a moving target. The tracking problem is addressed by an online value iteration process that uses measurements collected along the trajectories. A set of simulations illustrates that the presented algorithm functions well in various reference-tracking scenarios.
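As a loose single-agent illustration of learning to track from trajectory measurements alone (the thesis's method is a distributed, networked value iteration; this tabular sketch shows only the model-free ingredient), consider a double-integrator agent tracking a sinusoidal reference via Q-learning. The dynamics, discretization, and reference signal are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
dt, forces = 0.1, np.array([-1.0, 0.0, 1.0])   # discrete actuation forces (toy)
NBINS, LIM = 15, 3.0

def bin_state(err, vel):
    # Discretize (tracking error, velocity) into a tabular state.
    f = lambda v: int(np.clip((v + LIM) / (2 * LIM) * NBINS, 0, NBINS - 1))
    return f(err), f(vel)

Q = np.zeros((NBINS, NBINS, len(forces)))
eps, lr, gamma = 0.2, 0.1, 0.95

for episode in range(300):
    pos, vel, t = rng.uniform(-1, 1), 0.0, 0.0
    for _ in range(200):
        ref = np.sin(0.5 * t)                   # moving target's trajectory
        s = bin_state(pos - ref, vel)
        a = rng.integers(3) if rng.random() < eps else int(np.argmax(Q[s]))
        vel += dt * forces[a]                   # double-integrator agent (assumed)
        pos += dt * vel
        t += dt
        err = pos - np.sin(0.5 * t)
        s2 = bin_state(err, vel)
        r = -err * err                          # cost measured along the trajectory
        Q[s][a] += lr * (r + gamma * np.max(Q[s2]) - Q[s][a])

print("greedy value at zero error:", np.max(Q[bin_state(0.0, 0.0)]))
```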
