211

A Classification and Visualization System for Lower-Limb Activities Analysis With Musculoskeletal Modeling

Zheng, Jianian 01 June 2020 (has links)
No description available.
212

Market Timing strategy through Reinforcement Learning

HE, Xuezhong January 2021 (has links)
This dissertation implements an optimal trading strategy based on machine learning and extreme value theory (EVT) to obtain excess returns on investments in the capital market. The trading strategy outperforms the benchmark S&P 500 index with higher returns and lower volatility through effective market timing. The dissertation begins by modeling market tail risk using EVT and reinforcement learning, in contrast to the traditional value-at-risk method. I used EVT to extract the characteristics of the tail risk, which serve as inputs to the reinforcement learning process. This approach proves effective for market timing: the trading strategy avoids market crashes and achieves a long-term excess return. In sum, this study makes several contributions. First, it introduces a new method for analyzing stock prices (throughout, the S&P 500 index stands in for a single stock). I combined EVT and reinforcement learning to study price tail risk and predict stock crashes efficiently, a new approach to tail risk research. The method predicts a crash, or supplies its probability, from which a trading strategy can be built. Second, the dissertation provides a dynamic market-timing trading strategy that significantly outperforms the market index with lower volatility and a higher Sharpe ratio. Moreover, the dynamic trading process gives investors an intuitive sense of the stock market and helps in decision-making. Third, the strategy's success shows that the combination of EVT and reinforcement learning predicts stock crashes very well, a notable improvement in the study of extreme events that deserves further work. / Business Administration/Finance
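The abstract leaves the EVT feature-extraction step unspecified. As a hedged sketch of one standard choice, the snippet below fits a generalized Pareto distribution to losses beyond a high quantile (peaks-over-threshold) and returns the fitted tail parameters, which could serve as state inputs to an RL trading agent. The function name, threshold quantile, and feature set are illustrative assumptions, not the dissertation's code.

```python
import numpy as np
from scipy.stats import genpareto

def evt_tail_features(returns, quantile=0.95):
    """Fit a generalized Pareto distribution to losses beyond a high
    quantile and return its parameters as candidate RL state features."""
    losses = -returns                       # work with the loss tail
    u = np.quantile(losses, quantile)       # peaks-over-threshold cutoff
    exceedances = losses[losses > u] - u
    # shape (xi) and scale (sigma) summarize how heavy the tail is
    xi, _, sigma = genpareto.fit(exceedances, floc=0.0)
    return np.array([xi, sigma, u])

# toy usage: daily returns drawn from a heavy-tailed distribution
rng = np.random.default_rng(0)
returns = rng.standard_t(df=3, size=2000) * 0.01
print(evt_tail_features(returns))           # [shape, scale, threshold]
```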
213

Enhancing Traffic Efficiency of Mixed Traffic Using Control-based and Learning-based Connected and Autonomous Systems

Young Joun Ha (8065802) 15 August 2023 (has links)
Inefficient traffic operations have far-reaching consequences in not just travel mobility but also public health, social equity, and economic prosperity. Congestion, a key symptom of inefficient operations, can lead to increased emissions, accidents, and productivity loss. Therefore, advancements and policies in transportation engineering require careful scrutiny, not only to prevent unintended adverse consequences but also to capitalize on opportunities for improvement. In spite of significant efforts to enhance traffic mobility and safety, human control of vehicles remains prone to errors such as inattention, impatience, and intoxication. Connected and autonomous vehicles (CAVs) are seen as a great opportunity to address human-related inefficiencies. This dissertation focuses on connectivity and automation and investigates the synergies between these technologies. First, a deep reinforcement learning (DRL) based strategy is proposed to enable agents to address the dynamic nature of inputs in traffic environments, to capture proximal and distant information, and to facilitate learning in rapidly changing traffic. The strategy is applied to alleviate congestion at highway bottlenecks by training a small number of CAVs to cooperatively reduce congestion through deep reinforcement learning. Second, to address congestion at intersections, the dissertation introduces a fog-based graphic RL (FG-RL) framework. This approach allows traffic signals across multiple intersections to form a cooperative coalition, sharing information for signal timing prescriptions. Large-scale traffic signal optimization is computationally inefficient, so the proposed FG-RL approach breaks networks down into smaller fog nodes that function as local centralization points within a decentralized system. Doing so allows a bottom-up approach for decomposing large traffic networks. Furthermore, the dissertation pioneers the notion of using a small CAV fleet, selected from any existing shared autonomous mobility service (SAMS), to assist city road agencies in achieving string-stable driving in locally congested urban traffic. These vehicles are dispersed throughout the network to perform their primary function of providing ride-share services. However, when they encounter severe congestion, they cooperate to be rerouted and to undertake traffic-stabilizing maneuvers that smooth the traffic and reduce congestion. The smoothing behavior is learned through DRL, while the rerouting is facilitated through the proposed constrained entropy-based dynamic AV routing algorithm (CEDAR).
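The fog-node decomposition is described only at a high level. As a rough illustration of the bottom-up idea, the sketch below greedily groups adjacent intersections into fog nodes that could each host a local RL controller; the grouping rule, names, and sizes are assumptions for illustration, not the dissertation's FG-RL algorithm.

```python
import numpy as np

def build_fog_nodes(adjacency, node_size):
    """Greedily group neighboring intersections into fog nodes, each a
    local centralization point within a decentralized signal network."""
    n = len(adjacency)
    unassigned = set(range(n))
    fog_nodes = []
    while unassigned:
        group = [min(unassigned)]             # seed a new fog node
        unassigned.remove(group[0])
        while unassigned and len(group) < node_size:
            # absorb any unassigned intersection adjacent to the group
            nbrs = [j for j in sorted(unassigned)
                    if any(adjacency[g][j] for g in group)]
            if not nbrs:
                break
            group.append(nbrs[0])
            unassigned.remove(nbrs[0])
        fog_nodes.append(group)
    return fog_nodes

# toy corridor of six signalized intersections in a line
adj = (np.eye(6, k=1) + np.eye(6, k=-1)).astype(int)
print(build_fog_nodes(adj, node_size=3))      # -> [[0, 1, 2], [3, 4, 5]]
```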
214

Reinforcement Learning assisted Adaptive difficulty of Proof of Work (PoW) in Blockchain-enabled Federated Learning

Sethi, Prateek 10 August 2023 (has links)
This work addresses the challenge of heterogeneity in blockchain mining, particularly in consortium and private blockchains. The motivation stems from ensuring fairness and efficiency in the Proof of Work (PoW) consensus mechanism of blockchain technology. Existing consensus algorithms, such as PoW, PoS, and PoB, have succeeded in public blockchains but face challenges due to heterogeneous miners. This thesis highlights the significance of considering miners' computing power and resources in PoW consensus mechanisms to enhance efficiency and fairness. It explores the implications of heterogeneity in blockchain mining for applications such as Federated Learning (FL), which aims to train machine learning models collaboratively across distributed devices. The research objectives involve developing novel RL-based techniques to address the heterogeneity problem in consortium blockchains. Two proposed approaches, RL-based Miner Selection (RL-MS) and RL-based Miner and Difficulty Selection (RL-MDS), focus on selecting miners and dynamically adapting the PoW difficulty to the computing power of the chosen miners. The contributions of this work include the proposed RL-based techniques, modifications to the Ethereum code for dynamic adaptation of Proof of Work Difficulty (PoW-D), integration of the Commonwealth Cyber Initiative (CCI) xG testbed with an AI/ML framework, implementation of a simulator for experimentation, and evaluation of different RL algorithms. The research also includes additional contributions in Open Radio Access Network (O-RAN) and smart cities. The proposed research has significant implications for achieving fairness and efficiency in blockchain mining in consortium and private blockchains. By leveraging reinforcement learning and accounting for miner heterogeneity, this work improves the consensus mechanisms and performance of blockchain-based systems. / Master of Science / Technological advancement has led to devices with powerful yet heterogeneous computational resources. Because miner nodes in a blockchain differ in computing power, the PoW consensus mechanism is unfair: more powerful devices have a higher chance of mining a block and gaining from the mining process. The PoW consensus also introduces delay, from both mining time and block propagation time. This work uses reinforcement learning to address heterogeneity in a private Ethereum blockchain, and introduces a time constraint to ensure efficient blockchain performance for time-critical applications.
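As a hedged sketch of the adaptive-difficulty idea only (not the thesis's Ethereum implementation), the bandit-style learner below picks a PoW difficulty per miner and is rewarded for block times near a target, so more powerful miners end up assigned proportionally harder puzzles. The reward, timing model, and all constants are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
miner_power = np.array([1.0, 2.0, 4.0])        # heterogeneous compute
difficulties = np.array([1.0, 2.0, 4.0, 8.0])  # candidate PoW difficulties
TARGET = 4.0                                   # target block time (illustrative)

Q = np.zeros((len(miner_power), len(difficulties)))
alpha, eps = 0.1, 0.2

for _ in range(20000):
    m = rng.integers(len(miner_power))         # miner selected this round
    if rng.random() < eps:                     # epsilon-greedy difficulty choice
        d = rng.integers(len(difficulties))
    else:
        d = int(np.argmax(Q[m]))
    # mining time scales with difficulty and inversely with compute
    block_time = TARGET * difficulties[d] / miner_power[m] + rng.normal(0, 0.2)
    reward = -abs(block_time - TARGET)         # reward hitting the target time
    Q[m, d] += alpha * (reward - Q[m, d])      # contextual-bandit update

print(np.argmax(Q, axis=1))                    # -> [0 1 2]: harder PoW for faster miners
```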
215

Online Model-Free Distributed Reinforcement Learning Approach for Networked Systems of Self-organizing Agents

Chen, Yiqing 22 December 2021 (has links)
Control of large groups of robotic agents is driven by applications in the military, aeronautics and astronautics, transportation networks, and environmental monitoring. Cooperative control of networked multi-agent systems aims at driving the group's behavior via feedback control inputs that encode the group's dynamics based on information sharing, with inter-agent communications that can be time-varying and spatially non-uniform. Notably, local interaction rules can induce coordinated behaviour, provided suitable network topologies. Distributed learning paradigms are often necessary for this class of systems to operate autonomously and robustly, without external units providing centralized information. In contrast with model-based protocols, which can be computationally prohibitive due to their mathematical complexity and feedback-information requirements, we present an online model-free algorithm for a class of nonlinear tracking problems with unknown system dynamics. The method prescribes the actuation forces of agents so as to follow the time-varying trajectory of a moving target. The tracking problem is addressed by an online value-iteration process that requires measurements collected along the trajectories. A set of simulations illustrates that the presented algorithm performs well in various reference-tracking scenarios.
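The abstract describes an online value-iteration scheme driven by trajectory measurements. The toy sketch below captures that flavor with tabular Q-learning: an agent whose integrator dynamics are unknown to the learner acquires forces that track a sinusoidal target purely from measured transitions. The discretization, dynamics, and constants are assumptions for illustration, not the thesis's distributed algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
actions = np.array([-1.0, 0.0, 1.0])          # candidate actuation forces
bins = np.linspace(-2.0, 2.0, 21)             # discretized tracking error
Q = np.zeros((len(bins) + 1, len(actions)))
alpha, gamma, eps, dt = 0.2, 0.95, 0.1, 0.1

x, t = 0.0, 0.0
state = np.digitize(x - np.sin(t), bins)
for _ in range(50000):
    a = rng.integers(3) if rng.random() < eps else int(np.argmax(Q[state]))
    x += actions[a] * dt                      # dynamics unknown to the learner
    t += dt
    error = x - np.sin(t)                     # moving target follows sin(t)
    nxt = np.digitize(error, bins)
    # online value-iteration update from the measured transition
    Q[state, a] += alpha * (-error**2 + gamma * Q[nxt].max() - Q[state, a])
    state = nxt

print(f"tracking error at end of training: {abs(error):.2f}")
```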
216

Integrating Deterministic Planning and Reinforcement Learning for Complex Sequential Decision Making

Ernsberger, Timothy S. 07 March 2013 (has links)
No description available.
217

Learning Successful Strategies in Repeated General-sum Games

Crandall, Jacob W. 21 December 2005 (has links) (PDF)
Many environments in which an agent can use reinforcement learning techniques to learn profitable strategies are affected by other learning agents. These situations can be modeled as general-sum games. When playing repeated general-sum games with other learning agents, the goal of a self-interested learning agent is to maximize its own payoffs over time. Traditional reinforcement learning algorithms learn myopic strategies in these games. As a result, they learn strategies that produce undesirable results in many games. In this dissertation, we develop and analyze algorithms that learn non-myopic strategies when playing many important infinitely repeated general-sum games. We show that, in many of these games, these algorithms outperform existing multiagent learning algorithms. We derive performance guarantees for these algorithms (for certain learning parameters) and show that these guarantees become stronger and apply to larger classes of games as more information is observed and used by the agents. We establish these results through empirical studies and mathematical proofs.
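To make "myopic versus non-myopic" concrete (an illustration of the distinction only, not one of the dissertation's algorithms): in the iterated prisoner's dilemma against a tit-for-tat opponent, a Q-learner that discounts the future heavily learns to defect, while one that values future payoffs learns reciprocal cooperation.

```python
import numpy as np

# row player's prisoner's-dilemma payoffs; actions: 0=cooperate, 1=defect
PAYOFF = np.array([[3.0, 0.0],
                   [5.0, 1.0]])

def train(gamma, episodes=30000, alpha=0.1, eps=0.1, seed=0):
    """Q-learning vs tit-for-tat; the state is the opponent's last move,
    which for tit-for-tat is always a copy of our own previous action."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((2, 2))
    s = 0                                     # opponent starts by cooperating
    for _ in range(episodes):
        a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[s]))
        r = PAYOFF[a, s]                      # opponent plays its copied move s
        nxt = a                               # tit-for-tat mirrors us next round
        Q[s, a] += alpha * (r + gamma * Q[nxt].max() - Q[s, a])
        s = nxt
    return np.argmax(Q, axis=1)               # learned action in each state

print("myopic (gamma=0.00):     ", train(0.00))  # -> [1 1], always defect
print("far-sighted (gamma=0.95):", train(0.95))  # -> [0 0], learns cooperation
```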
218

Reinforcement Learning and Trajectory Optimization for the Concurrent Design of high-performance robotic systems

Grandesso, Gianluigi 05 July 2023 (has links)
As progress pushes the boundaries of both the performance of new hardware components and the computational capacity of modern computers, the requirements on the performance of robotic systems are becoming more and more demanding. The objective of this thesis is to demonstrate that concurrent design (Co-Design) is the approach to follow to design hardware and control for such high-performance robots. In particular, this work proposes a co-design framework and an algorithm to tackle two main issues: i) how to use Co-Design to benchmark different robotic systems, and ii) how to effectively warm-start the trajectory optimization (TO) problem underlying the co-design problem aiming at global optimality. The first contribution of this thesis is a co-design framework for the energy efficiency analysis of a redundant actuation architecture combining Quasi-Direct Drive (QDD) motors and Series Elastic Actuators (SEAs). The energy consumption of the redundant actuation system is compared to that of Geared Motors (GMs) and SEAs alone. This comparison is made considering two robotic systems performing different tasks. The results show that, using the redundant actuation, one can save up to 99% of energy with respect to SEA for sinusoidal movements. This efficiency is achieved by exploiting the coupled dynamics of the two actuators, resulting in a latching-like control strategy. The analysis also shows that these large energy savings are not straightforwardly extendable to non-sinusoidal movements, but smaller savings (e.g., 7%) are nonetheless possible. The results highlight that the combination of complex hardware morphologies and advanced numerical Co-Design can lead to peak hardware performance that would be unattainable by human intuition alone. Moreover, it is also shown how to leverage Stochastic Programming (SP) to extend a similar co-design framework to design robots that are robust to disturbances by combining TO, morphology and feedback control optimization. The second contribution is a first step towards addressing the non-convexity of complex co-design optimization problems. To this aim, an algorithm for the optimal control of dynamical systems is designed that combines TO and Reinforcement Learning (RL) in a single framework. This algorithm tackles the two main limitations of TO and RL when applied to continuous-space non-linear systems to minimize a non-convex cost function: TO can get stuck in poor local minima when the search is not initialized close to a “good” minimum, whereas the RL training process may be excessively long and strongly dependent on the exploration strategy. Thus, the proposed algorithm learns a “good” control policy via TO-guided RL policy search. Using this policy to compute an initial guess for TO, makes the trajectory optimization process less prone to converge to poor local optima. The method is validated on several reaching problems featuring non-convex obstacle avoidance with different dynamical systems. The results show the great capabilities of the algorithm in escaping local minima, while being more computationally efficient than the state-of-the-art RL algorithms Deep Deterministic Policy Gradient and Proximal Policy Optimization. The current algorithm deals only with the control side of a co-design problem, but future work will extend it to include also hardware optimization. 
All things considered, this work advances the state of the art in Co-Design, providing a framework and an algorithm to design both hardware and control for high-performance robots while aiming at global optimality.
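The dissertation learns its warm-starting policy via TO-guided RL; as a much-simplified, hedged sketch of the warm-start mechanics alone, the snippet below stores multi-start TO solutions, substitutes a nearest-neighbor lookup for the learned policy, and uses it to initialize a new TO instance on a non-convex cost. The toy dynamics, cost, and all names are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def cost(u, x0, x_goal):
    """Non-convex trajectory cost: reach x_goal under an obstacle bump."""
    x = x0 + np.cumsum(u)                     # simple integrator rollout
    obstacle = 5.0 * np.exp(-((x - 1.0) ** 2) / 0.05).sum()
    return (x[-1] - x_goal) ** 2 + 0.1 * (u ** 2).sum() + obstacle

rng = np.random.default_rng(0)
X_GOAL, H = 2.0, 5                            # goal and control horizon

# step 1: solve TO from several random guesses, keep the best per start state
data = []
for x0 in rng.uniform(-1, 0, size=20):
    best = min((minimize(cost, rng.normal(size=H), args=(x0, X_GOAL))
                for _ in range(8)), key=lambda r: r.fun)
    data.append((x0, best.x))

# step 2: stand-in "policy": nearest-neighbor lookup of stored TO solutions
def policy_guess(x0):
    i = int(np.argmin([abs(x0 - s) for s, _ in data]))
    return data[i][1]

# step 3: warm-start a new TO instance with the policy's guess
x0_new = -0.3
cold = minimize(cost, rng.normal(size=H), args=(x0_new, X_GOAL))
warm = minimize(cost, policy_guess(x0_new), args=(x0_new, X_GOAL))
print(f"cold-start cost: {cold.fun:.3f}  warm-start cost: {warm.fun:.3f}")
```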
219

Basis Construction and Utilization for Markov Decision Processes Using Graphs

Johns, Jeffrey Thomas 01 February 2010 (has links)
The ease or difficulty in solving a problem strongly depends on the way it is represented. For example, consider the task of multiplying the numbers 12 and 24. Now imagine multiplying XII and XXIV. Both tasks can be solved, but it is clearly more difficult to use the Roman numeral representations of twelve and twenty-four. Humans excel at finding appropriate representations for solving complex problems. This is not true for artificial systems, which have largely relied on humans to provide appropriate representations. The ability to autonomously construct useful representations and to efficiently exploit them is an important challenge for artificial intelligence. This dissertation builds on a recently introduced graph-based approach to learning representations for sequential decision-making problems modeled as Markov decision processes (MDPs). Representations, or basis functions, for MDPs are abstractions of the problem's state space and are used to approximate value functions, which quantify the expected long-term utility obtained by following a policy. The graph-based approach generates basis functions capturing the structure of the environment. Handling large environments requires efficiently constructing and utilizing these functions. We address two issues with this approach: (1) scaling basis construction and value function approximation to large graphs/data sets, and (2) tailoring the approximation to a specific policy's value function. We introduce two algorithms for computing basis functions from large graphs. Both algorithms work by decomposing the basis construction problem into smaller, more manageable subproblems. One method determines the subproblems by enforcing block structure, or groupings of states. The other method uses recursion to solve subproblems which are then used for approximating the original problem. Both algorithms result in a set of basis functions from which we employ basis selection algorithms. The selection algorithms represent the value function with as few basis functions as possible, thereby reducing the computational complexity of value function approximation and preventing overfitting. The use of basis selection algorithms not only addresses the scaling problem but also allows for tailoring the approximation to a specific policy. This results in a more accurate representation than obtained when using the same subset of basis functions irrespective of the policy being evaluated. To make effective use of the data, we develop a hybrid least-squares algorithm for setting basis function coefficients. This algorithm is a parametric combination of two common least-squares methods used for MDPs. We provide a geometric and analytical interpretation of these methods and demonstrate the hybrid algorithm's ability to discover improved policies. We also show how the algorithm can include graph-based regularization to help with sparse samples from stochastic environments. This work investigates all aspects of linear value function approximation: constructing a dictionary of basis functions, selecting a subset of basis functions from the dictionary, and setting the coefficients on the selected basis functions. We empirically evaluate each of these contributions in isolation and in one combined architecture.
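The graph-based basis functions in this line of work are typically the smoothest eigenvectors of the state-graph Laplacian (proto-value-function style). A minimal sketch, assuming a small chain MDP and a dense eigendecomposition — the dissertation's contribution is precisely how to scale past this naive approach:

```python
import numpy as np
from scipy.sparse import csgraph

def laplacian_basis(adjacency, k):
    """Return the k smoothest eigenvectors of the normalized graph
    Laplacian for use as value-function basis functions."""
    L = csgraph.laplacian(adjacency, normed=True)
    _, vecs = np.linalg.eigh(L)               # eigenvalues in ascending order
    return vecs[:, :k]                        # low-frequency basis functions

# toy 10-state chain MDP represented as a graph
n = 10
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

basis = laplacian_basis(A, k=4)
V = np.linspace(0.0, 1.0, n) ** 2             # a target value function
w, *_ = np.linalg.lstsq(basis, V, rcond=None) # least-squares coefficient fit
print(f"approximation error: {np.linalg.norm(basis @ w - V):.3f}")
```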
220

Autonomous Robot Skill Acquisition

Konidaris, George D 13 May 2011 (has links)
Among the most impressive aspects of human intelligence is skill acquisition—the ability to identify important behavioral components, retain them as skills, refine them through practice, and apply them in new task contexts. Skill acquisition underlies both our ability to choose to spend time and effort to specialize at particular tasks, and our ability to collect and exploit previous experience to become able to solve harder and harder problems over time with less and less cognitive effort. Hierarchical reinforcement learning provides a theoretical basis for skill acquisition, including principled methods for learning new skills and deploying them during problem solving. However, existing work focuses largely on small, discrete problems. This dissertation addresses the question of how to scale such methods up to high-dimensional, continuous domains, in order to design robots that are able to acquire skills autonomously. This presents three major challenges, and we introduce novel methods addressing each of them. First, how does an agent operating in a continuous environment discover skills? Although the literature contains several methods for skill discovery in discrete environments, it offers none for the general continuous case. We introduce skill chaining, a general skill discovery method for continuous domains. Skill chaining incrementally builds a skill tree that allows an agent to reach a solution state from any of its start states by executing a sequence (or chain) of acquired skills. We empirically demonstrate that skill chaining can improve performance over monolithic policy learning in the Pinball domain, a challenging dynamic and continuous reinforcement learning problem. Second, how do we scale up to high-dimensional state spaces? While learning in relatively small domains is generally feasible, it becomes exponentially harder as the number of state variables grows. We introduce abstraction selection, an efficient algorithm for selecting skill-specific, compact representations from a library of available representations when creating a new skill. Abstraction selection can be combined with skill chaining to solve hard tasks by breaking them up into chains of skills, each defined using an appropriate abstraction. We show that abstraction selection selects an appropriate representation for a new skill using very little sample data, and that this leads to significant performance improvements in the Continuous Playroom, a relatively high-dimensional reinforcement learning problem. Finally, how do we obtain good initial policies? The amount of experience required to learn a reasonable policy from scratch in most interesting domains is unrealistic for robots operating in the real world. We introduce CST, an algorithm for rapidly constructing skill trees (with appropriate abstractions) from sample trajectories obtained via human demonstration, a feedback controller, or a planner. We use CST to construct skill trees from human demonstration in the Pinball domain, and to extract a sequence of low-dimensional skills from demonstration trajectories on a mobile robot. The resulting skills can be reliably reproduced using a small number of example trajectories. Finally, these techniques are applied to build a mobile robot control system for the uBot-5, resulting in a mobile robot that is able to acquire skills autonomously. We demonstrate that this system is able to use skills acquired in one problem to more quickly solve a new problem.
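As a toy illustration of the chaining idea only (real skill chaining learns initiation sets and policies from experience; nothing here reflects the dissertation's implementation), the sketch below builds skills backward from a goal on a line, each skill active on the region bordering its successor's initiation region:

```python
import numpy as np

class Skill:
    """A toy option: active on [lo, hi), drives the agent toward `target`."""
    def __init__(self, lo, hi, target):
        self.lo, self.hi, self.target = lo, hi, target
    def active(self, x):
        return self.lo <= x < self.hi
    def act(self, x):
        return np.sign(self.target - x)       # unit step toward the subgoal

def chain_skills(goal, start, reach=1.0):
    """Build skills backward from the goal; each new skill's region ends
    where the previously created skill's initiation region begins."""
    skills, hi = [], goal
    while hi > start:
        lo = max(start, hi - reach)
        skills.append(Skill(lo, hi, target=hi))
        hi = lo
    return skills

skills = chain_skills(goal=5.0, start=0.0, reach=1.0)
x = 0.0
for _ in range(60):                           # execute the chain from the start
    if x >= 5.0:
        break
    skill = next(s for s in skills if s.active(x))
    x += 0.2 * skill.act(x)                   # small step toward the subgoal
print(f"final position: {x:.1f}")             # reaches the goal at 5.0
```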
