581 |
Biased Exploration in Offline Hierarchical Reinforcement Learning
Miller, Eric D. 26 January 2021
No description available.
|
582 |
Model-Free Reinforcement Learning for Hierarchical OO-MDPs
Goldblatt, John Dallan 23 May 2022
No description available.
|
583 |
ENHANCING POLICY OPTIMIZATION FOR IMPROVED SAMPLE EFFICIENCY AND GENERALIZATION IN DEEP REINFORCEMENT LEARNING
Md Masudur Rahman (19818171) 08 October 2024
The field of reinforcement learning has made significant progress in recent years, with deep reinforcement learning (RL) being a major contributor. However, there are still challenges associated with the effective training of RL algorithms, particularly with respect to sample efficiency and generalization. This thesis aims to address these challenges by developing RL algorithms capable of generalizing to unseen environments and adapting to dynamic conditions, thereby expanding the practical applicability of RL in real-world tasks. The first contribution of this thesis is the development of novel policy optimization techniques that enhance the generalization capabilities of RL agents. These techniques include the Thinker method, which employs style transfer to diversify observation trajectories, and Bootstrap Advantage Estimation, which improves policy and value function learning through augmented data. These methods have demonstrated superior performance in standard benchmarks, outperforming existing data augmentation and policy optimization techniques. Additionally, this thesis introduces Robust Policy Optimization, a method that enhances exploration in policy gradient-based RL by perturbing action distributions. This method addresses the limitations of traditional methods, such as entropy collapse and primacy bias, resulting in improved sample efficiency and adaptability in continuous action spaces. The thesis further explores the potential of natural language descriptions as an alternative to image-based state representations in RL. This approach enhances interpretability and generalization in tasks involving complex visual observations by leveraging large language models. Furthermore, this work contributes to the field of semi-autonomous teleoperated robotic surgery by developing systems capable of performing complex surgical tasks remotely, even under challenging conditions such as communication delays and data scarcity.
The creation of the DESK dataset supports knowledge transfer across different robotic platforms, further enhancing the capabilities of these systems. Overall, the advancements presented in this thesis represent significant steps toward developing more robust, adaptable, and efficient autonomous agents. These contributions have broad implications for various real-world applications, including autonomous systems, robotics, and safety-critical tasks such as medical surgery.
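The abstract says only that Robust Policy Optimization perturbs action distributions; it does not say how. As a purely hypothetical illustration of the idea, the sketch below shifts the mean of a one-dimensional Gaussian policy by uniform noise before sampling. The function name, the choice of uniform noise, and the `alpha` parameter are assumptions made for this example and are not taken from the thesis.

```python
import random


def sample_perturbed_action(mean, std, alpha=0.5, rng=random):
    """Sample an action from a Gaussian whose mean is shifted by uniform noise.

    Hypothetical sketch: `alpha` bounds the shift applied to the policy mean.
    With alpha = 0 this reduces to ordinary Gaussian action sampling.
    """
    shift = rng.uniform(-alpha, alpha)       # assumed perturbation of the mean
    perturbed_mean = mean + shift
    action = rng.gauss(perturbed_mean, std)  # draw the action around the shifted mean
    return action, perturbed_mean
```

In this sketch the perturbation keeps the sampled actions spread out even if the learned standard deviation shrinks, which is one plausible way to read the abstract's claim about avoiding entropy collapse.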
|
584 |
Experimental and analytical investigation of reinforced concrete bridge pier caps with an externally bonded stainless steel system
Kim, Sung Hu 07 January 2016
This research is aimed at examining experimentally and analytically the behavior of reinforced concrete bridge pier caps strengthened with externally bonded reinforcement. In the experimental study, nine full-scale reinforced concrete bridge pier caps were built, externally strengthened with stainless steel reinforcement, and tested to failure. Load, deflection, and strain measurements were collected and two potential failure mechanisms were identified. In the analytical part of this work, mechanics-based equations were developed for calculating the shear strength of these types of structural elements when a diagonal shear crack is formed under loading. In addition, a combined strut-and-tie/truss model is proposed for determining the strength of reinforced concrete bridge caps with externally bonded reinforcement. Results from both experimental and analytical studies were compared and design recommendations are made for future adoption in bridge and building codes and specifications.
|
585 |
A comparison of omission training with constant or changing reinforcers vs. extinction: response reduction and recovery
Vatterott, Madeleine Kay. January 1984
Call number: LD2668 .T4 1984 V37 / Master of Science
|
586 |
Mekaniska beräkningar av armeringstråd vid förläggning på högspänningskablar / Mechanical calculations of reinforcing wire upon the application on high voltage cables
Nilsson, Philip January 2014
This thesis was carried out at ABB High Voltage Cables in Karlskrona and focuses on their reinforcement process (AR50), which reinforces the cable by applying reinforcement wires. The work is strictly limited to the short period during which the wire is applied to the cable, and it investigates stress differences in a single reinforcing wire as a function of cable and wire dimensions and of the brake forces used in production. The study follows a model and theory development research process combined with a testing process to obtain the results. Together with a computational model, the study aims to increase and deepen ABB's knowledge of the reinforcing process used to strengthen and protect all of ABB's high voltage cables. The model is developed in the FEA (Finite Element Analysis) program ABAQUS as a dynamic explicit model. This report explains how the calculation model was built and which parameters were used. The results indicate that the brake force used in the AR50 reinforcement process does not need to be controlled with high precision, as long as it is large enough to keep the reinforcement wire taut during application. The study also shows that different cable and wire dimensions do not significantly affect the stress levels during the reinforcement process, and that the nipple used to press the reinforcing wire down onto the cable is the main factor determining the stress distribution in the reinforcement wire. / This thesis is kept confidential.
|
587 |
THE EFFECTS OF COUNSELING AND VERBAL REINFORCEMENT ON THE INTERNAL-EXTERNAL CONTROL OF THE DISABLED
Coven, Arnold Barrett, 1929- January 1970
No description available.
|
588 |
Reinforcement-learning-based autonomous vehicle navigation in a dynamically changing environment
Ngai, Chi-kit., 魏智傑. January 2007
Electrical and Electronic Engineering / Doctoral / Doctor of Philosophy
|
589 |
Reducing top mat reinforcement in bridge decks
Foster, Stephen Wroe, 1986- 21 October 2010
The Texas Department of Transportation (TxDOT) uses precast, prestressed concrete panels (PCPs) as stay-in-place formwork for most bridges built in Texas. The PCPs are placed on the top flanges of adjacent girders and topped with a 4-in. cast-in-place (CIP) slab. This thesis is directed towards identifying and quantifying the serviceability implications of reducing the deck reinforcement across the interior spans of CIP-PCP decks. The goal of this research is to understand how the PCPs influence cracking and crack control in the CIP slab and to make recommendations to optimize the top mat reinforcement accordingly.
Several tests were conducted to evaluate how well different top mat reinforcement arrangements control crack widths across PCP joints. The longitudinal reinforcement was tested using a constant bending moment test, a point load test, and several direct tension tests. Because of difficulty with the CIP-PCP interface during the longitudinal tests, direct tension tests of the CIP slab only were used to compare the transverse reinforcement alternatives. Prior to testing, various top mat design alternatives were evaluated through pre-test calculations for crack widths. Standard reinforcing bars and welded wire reinforcement were considered for the design alternatives.
During this study, it was found that the tensile strength of the CIP slab is critical to controlling transverse crack widths. The CIP-PCP interface is difficult to simulate in the laboratory because of inherent eccentricities that result from the test specimen geometry and loading conditions. Furthermore, the constraint and boundary conditions of CIP-PCP bridge decks are difficult to simulate in the laboratory. Based on the results of this testing program, it seems imprudent to reduce the longitudinal reinforcement across the interior spans of CIP-PCP decks. The transverse reinforcement, however, may be reduced using welded wire reinforcement across the interior spans of CIP-PCP decks without compromising longitudinal crack width control. A reduced standard reinforcing bar option may also be considered, but a slight increase in longitudinal crack widths should be expected.
|
590 |
APPLICATION OF SWARM AND REINFORCEMENT LEARNING TECHNIQUES TO REQUIREMENTS TRACING
Sultanov, Hakim 01 January 2013
Today, software has become deeply woven into the fabric of our lives. The quality of the software we depend on needs to be ensured at every phase of the Software Development Life Cycle (SDLC). An analyst uses the requirements engineering process to gather and analyze system requirements in the early stages of the SDLC. An undetected problem at the beginning of the project can carry all the way through to the deployed product.
The Requirements Traceability Matrix (RTM) serves as a tool to demonstrate how requirements are addressed by the design and implementation elements throughout the entire software development lifecycle. Creating an RTM by hand is an arduous and error-prone task.
As the size of the requirements and design document collection grows, it becomes more challenging to ensure proper coverage of the requirements by the design elements, i.e., assure that every requirement is addressed by at least one design element. The techniques used by the existing requirements tracing tools take into account only the content of the documents to establish possible links. We expect that if we take into account the relative order of the text around the common terms within the inspected documents, we may discover candidate links with a higher accuracy.
The aim of this research is to demonstrate how we can apply machine learning algorithms to software requirements engineering problems. This work addresses the problem of requirements tracing by viewing it in light of the Ant Colony Optimization (ACO) algorithm and a reinforcement learning algorithm. By treating the documents as the starting (nest) and ending points (sugar piles) of a path and the terms used in the documents as connecting nodes, a possible link can be established and strengthened by attracting more agents (ants) onto a path between the two documents by using pheromone deposits. The results of the work show that ACO and RL can successfully establish links between two sets of documents.
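The ant-colony framing described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's implementation: documents are reduced to sets of terms, each ant takes a single requirement → shared term → design document path, and the function name `trace_links`, the selection rule, and the parameters `n_ants` and `evaporation` are all hypothetical.

```python
import random
from collections import defaultdict


def trace_links(requirements, designs, n_ants=200, evaporation=0.1, seed=0):
    """Rank candidate requirement -> design links by accumulated pheromone.

    requirements, designs: dicts mapping a document id to its set of terms.
    Returns {(req_id, design_id): pheromone} for every pair an ant connected.
    """
    rng = random.Random(seed)

    # Index each term to the design documents ("sugar piles") that contain it.
    term_to_designs = defaultdict(list)
    for design_id, terms in sorted(designs.items()):
        for term in sorted(terms):
            term_to_designs[term].append(design_id)

    pheromone = {}  # keyed by (req_id, term, design_id) path
    for _ in range(n_ants):
        for req_id, terms in sorted(requirements.items()):
            shared = [t for t in sorted(terms) if t in term_to_designs]
            if not shared:
                continue  # no connecting term: the ant cannot leave the nest
            paths = [(req_id, t, d) for t in shared for d in term_to_designs[t]]
            # Paths with more pheromone attract more ants (assumed rule).
            weights = [1.0 + pheromone.get(p, 0.0) for p in paths]
            chosen = rng.choices(paths, weights=weights, k=1)[0]
            pheromone[chosen] = pheromone.get(chosen, 0.0) + 1.0
        # Evaporation weakens paths that ants stop reinforcing.
        for path in pheromone:
            pheromone[path] *= 1.0 - evaporation

    # Sum pheromone over all terms connecting the same pair of documents.
    links = defaultdict(float)
    for (req_id, _term, design_id), level in pheromone.items():
        links[(req_id, design_id)] += level
    return dict(links)
```

Calling `trace_links` with two small dictionaries of term sets returns a score per requirement/design pair; pairs that share no term never appear, and pairs connected through several terms tend to accumulate more pheromone, which mirrors the link-strengthening effect the abstract describes.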
|