Global ETD Search

201	FAST(ER) DATA GENERATION FOR OFFLINE RL AND FPS ENVIRONMENTS FOR DECISION TRANSFORMERS Mark R Trovinger (17549493) 06 December 2023 (has links) Reinforcement learning algorithms have traditionally been implemented with the goal of maximizing a reward signal. By contrast, Decision Transformer (DT) uses a transformer model to predict the next action in a sequence. The transformer model is trained on datasets consisting of state, action, return trajectories. The original DT paper examined a small number of environments: five from the Atari domain, three from continuous control, and one that examined credit assignment. While this gives an idea of what the decision transformer can do, the variety of environments in the Atari domain is limited. In this work, we propose an extension of the environments that decision transformer can be trained on by adding support for the VizDoom environment. We also developed a faster method for offline RL dataset generation, using Sample Factory, a library focused on high throughput, to generate a dataset comparable in quality to existing methods in significantly less time. Reinforcement learning decision transformer reinforcement learning vizdoom offline reinforce
202	An Evaluation of Cross-Function Stimuli in the Treatment of Automatically Maintained Problem Behavior Huang, Po-Kai 12 1900 (has links) Noncontingent reinforcement (NCR) is a possible alternative to differential reinforcement of other behaviors (DRO) that may operate through a similar mechanism. In this research, the participant's problem behaviors were maintained by automatic reinforcement or were multiply maintained. NCR was used as the intervention for a participant for whom sensory integration therapy (SIT) had produced no clinical effect in reducing problem behaviors in a previous study. The results showed that NCR is an effective way to decrease problem behaviors without an extinction burst. Noncontingent reinforcement (NCR) schedule thinning automatic reinforcement Psychology, Behavioral
203	The role of generalization decrement in failures to demonstrate conditioned reinforcement: temporal factors Astley, Suzette Lynn. January 1978 (has links) Call number: LD2668 .T4 1978 A84 / Master of Science Reinforcement (Psychology) Pigeons. Masters theses
204	The effects of prior foot shock upon bar pressing for intracranial stimulation MacDougall, James Morris. January 1966 (has links) Call number: LD2668 .T4 1966 M138 / Master of Science Reinforcement (Psychology) Psychophysiology. Masters theses
205	Physics-based reinforcement learning for autonomous manipulation Scholz, Jonathan 07 January 2016 (has links) With recent research advances, the dream of bringing domestic robots into our everyday lives has become more plausible than ever. Domestic robotics has grown dramatically in the past decade, with applications ranging from house cleaning to food service to health care. To date, the majority of the planning and control machinery for these systems is carefully designed by human engineers. A large portion of this effort goes into selecting the appropriate models and control techniques for each application, and these skills take years to master. Relieving the burden on human experts is therefore a central challenge for bringing robot technology to the masses. This work addresses this challenge by introducing a physics engine as a model space for an autonomous robot, and defining procedures for enabling robots to decide when and how to learn these models. We also present an appropriate space of motor controllers for these models, and introduce ways to intelligently select when to use each controller based on the estimated model parameters. We integrate these components into a framework called Physics-Based Reinforcement Learning, which features a stochastic physics engine as the core model structure. Together these methods enable a robot to adapt to unfamiliar environments without human intervention. The central focus of this thesis is on fast online model learning for objects with under-specified dynamics. We develop our approach across a diverse range of domestic tasks, starting with a simple table-top manipulation task, followed by a mobile manipulation task involving a single utility cart, and finally an open-ended navigation task with multiple obstacles impeding robot progress. We also present simulation results illustrating the efficiency of our method compared to existing approaches in the learning literature. Machine learning Robotics Reinforcement learning
206	Modelling of chlorine and moisture transport in concrete McLoughlin, Ian Michael January 1997 (has links) No description available. 620.11223 Reinforcement corrosion; Movement; Water
207	Adhesion of steel tyre cord to rubber Niderost, Kevin John January 1991 (has links) No description available. 670 Bonding system; Brass; Reinforcement
208	The Relative Effectiveness of Parental Positive Reinforcement and Punishment in Reducing Oppositional Behavior in Children and in Increasing the Frequency of Parent-Child Interaction Detrich, Ronnie 12 1900 (has links) The purpose of this study was to determine the relative effectiveness of the reinforcement and punishment techniques in the natural environment, and the effect of their use upon the social interaction between parent and child. It was hypothesized that punishment would be more effective than reward in controlling oppositional behavior, but that reinforcement would be more effective in increasing child-initiated interaction with the parents. punishment reinforcement children parents behavior
209	A comparative study of micro & nanocarbon reinforced synthetic rubber composites Maifadi, James 01 September 2014 (has links) A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science. Johannesburg, 2014. / This study concentrated on two main thrusts: 1) the optimal synthesis and characterisation of various micro- and nanosized carbon materials and 2) a comparative investigation of the capabilities of these carbonaceous materials to reinforce a locally available styrene butadiene rubber (SBR), which was commonly used to make car tyres. In the former case, a range of carbon materials including nitrogen doped/undoped carbon nanotubes as well as carbon microspheres (CMSs) were successfully synthesized by two different techniques (i.e. chemical vapour deposition (CVD) and hydrothermal synthesis). These were then fully characterised by numerous techniques which included: TEM, TGA, FTIR, PXRD, laser Raman spectroscopy, Zeta potential measurements and BET surface area analysis. In the latter case, these micro and nanocarbon materials were systematically added to SBR at various loadings (ranging from 0.125–0.500% (m/m)). Here the tensile strengths of the resultant composites, loaded with these various micro and nanocarbon materials, were measured for comparison to establish which (if any) was the best reinforcement material for SBR. Results obtained from the tensile strength measurements of the variously loaded SBR composites showed that, irrespective of the method of synthesis (i.e. CVD or hydrothermal synthesis), carbon microspheres (undoped, doped, functionalised or unfunctionalised) performed more poorly as fillers than carbon nanotubes.
Furthermore, the results obtained from the various characterisation techniques mentioned previously indicated that the lower performance of these microspheres as fillers may have been due to their size, shape and low surface areas. By contrast, when the tensile strengths of SBR reinforced with either CNTs or NCNTs were measured, the former outperformed the latter as fillers. It was speculated, based upon the data obtained, that NCNTs were poorer fillers than CNTs due to their higher negative surface charges, larger diameters and lower crystallinity. Hence this study has shown that low loadings (i.e. 0.250 % (m/m)) of the correctly matched type of carbonaceous material can significantly enhance the tensile strength and Young’s modulus of a locally available styrene butadiene rubber. Nanostructured materials. Carbon. Rubber--Reinforcement.
210	Is Conditioned Reinforcement by Observation a Verbal Behavior Developmental Cusp? Lanter, Alexandria January 2018 (has links) In 2 studies, I tested the effects of an observational conditioning-by-denial intervention on the demonstration of conditioned reinforcement by observation, observational performance, and observational acquisition of new operants. In Experiment 1, I selected 6 children educationally classified with autism spectrum disorder and multiple disabilities. The participants were 2 females and 4 males who ranged from 5.5-8.2 years old. Participants were selected from one school that implemented a Comprehensive Application of Behavior Analysis to Schooling (CABAS®) approach. I conducted a series of pre-intervention reinforcer assessments that tested 1) the conditioned reinforcement effects of known reinforcing stimuli (edibles) and non-preferred stimuli (binder clips) on a mastered task, and 2) the reinforcement effects of non-preferred stimuli (binder clips) on 3 learning tasks across each participant. These reinforcer assessment probes showed all participants’ rates increased when a known reinforcer (edibles) was delivered compared to non-reinforcing stimuli (binder clips) on the mastered task. Participants did not demonstrate learning when delivered non-preferred stimuli (binder clips) for correct responses on learning tasks. Following the pre-intervention reinforcer assessments, I conducted probes for a) conditioned reinforcement by observation, b) observational performance, and c) observational acquisition of new operants. Pre-intervention probes showed that none of the participants demonstrated conditioned reinforcement by observation or observational acquisition of new operants, and 5 out of 6 participants did not demonstrate observational performance. The independent variable was an observational conditioning-by-denial intervention.
During the intervention the participant was paired with a known peer. The two children were separated by a partition, so that each could see and hear the researcher but not the other child. Each could, however, see the other’s transparent cup, which was attached with Velcro® to each child’s desk. Both participants were given a mastered task. Each time the peer emitted a response the experimenter delivered neutral stimuli (binder clips) into his/her transparent cup, in view of the participant. The intervention continued until the target participant vocally manded for (requested) the neutral stimuli and/or made a physical attempt to gain access to the stimuli one or more times across two consecutive sessions. Post-intervention data suggest that neutral stimuli (binder clips) became conditioned reinforcers for mastered and learning tasks as a function of the intervention for all 6 participants. Responses to denial of non-preferred stimuli delivered to a peer (conditioned reinforcement by observation), observational performance, and observational acquisition of new operant responses increased in 4 out of 6 participants who did not respond during pre-intervention probes. In Experiment 2, I sought to determine if conditioned reinforcement by observation is a verbal behavior developmental cusp. Experiment 2 was a replication of Experiment 1, with two different reinforcer assessments that tested: 1) the conditioned reinforcer effects of neutral stimuli when the participant was alone and 2) the conditioned reinforcer effects of neutral stimuli when the participant observed a peer play with neutral stimuli. Four males educationally classified with autism spectrum disorder and speech and language impairments participated in Experiment 2.
Post-intervention data suggest that neutral stimuli (metal washers, s-hooks, spoon shelf supports) became conditioned reinforcers during the individual and peer reinforcer assessments as a function of the intervention for all 4 participants. Responses to denial of non-preferred stimuli delivered to a peer (conditioned reinforcement by observation), observational performance, and observational acquisition of new operant responses increased across all 4 participants who did not respond during pre-intervention probes. The results of both experiments suggest that a single intervention can establish all three types of observational learning. The results from Experiment 2 confirm that conditioned reinforcement by observation is a verbal behavior developmental cusp. Psychology Reinforcement (Psychology) Behavioral assessment
