161 |
The effects of reinforcement upon the prepecking behaviors of pigeons in the autoshaping experiment. Wessells, Michael G. 01 January 1974 (has links) (PDF)
No description available.
|
162 |
Vicarious verbal conditioning as a function of an observer's expectancy regarding the friendliness of the reinforcing agent. Jorgensen, Bruce W. 01 January 1971 (has links) (PDF)
The fact" that behavior can be conditioned through the use of verbal reinforcement is well documented (c.f. Kanf er, 1968 ; Flanders , 1968 ) • Specific critical responses of a subject, reinforced by praise or utterance of the word "good, " tend to increase in frequency, in this type of conditioning.
|
163 |
The effects of reinforcement on the development of inhibitory stimulus control. Collins, Jeremiah P. 01 January 1971 (has links) (PDF)
The pecking response of pigeons was reinforced when directed at a key transilluminated by chromatic light. Subjects were assigned to one of three sequences of S+ and S- stimuli and to one of two multiple schedules of reinforcement. The AE Group (acquisition - extinction) received daily a block of nine S+ periods followed immediately by a block of nine S- periods. The EA Group received the reverse order of stimuli, while a control group (R Group) received a random sequence of stimuli. Each of the above sequences of stimulus presentation was in effect in conjunction with either a Mult VI-1 min - Ext schedule or a Mult VI-1 min - VI-4 min schedule of reinforcement. A total of six groups were thus obtained.
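For readers unfamiliar with the variable-interval (VI) components used in the multiple schedules above, the minimal Python sketch below simulates a single VI component. The function name, session length, and peck rate are illustrative assumptions, not parameters taken from the thesis.

```python
import random

def simulate_vi_component(mean_interval_s=60.0, session_s=540.0, peck_rate_hz=1.0):
    """Illustrative simulation of one variable-interval (VI) component.

    A reinforcer is "set up" after a random interval (exponential with the
    given mean); the first peck after setup is reinforced. Returns the number
    of reinforcers earned during the component.
    """
    t = 0.0
    reinforcers = 0
    next_setup = random.expovariate(1.0 / mean_interval_s)
    while t < session_s:
        t += random.expovariate(peck_rate_hz)   # waiting time until the next peck
        if t >= next_setup:                     # a reinforcer was set up before this peck
            reinforcers += 1
            next_setup = t + random.expovariate(1.0 / mean_interval_s)
    return reinforcers

# Contrast a rich VI 1-min component with a lean VI 4-min component:
print(simulate_vi_component(mean_interval_s=60))    # typically ~8-9 reinforcers in 540 s
print(simulate_vi_component(mean_interval_s=240))   # typically ~2 reinforcers in 540 s
```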
|
164 |
Response-reinforcer manipulations in the analysis of blocking. Stickney, Kenneth John 01 January 1980 (has links) (PDF)
The effect of prior experience on a current learning task is of the utmost relevance to theories of learning. In special cases, prior experience may even prevent subsequent learning. These cases can generally be thought of as instances of the blocking phenomenon (e.g., Kamin, 1969; Wagner, Logan, Haberlandt, and Price, 1968). Specifically, if a stimulus A is paired with a reinforcer so that a response comes under the control of that stimulus, then when a novel stimulus B is simultaneously compounded with A and also paired with the reinforcer, stimulus B does not acquire control over the response, as measured by presentation of B alone in extinction. Stimulus A is said to "block" the establishment of stimulus control by B.
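The blocking effect described above is commonly formalized with the Rescorla-Wagner model. The short Python sketch below is an illustration of that formalization, not part of Stickney's dissertation; it shows why pretraining on stimulus A leaves little prediction error to support learning about the added stimulus B.

```python
def rescorla_wagner_blocking(alpha=0.3, beta=1.0, lam=1.0,
                             pretrain_trials=50, compound_trials=50):
    """Illustrative Rescorla-Wagner simulation of blocking (Kamin, 1969).

    Phase 1: stimulus A alone is paired with the reinforcer.
    Phase 2: the AB compound is paired with the reinforcer.
    Because V_A already predicts the reinforcer after Phase 1, the prediction
    error (lam - V_A - V_B) is near zero in Phase 2, so V_B stays small.
    """
    v_a, v_b = 0.0, 0.0
    for _ in range(pretrain_trials):          # Phase 1: A -> reinforcer
        v_a += alpha * beta * (lam - v_a)
    for _ in range(compound_trials):          # Phase 2: AB -> reinforcer
        error = lam - (v_a + v_b)
        v_a += alpha * beta * error
        v_b += alpha * beta * error
    return v_a, v_b

v_a, v_b = rescorla_wagner_blocking()
print(f"V_A = {v_a:.3f}, V_B = {v_b:.3f}")    # V_B remains near zero: blocking
```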
|
165 |
Reinforcement of Hydrogels by Nanofiber Network. Guo, Yuanhao 29 May 2013 (has links)
No description available.
|
166 |
In Situ Formation of Zinc Dimethacrylate for the Reinforcement of Ethylene-Propylene-Diene Rubber (EPDM). Young, Justin W. 03 August 2005 (has links)
No description available.
|
167 |
Automating Network Operation Centers using Reinforcement Learning. Altamimi, Sadi 18 May 2023 (has links)
Reinforcement learning (RL) has been at the core of recent advances in fulfilling the AI promise towards general intelligence. Unlike other machine learning (ML) paradigms, such as supervised learning (SL), which learn to mimic how humans act, RL tries to mimic how humans learn, and in many tasks it has managed to discover new strategies and achieve super-human performance. This is possible mainly because RL algorithms are allowed to interact with the world to collect the data they need for training by themselves. This is not possible in SL, where the ML model is limited to a dataset collected by humans, which can be biased towards sub-optimal solutions.

The downside of RL is its high cost when trained on real systems. This high cost stems from the fact that the actions taken by an RL model during the initial phase of training are merely random. To overcome this issue, it is common to train RL models using simulators before deploying them in production. However, designing a realistic simulator that faithfully resembles the real environment is not easy at all. Furthermore, simulator-based approaches do not utilize the sheer amount of field data available at their disposal.

This work investigates new ways to bridge the gap between SL and RL through an offline pre-training phase. The idea is to utilize the field data to pre-train RL models in an offline setting (similar to SL), and then allow them to safely explore and improve their performance beyond human level. The proposed training pipeline includes: (i) a process to convert static datasets into an RL environment, (ii) an MDP-aware data augmentation process for the offline dataset, and (iii) a pre-training step that improves the RL exploration phase. We show how to apply this approach to design an action recommendation engine (ARE) that automates network operation centers (NOC), a task that is still tackled by teams of network professionals using hand-crafted rules. Our RL algorithm learns to maximize the Quality of Experience (QoE) of NOC users and minimize the operational costs (OPEX) compared to traditional algorithms. Furthermore, our algorithm is scalable and can be used to control large-scale networks of arbitrary size.
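The three-stage pipeline described in this abstract can be illustrated with a minimal Python sketch. All names here (StaticDatasetEnv, augment_transitions, CountingPolicy, pretrain_policy) and the toy augmentation and pre-training logic are illustrative assumptions, not the thesis's actual implementation.

```python
import random
from collections import defaultdict

class StaticDatasetEnv:
    """(i) Wrap a static log of (state, action, reward, next_state) tuples so
    an agent can replay them through a reset()/step() interface."""
    def __init__(self, transitions):
        self.transitions = transitions
        self._idx = 0

    def reset(self):
        self._idx = 0
        return self.transitions[0][0]

    def step(self, _action):
        state, logged_action, reward, next_state = self.transitions[self._idx]
        self._idx += 1
        done = self._idx >= len(self.transitions)
        return next_state, reward, done, {"logged_action": logged_action}

def augment_transitions(transitions, noise=0.01):
    """(ii) Stand-in for MDP-aware augmentation: jitter the state features."""
    return [([x + random.gauss(0, noise) for x in s], a, r,
             [x + random.gauss(0, noise) for x in ns])
            for s, a, r, ns in transitions]

class CountingPolicy:
    """Toy policy that memorizes the most frequent logged action per state."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, state, action):
        self.counts[tuple(state)][action] += 1

    def act(self, state):
        options = self.counts.get(tuple(state))
        return max(options, key=options.get) if options else 0

def pretrain_policy(policy, transitions, epochs=10):
    """(iii) Offline pre-training: fit the policy to the logged actions
    (behaviour-cloning style) before any online exploration is allowed."""
    for _ in range(epochs):
        for state, action, _reward, _next_state in transitions:
            policy.update(state, action)
    return policy

# Toy usage with two logged transitions; states are 1-D feature vectors.
log = [([0.0], 1, 1.0, [1.0]), ([1.0], 0, 0.0, [0.0])]
env = StaticDatasetEnv(log)
policy = pretrain_policy(CountingPolicy(), log + augment_transitions(log))
print(policy.act(env.reset()))   # reproduces the logged behaviour: prints 1
```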
|
168 |
Percentage of Reinforcement as a Factor in the Measurement of Conditioned Fear. Hollander, Melvyn A. January 1965 (has links)
No description available.
|
169 |
Choice Between Stimuli Associated with Different Histories of Reinforcement when Presented Separately. vom Saal, Walter 10 1900 (has links)
<p> Four experiments are reported in which choice behavior was examined in the pigeon. An apparatus was used in which each of two adjacent keys could be lit with either a red or a green dot. In each experiment, subjects received trials of fixed length with one color at a time
(single-stimulus training) before receiving trials with both keys lit together (choice test). On choice tests in Exp. 1, 2, and 3, birds pecked at a higher rate to the stimulus in which a higher number of reinforcements had been received per unit time with the stimulus present, but differences between stimuli in reinforcements per session did not reliably affect choice behavior and differences in proportion of trials followed by reinforcement had only weak effects. In Exp. 4, a brief daily choice test was used to evaluate recency effects. It was found that several sessions of experience with only one color were often necessary to reliably shift choice to that color.</p> / Thesis / Doctor of Philosophy (PhD)
|
170 |
Reliable deep reinforcement learning: stable training and robust deployment. Queeney, James 30 August 2023 (has links)
Deep reinforcement learning (RL) represents a data-driven framework for sequential decision making that has demonstrated the ability to solve challenging control tasks. This data-driven, learning-based approach offers the potential to improve operations in complex systems, but only if it can be trusted to produce reliable performance both during training and upon deployment. These requirements have hindered the adoption of deep RL in many real-world applications. In order to overcome the limitations of existing methods, this dissertation introduces reliable deep RL algorithms that deliver (i) stable training from limited data and (ii) robust, safe deployment in the presence of uncertainty.
The first part of the dissertation addresses the interactive nature of deep RL, where learning requires data collection from the environment. This interactive process can be expensive, time-consuming, and dangerous in many real-world settings, which motivates the need for reliable and efficient learning. We develop deep RL algorithms that guarantee stable performance throughout training, while also directly considering data efficiency in their design. These algorithms are supported by novel policy improvement lower bounds that account for finite-sample estimation error and sample reuse.
The second part of the dissertation focuses on the uncertainty present in real-world applications, which can impact the performance and safety of learned control policies. In order to reliably deploy deep RL in the presence of uncertainty, we introduce frameworks that incorporate safety constraints and provide robustness to general disturbances in the environment. Importantly, these frameworks make limited assumptions on the training process, and can be implemented in settings that require real-world interaction for training. This motivates deep RL algorithms that deliver robust, safe performance at deployment time, while only using standard data collection from a single training environment.
Overall, this dissertation contributes new techniques to overcome key limitations of deep RL for real-world decision making and control. Experiments across a variety of continuous control tasks demonstrate the effectiveness of our algorithms.
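As a point of reference for the policy improvement lower bounds mentioned in this abstract, one widely used conservative update of this kind is the clipped surrogate objective. The NumPy sketch below is a generic PPO-style illustration under that assumption, not the dissertation's specific algorithm.

```python
import numpy as np

def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Conservative (clipped) policy objective in the spirit of a policy
    improvement lower bound: the update gets no credit for pushing the
    probability ratio beyond 1 +/- epsilon, which discourages large,
    potentially destabilizing policy changes."""
    unclipped = ratios * advantages
    clipped = np.clip(ratios, 1.0 - epsilon, 1.0 + epsilon) * advantages
    return np.minimum(unclipped, clipped).mean()

# Toy example: ratios of new to old action probabilities and their advantages.
ratios = np.array([0.9, 1.1, 1.6])
advantages = np.array([1.0, -0.5, 2.0])
print(clipped_surrogate(ratios, advantages))
```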
|