  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

Some Discriminative Functions of an Incidental Stimulus Adventitiously Associated with Reinforcement

Cuellar, Israel 05 1900
The present study was a systematic replication of a study by Morse and Skinner (1957). An attempt was also made to study some of the parameters involved in the sensory control of operant behavior. Morse and Skinner found that a stimulus present when a response is being reinforced may acquire discriminative control over the response even though its presence at reinforcement is adventitious.

The Effects of Verbal Reinforcement on the Behavior of Mild and Moderate Institutionalized Mentally Retarded Children

Wheat, Darryl G. 08 1900
The present study is an effort to investigate some of the pertinent implications, principles, and postulates revealed by learning theorists, specifically McCandless, using mentally retarded children as subjects.

The Effects of Increasing Rates of Reinforcement Through an Alternative Fluent Behavior on the Acquisition and Extinction of Behavior in Dogs

Coulter, Laura E. 08 1900
The purpose of the present study was to experimentally investigate the effects of interspersing the opportunity to perform a fluent behavior during the acquisition of a new behavior. The experimenter trained left and right paw movements in domestic canines using a multiple treatment design. One paw movement was trained with a typical shaping procedure while the other was trained with an opportunity to perform a fluent behavior, touching the dog’s nose to a plastic disc, following each successive approximation in the shaping procedure. Two extinction phases were implemented during the experiment. The results showed that higher rates of reinforcement were achieved primarily following changes in criteria for reinforcement for the behavior in acquisition. There were no effects on rate of acquisition of the behavior, but adding an alternative fluent behavior may have slowed the differentiation between the reinforced behavior and alternative behaviors for one dog. The behavior trained with the addition of an alternative fluent behavior extinguished more quickly than in the control condition and extinguished at similar rates to the opposite leg movement. This suggests that the technique of offering an alternative fluent behavior may facilitate the chaining of the opposite behavior with the behavior targeted for reinforcement.

Free Time as a Positive Reinforcer in the Management of Study Behavior in an Aversive Educational Environment

Morriss, Stephen H. 08 1900
The purpose of this study was to test the effectiveness of the use of free time as a positive reinforcer in the management of study behavior in an aversive educational environment. It was hypothesized that the presentation of free time contingent upon completion of the study assignment would result in maintained study behavior and reduced student absenteeism.

The use of apprenticeship learning via inverse reinforcement learning for musical composition

Messer, Orry 04 February 2015
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of requirements for the degree of Master of Science. 14 August 2014. / Reinforcement learning is a branch of machine learning wherein an agent is given rewards which it uses to guide its learning. These rewards are often manually specified in terms of a reward function. The agent performs a certain action in a particular state and is given a reward accordingly (where a state is a configuration of the environment). A problem arises when the reward function is either difficult or impossible to manually specify. Apprenticeship learning via inverse reinforcement learning can be used in these cases in order to ascertain a reward function, given a set of expert trajectories. The research presented in this document used apprenticeship learning in order to ascertain a reward function in a musical context. The agent then optimized its performance in terms of this reward function. This was accomplished by presenting the learning agents with pieces of music composed by the author. These were the expert trajectories from which the learning agent discovered a reward function. This reward function allowed the agents to attempt to discover an optimal strategy for maximizing its value. Three learning agents were created: two drum-beat generating agents and one melody composing agent. The first two agents were used to recreate expert drum-beats as well as generate new drum-beats. The melody agent was used to generate new melodies given a set of expert melodies. The results show that apprenticeship learning can be used both to recreate expert musical pieces and to generate new musical pieces which are similar to the expert musical pieces. Further, the results using the melody agent indicate that the agent has learned to generate new melodies in a given key, without having been given explicit information about key signatures.

Crowd behavioural simulation via multi-agent reinforcement learning

Lim, Sheng Yan January 2016
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of requirements for the degree of Master of Science. Johannesburg, 2015. / Crowd simulation can be thought of as a group of entities interacting with one another. Traditionally, an animated entity would require precise scripts so that it can function in a virtual environment autonomously. Previous crowd simulation methods have been used in real-world applications, but these methods are not learning agents and are therefore unable to adapt and change their behaviours. The state-of-the-art crowd simulation methods include flow-based, particle-based and strategy-based models. A reinforcement learning agent could learn how to navigate, behave and interact in an environment without explicit design. A group of reinforcement learning agents should then be able to act in a way that simulates a crowd. This thesis investigates the believability of crowd behavioural simulation via three multi-agent reinforcement learning methods. The methods are Q-learning in a multi-agent Markov decision process model, joint state-action Q-learning, and a joint state value iteration algorithm. The three learning methods are able to produce believable and realistic crowd behaviours.

The generalization of the effects of experimental teacher training

Brown, Carol Wegley January 2011
Digitized by Kansas Correctional Industries

The effects of a change in reward probability on preference following autoshaping with two-signal sequences : an extension of the Egger and Miller information hypothesis

Troutman, Charles Michael January 2011
Digitized by Kansas Correctional Industries

Adaptive value function approximation in reinforcement learning using wavelets

Mitchley, Michael January 2016
A thesis submitted to the Faculty of Science, School of Computational and Applied Mathematics, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy. Johannesburg, South Africa, July 2015. / Reinforcement learning agents solve tasks by finding policies that maximise their reward over time. The policy can be found from the value function, which represents the value of each state-action pair. In continuous state spaces, the value function must be approximated. Often, this is done using a fixed linear combination of functions across all dimensions. We introduce and demonstrate the wavelet basis for reinforcement learning, a basis function scheme competitive against state-of-the-art fixed bases. We extend two online adaptive tiling schemes to wavelet functions and show their performance improvement across standard domains. Finally, we introduce the Multiscale Adaptive Wavelet Basis (MAWB), a wavelet-based adaptive basis scheme which is dimensionally scalable and insensitive to the initial level of detail. This scheme adaptively grows the basis function set by combining across dimensions, or splitting within a dimension, those candidate functions which have a high estimated projection onto the Bellman error. A number of novel measures are used to find this estimate.

Performance of spatial alternation under conditions of food approach vs. shock escape in the normal and anterior decorticate rat

Holland, Elizabeth J. January 2010
Digitized by Kansas Correctional Industries
