  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Effects of Reinforcement History on Stimulus Control Relations

Reyes, Fredy 12 1900 (has links)
Ray (1969) conducted an experiment on multiple stimulus-response relations and selective attention; his results suggested that stimulus-response relations function as behavioral units. McIlvane and Dube (1996) noted that if stimulus-response relations are behavioral units, then environmental variables should affect stimulus-response relations in the same way they affect single response topographies. This experiment analyzed the effects of reinforcement history on the probability of stimulus-response relations with differing reinforcement histories. In separate conditions, random-ratio schedules of reinforcement were contingent on each of four discriminated responses. To assess the effects of this reinforcement history, the stimuli controlling the different topographies were then presented concurrently, in composite form, during test conditions. The results show that reinforcement history affects the probability of each response topography and that the association between response topographies and their controlling stimuli tends to remain constant across variations in reinforcement probability.
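As a rough illustration of the contingency described above rather than the thesis's actual procedure, the sketch below simulates two response topographies maintained on independent random-ratio schedules; the response names, reinforcement probabilities, and the random choice of which response is emitted are all hypothetical.

```python
import random

def random_ratio(p):
    """Under a random-ratio schedule each response is reinforced with
    probability p, independently of previous responses."""
    return random.random() < p

# Hypothetical reinforcement probabilities for two discriminated responses;
# the experiment itself arranged a separate random-ratio schedule for each
# of four discriminated responses.
schedules = {"press_left": 0.10, "press_right": 0.30}
counts = {name: {"responses": 0, "reinforcers": 0} for name in schedules}

random.seed(0)
for _ in range(1000):
    # Which response is emitted; in the experiment this depends on the
    # controlling stimulus that is present, here it is simply random.
    name = random.choice(list(schedules))
    counts[name]["responses"] += 1
    if random_ratio(schedules[name]):
        counts[name]["reinforcers"] += 1

for name, c in counts.items():
    rate = c["reinforcers"] / c["responses"]
    print(f"{name}: obtained reinforcement rate {rate:.2f}")
```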
2

Reinforcement learning with parameterized actions

Masson, Warwick Anthony January 2016 (has links)
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science. Johannesburg, 2016. / In order to complete real-world tasks, autonomous robots require a mix of fine-grained control and high-level skills. A robot needs a wide range of skills to handle a variety of situations, but must also be able to adapt those skills to the specific situation at hand. Reinforcement learning is a machine learning paradigm for learning to solve tasks by interacting with an environment. Current methods in reinforcement learning focus on agents with either a fixed number of discrete actions or a continuous set of actions. We consider the problem of reinforcement learning with parameterized actions: discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. Representing actions in this way gives us the high-level skills provided by discrete actions and the adaptability provided by each action's parameters. We introduce the Q-PAMDP algorithm for model-free learning in parameterized action Markov decision processes. Q-PAMDP alternates between learning which discrete action to use in each state and learning which parameters to use with those actions. We show that under weak assumptions, Q-PAMDP converges to a local maximum. We compare Q-PAMDP with a direct policy search approach in the goal and Platform domains; Q-PAMDP outperforms direct policy search in both domains.
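To make the parameterized-action setting concrete, here is a minimal sketch (not the dissertation's Q-PAMDP implementation): a parameterized action pairs a discrete choice with continuous parameters, so action selection is a two-level decision. The action names, parameter values, and tabular Q-function are hypothetical.

```python
import random
from dataclasses import dataclass

@dataclass
class ParameterizedAction:
    """A discrete action together with the continuous parameters it needs."""
    name: str
    params: tuple

# Hypothetical discrete actions with per-action parameter policies; in
# Q-PAMDP the parameter-learning phase would adjust these values.
DISCRETE_ACTIONS = ["kick_to", "dash_to"]
param_policy = {"kick_to": (0.5, 0.5), "dash_to": (0.2,)}

# Tabular Q-values over (state, discrete action); the single state is a stand-in.
Q = {("s0", a): 0.0 for a in DISCRETE_ACTIONS}

def select_action(state, epsilon=0.1):
    """Epsilon-greedy choice of the discrete action, then attach the current
    parameters for that action: the two-level decision made at every step."""
    if random.random() < epsilon:
        name = random.choice(DISCRETE_ACTIONS)
    else:
        name = max(DISCRETE_ACTIONS, key=lambda a: Q[(state, a)])
    return ParameterizedAction(name, param_policy[name])

print(select_action("s0"))
```

The alternation the abstract describes would then repeat two phases: hold `param_policy` fixed while updating `Q` with a standard discrete-action method, then hold the greedy discrete choices fixed while improving `param_policy` with a policy-search step.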
3

Rullarmering - Ett rationellt sätt att armera (Rolled carpet reinforcement - a rational way to reinforce)

Bertilsson, Thomas, Ekstrand, Mattias January 2008 (has links)
This degree project was carried out at Halmstad University in Sweden, in association with Celsa Steel Service in Halmstad, on the subject of carpet reinforcement. The purpose of the report is to investigate how much time can be saved when carpet reinforcement is used instead of traditional reinforcement work, which parameters matter when choosing between the two, and how the product affects reinforcement workers from an ergonomic point of view. Carpet reinforcement is made of straight reinforcement bars welded to steel bands. The rolls are produced in a factory by a machine that welds the bars together at the specified centre distance and to the dimensions given in the structural engineers' drawings. The big advantage is that the work at the construction site becomes much easier; the number of reinforcement workers and the total duration of the project can also be reduced, all of which saves money. Soft parameters such as the working environment are also improved. The rolls are heavy and must be lifted with a construction crane, which spares the ironworkers' bodies; traditional reinforcement work involves a great deal of heavy lifting and strained working positions, so wear injuries are fairly common in the trade. We believe carpet reinforcement may reduce the number of wear injuries, and most of the workers we interviewed seem to like the product. Celsa Steel Service supplies the trade with two products. Bamtec is the older, more common, and more appreciated of the two, but because welding the reinforcement is prohibited in bridge construction, owing to the risk of fatigue fractures in welded zones, Bamtec cannot be used in bridges. For bridges a new product without welding, SpinMaster, has been developed. Carpet reinforcement works best on large, simple foundation slabs but can also be used in walls.
4

Autonomous inter-task transfer in reinforcement learning domains

Taylor, Matthew Edmund 07 September 2012 (has links)
Reinforcement learning (RL) methods have become popular in recent years because of their ability to solve complex tasks with minimal feedback. While these methods have had experimental successes and have been shown to exhibit some desirable properties in theory, the basic learning algorithms have often been found slow in practice. Therefore, much of the current RL research focuses on speeding up learning by taking advantage of domain knowledge, or by better utilizing agents' experience. The ambitious goal of transfer learning, when applied to RL tasks, is to accelerate learning on some target task after training on a different, but related, source task. This dissertation demonstrates that transfer learning methods can successfully improve learning in RL tasks via experience from previously learned tasks. Transfer learning can increase RL's applicability to difficult tasks by allowing agents to generalize their experience across learning problems. This dissertation presents inter-task mappings, the first transfer mechanism in this area to successfully enable transfer between tasks with different state variables and actions. Inter-task mappings have subsequently been used by a number of transfer researchers. A set of six transfer learning algorithms is then introduced. While these transfer methods differ in terms of what base RL algorithms they are compatible with, what type of knowledge they transfer, and what their strengths are, all utilize the same inter-task mapping mechanism. These transfer methods can all successfully use mappings constructed by a human from domain knowledge, but there may be situations in which domain knowledge is unavailable, or insufficient, to describe how two given tasks are related. We therefore also study how inter-task mappings can be learned autonomously by leveraging existing machine learning algorithms. Our methods use classification and regression techniques to successfully discover similarities between data gathered in pairs of tasks, culminating in what is currently one of the most robust mapping-learning algorithms for RL transfer. Combining transfer methods with these similarity-learning algorithms allows us to empirically demonstrate the plausibility of autonomous transfer. We fully implement these methods in four domains (each with different salient characteristics), show that transfer can significantly improve an agent's ability to learn in each domain, and explore the limits of transfer's applicability.
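A toy sketch of the inter-task-mapping idea, assuming both tasks are small and tabular (this is not the dissertation's algorithm, and all task, variable, and action names are hypothetical): a hand-written mapping from target-task state variables and actions to source-task ones is used to initialise the target agent's Q-values from a previously learned source Q-function.

```python
# Q-table learned on a hypothetical source task; a state is a frozenset of
# (variable, value) pairs so that variable names matter.
source_Q = {
    (frozenset({("x", 0)}), "left"): 1.0,
    (frozenset({("x", 1)}), "right"): 2.0,
}

# Hand-written inter-task mappings from target-task names to source-task names.
state_var_map = {"pos": "x"}
action_map = {"move_left": "left", "move_right": "right"}

def map_state(target_state):
    """Translate a target-task state (dict: variable -> value) into the
    source-task representation, dropping variables with no counterpart."""
    return frozenset((state_var_map[v], val)
                     for v, val in target_state.items()
                     if v in state_var_map)

def transferred_Q(target_state, target_action):
    """Initialise a target-task Q-value by looking it up through the mappings."""
    key = (map_state(target_state), action_map[target_action])
    return source_Q.get(key, 0.0)

print(transferred_Q({"pos": 1}, "move_right"))  # reuses the source estimate 2.0
```

The mapping-learning methods described above would, in effect, fill in `state_var_map` and `action_map` automatically rather than relying on a human.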
5

Analysis of a cooperative response

Vogler, Roger Ernest, 1938- January 1966 (has links)
No description available.
6

Response number under a fixed-interval schedule of reinforcement

Gentry, George David 08 1900 (has links)
No description available.
7

Complex schedules of reinforcement: response-reinforcer temporal relationships, the role of brief stimuli, and unit schedule performance as an operant in pigeons

Bradford, Linda DiAnne 08 1900 (has links)
No description available.
8

Two-key concurrent responding: choice and delays of reinforcement at constant relative immediacy of reinforcement

Gentry, George David 08 1900 (has links)
No description available.
9

Interdependencies in multiple fixed ratio reinforcement schedules

Sergio, Joseph P. January 1980 (has links)
Schuster (1959) and Keehn (1963), working with complex schedules of reinforcement, suggested that post-reinforcement pause (PRP) length was a function of the ratio requirements of the schedules' components. More specifically, Schuster indicated that a contrast effect may result from changes in schedule components from higher to lower requirements, or vice versa. The present study attempted to determine how PRP length and response rate are related to the component requirements of a multiple fixed-ratio schedule. Schedule interactions were examined when behavior was maintained on a given pair of schedules for a prolonged time, and when both schedules were studied at various values. In general, the results of the present study support the position that the component requirements of a multiple FR schedule do not interact, as measured by the PRP, in a way that produces a contrast effect. The study also indicates that an inverse relationship appears to exist between response rate and FR ratio size.
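As a rough sketch of the procedure's vocabulary rather than of the study itself, the code below simulates a two-component multiple fixed-ratio schedule and records, per component, the pause before the first response (a stand-in for the PRP) and the time to reinforcement; the ratio sizes and the timing model, which deliberately makes behavior insensitive to ratio size, are hypothetical.

```python
import random

# Toy two-component multiple fixed-ratio (FR) schedule: the components
# alternate, and each delivers a reinforcer after a fixed number of responses.
components = [("FR 10", 10), ("FR 40", 40)]

def run_component(ratio, mean_irt=0.5):
    """Emit `ratio` responses with exponentially distributed inter-response
    times; return (pause before the first response, time to reinforcement)."""
    times = [random.expovariate(1.0 / mean_irt) for _ in range(ratio)]
    return times[0], sum(times)

random.seed(1)
for name, ratio in components * 3:  # alternate the two components for 3 cycles
    prp, total = run_component(ratio)
    print(f"{name}: post-reinforcement pause {prp:.2f}s, reinforcer after {total:.2f}s")
```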
