21 |
The effect of reinforcement of the negative discriminandum in differential conditioning / Donovick, Peter Joseph, January 1963 (has links)
Thesis (M.S.)--University of Wisconsin--Madison, 1963. / Typescript. eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references (leaves 29-30).
|
22 |
A comparison of two token reinforcement distribution procedures / Reardon, Mary Cathleen, January 1976 (has links)
Thesis (M.S.)--University of Wisconsin--Madison, 1976. / Typescript. eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references (leaves 47-50).
|
23 |
Response variability and partial reinforcement in a complex learning situation / Newberry, Benjamin H., January 1966 (has links)
Thesis (M.A.)--University of Wisconsin--Madison, 1966. / eContent provider-neutral record in process. Description based on print version record. Includes bibliographical references.
|
24 |
Model-based approximation methods for reinforcement learning / Wang, Xin. January 1900 (has links)
Thesis (Ph. D.)--Oregon State University, 2007. / Printout. Includes bibliographical references (leaves 187-198). Also available on the World Wide Web.
|
25 |
Distribution of trials and the partial reinforcement effect after minimal acquisition / Young, Brian Jeffrey, January 1970 (has links)
Both Frustration Theory and Sequential Effects Theory account with some plausibility for the Partial Reinforcement Effect (PRE) and its accompanying phenomena. Frustration Theory, however, predicts that a PRE should not occur after only minimal acquisition training, and on this point the experimental evidence is conflicting. Analogous studies employing extended acquisition suggest trial distribution as one factor, among others, that might affect the occurrence of the PRE. All previous studies showing evidence for a PRE after minimal acquisition employed intertrial intervals (ITIs) in acquisition and extinction of no more than 15 minutes. The question addressed by this study was therefore whether a PRE could be obtained after only minimal acquisition when trials were distributed. Accordingly, it was hypothesized that a PRE would occur after only five acquisition trials given in an 'RNRNR' sequence under a partial reinforcement (PRF) schedule, compared with performance following continuous reinforcement (CRF), irrespective of whether trials were highly distributed or highly massed.
A 2x2 factorial design was used in which four groups of Ss were trained under two conditions of reward schedule (PRF or CRF) and two conditions of ITI, in both acquisition and extinction. The method, apparatus and procedure of a previous study (McCain, 1969), which showed evidence for a PRE after only minimal acquisition, were replicated as far as possible, except for changes required by the design or on theoretical grounds.
Evidence was found for a PRE under both conditions of ITI in terms of differences in extinction trends, confirming the hypothesis tested. Trial distribution was found to affect the mean running speeds of the PRF and CRF groups: the PRF group showed greater resistance to extinction than the CRF group when the ITI was 30 sec., while running measures did not show this effect when the ITI was 24 hours. This lack of a significant difference appeared to be due mainly to a reduction in PRF running speeds at the 24-hour ITI rather than an increase in CRF running speeds. / Arts, Faculty of / Psychology, Department of / Graduate
|
26 |
Representation discovery using a fixed basis in reinforcement learning / Wookey, Dean Stephen, January 2016 (has links)
A thesis presented for the degree of Doctor of Philosophy, School of Computer Science and Applied Mathematics. University of the Witwatersrand, South Africa. 26 August 2016. / In the reinforcement learning paradigm, an agent learns by interacting with its environment. At each
state, the agent receives a numerical reward. Its goal is to maximise the discounted sum of future rewards.
One way to do this is by learning a value function: a function that maps states to the discounted
sum of future rewards. With an accurate value function and a model of the environment, the agent can
take the optimal action in each state. In practice, however, the value function is approximated, and
performance depends on the quality of the approximation. Linear function approximation is a commonly
used approximation scheme, where the value function is represented as a weighted sum of basis functions
or features. In continuous state environments, there are infinitely many such features to choose from,
introducing the new problem of feature selection. Existing algorithms such as OMP-TD are slow to
converge, scale poorly to high dimensional spaces, and have not been generalised to the online learning
case. We introduce heuristic methods for reducing the search space in high dimensions that significantly
reduce computational costs and also act as regularisers. We extend these methods and introduce feature
regularisation for incremental feature selection in the batch learning case, and show that introducing a
smoothness prior is effective with our SSOMP-TD and STOMP-TD algorithms. Finally, we generalise
OMP-TD and our algorithms to the online case and evaluate them empirically. / LG2017
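The abstract above describes linear value-function approximation, where V(s) is a weighted sum of fixed basis functions. A minimal sketch of that scheme, using plain TD(0) rather than the thesis's OMP-TD variants (the chain environment, RBF centers, and step sizes here are illustrative assumptions, not taken from the thesis):

```python
import numpy as np

def gaussian_features(s, centers, width=0.15):
    """Fixed radial-basis features over a continuous state s in [0, 1]."""
    return np.exp(-((s - centers) ** 2) / (2 * width ** 2))

def td0_linear(num_episodes=200, gamma=0.95, alpha=0.1, seed=0):
    """TD(0) with linear function approximation: V(s) ~ w . phi(s)."""
    rng = np.random.default_rng(seed)
    centers = np.linspace(0.0, 1.0, 10)   # fixed basis: 10 RBF centers
    w = np.zeros_like(centers)            # one weight per feature
    for _ in range(num_episodes):
        s = 0.0                           # episodes start at the left end
        while s < 1.0:
            s_next = min(s + rng.uniform(0.05, 0.15), 1.0)
            r = 1.0 if s_next >= 1.0 else 0.0   # reward only at the goal
            phi = gaussian_features(s, centers)
            v = w @ phi
            v_next = 0.0 if s_next >= 1.0 else w @ gaussian_features(s_next, centers)
            # TD(0) semi-gradient update on the weights
            w += alpha * (r + gamma * v_next - v) * phi
            s = s_next
    return w, centers

w, centers = td0_linear()
# States nearer the goal should have higher estimated value.
v_near = w @ gaussian_features(0.9, centers)
v_far = w @ gaussian_features(0.1, centers)
```

The quality of the approximation depends entirely on the chosen basis, which is the feature-selection problem the thesis addresses.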
|
27 |
The progressive effects of satiation on simple and complex operant response sequences / Danson, Carl Merrill, January 1965 (has links)
No description available.
|
28 |
The effect of increased fixed interval schedules of reinforcement on the time out behavior of the albino rat in a free operant (two bar) situation / Spence, Norville Edison, January 1968 (has links)
No description available.
|
29 |
The relationships of the internal-external locus of control dimension to scholastic achievement, reflection-impulsivity and residence in priority areas / Lesiak, Walter J., January 1970 (has links)
No description available.
|
30 |
Stimulus-reinforcer and response-reinforcer contingencies in several learning paradigms in the rat / Atnip, Gilbert Wilson, January 1975 (has links)
No description available.
|