  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world.
391

Using course credit as reinforcement for free-style study behavior

Kizer, Philip Lee January 1975 (has links)
Ninety students in an introductory psychology course were each assigned to one of three matched groups, which formed the levels of a multiple-baseline format. Within each level, a simple within-subjects reversal design was used: points were awarded for participation in study sessions during a nine-day period spanning two weeks, preceded and followed by periods when reinforcement was not available. Participation was defined as studying for at least 15 minutes and then completing and scoring the daily quiz in the study session. Of the 90 students, only 61 picked up instruction sheets, and of these, 27 actually participated in one or more study sessions. The two most significant findings were that course credit points did reinforce participation in study sessions, as shown by a clear reversal (a return to no participation when points were withdrawn), and that significantly more top than bottom students participated in study sessions (p < .05). No significant sex differences were found.
392

Conditioning of an abnormal response through learning without awareness

Singer, James Marsh January 1972 (has links)
Four groups of twenty Ss each were conditioned for an abnormal response to the Muller-Lyer illusion. The four groups were: (1) control, with no response from E; (2) positive reinforcement, "mmm-hmm" when the comparison was longer than the standard; (3) punishment, "huh-uh" when the comparison was shorter than the standard; and (4) both punishment and positive reinforcement, "mmm-hmm" when the comparison was longer than the standard and "huh-uh" when the comparison was shorter than the standard. There were no sex differences, nor differences between tests given before and after verbalization of the reinforcement contingency by each S. There was an overall difference as measured by an analysis of variance (p < .001). There was no significant difference between the positive reinforcement group and the control group. There were significant differences between Groups 1 and 3, and between Groups 1 and 4. There was no significant difference between Groups 3 and 4, although there was a trend in that direction.
393

Bond and Development Length in Concrete Beams with Exposed Reinforcement

Masnavi, Alireza 14 September 2013 (has links)
Corrosion of steel reinforcement in Ontario bridges is causing severe soffit spalling in many situations. These spalled areas are often located within the lap splices and curtailment zones of the primary reinforcement. This can lead to inadequate bar development lengths and the possibility of failures. In order to better predict the residual strength of these deteriorated bridges, a test program was designed, which involved mid-sized concrete beam specimens with partially de-bonded reinforcement. The de-bonding was simulated in various beam locations, with various de-bonding patterns. The test program consisted of thirteen beams; ten under-reinforced and three over-reinforced. All beams had dimensions of 2100×150×100 mm. The span between simple supports was 1900 mm with a single point load applied at the midspan. Rebar strains and displacement at the midspan were recorded. The goal of this experimental study was to determine the correlation between the spatial location and surface area of de-bonding and the strength of the beams. This was achieved by testing beam specimens with different combinations of de-bonding patterns with respect to location and area. Four beams had de-bonded reinforcement in the flexural zones, seven were de-bonded in the anchorage and flexural zones, and two were fully bonded.

In a previous study, a so-called “modified area concept” was developed for rapid assessment of the remaining capacity of heavily spalled girders. This concept was integrated in a computer program, which assesses girder capacity, given a graphical spalling survey and a structural drawing of the girder. The developed program can be easily adapted for full bridge analysis, and to evaluate the effects of reinforcement cross section loss and bond deterioration. The research presented in the current thesis investigates several of the assumptions made in this previous study.

The current thesis includes the rationale for the design of the experimental program. In addition, the test results are presented and analyzed. By analyzing the failure modes, failure loads, and crack patterns, along with the load-displacement, load-stiffness, and load-strain behaviour of the various beams, it is concluded that: 1) reinforced concrete beams can carry a significant portion of their original capacity after losing cover over a significant portion of their flexural reinforcement, 2) predictions of the beam capacities and failure modes using the modified area concept are reasonably accurate and conservative in most cases, and 3) the flexural stiffness of the beams was seen to decrease with an increase in the length of the exposed area, in most cases. Recommended areas of future research are identified, including: 1) tests of beams with splices in the flexural reinforcing along the span, 2) field investigation of the concrete strength in regions of the soffit immediately adjacent to the spalled regions, and 3) the development of a correction factor to account for the effects of violating the plane sections assumption.
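As a very loose illustration of area-based capacity assessment (not the "modified area concept" itself, whose rules belong to the cited previous study), the sketch below computes the flexural capacity of an under-reinforced rectangular section using the standard equivalent rectangular stress block, with a placeholder effectiveness factor applied to de-bonded bars. The bar areas, effective depth, material strengths and the 0.5 factor are all assumptions made for the example.

```python
def nominal_moment_kNm(a_s_mm2, f_y_mpa, f_c_mpa, b_mm, d_mm):
    """Nominal flexural capacity of a rectangular, under-reinforced section
    using the standard equivalent rectangular stress block
    (units: mm and MPa in, kN*m out)."""
    a = a_s_mm2 * f_y_mpa / (0.85 * f_c_mpa * b_mm)   # stress-block depth
    return a_s_mm2 * f_y_mpa * (d_mm - a / 2) / 1e6

# Illustrative only: treat a fully de-bonded bar as contributing a reduced
# effective area.  The 0.5 factor, bar areas, effective depth and material
# strengths are placeholder assumptions, not values from the thesis.
a_s_effective = 200.0 + 0.5 * 200.0   # one bonded + one de-bonded 200 mm^2 bar
print(nominal_moment_kNm(a_s_effective, f_y_mpa=400, f_c_mpa=30,
                         b_mm=100, d_mm=120))   # roughly 11.6 kN*m
```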
394

Reinforcement learning in the presence of rare events

Frank, Jordan William, 1980- January 2009 (has links)
Learning agents often find themselves in environments in which rare significant events occur independently of their current choice of action. Traditional reinforcement learning algorithms sample events according to their natural probability of occurring, and therefore tend to exhibit slow convergence and high variance in such environments. In this thesis, we assume that learning is done in a simulated environment in which the probability of these rare events can be artificially altered. We present novel algorithms for both policy evaluation and control, using both tabular and function approximation representations of the value function. These algorithms automatically tune the rare event probabilities to minimize the variance and use importance sampling to correct for changes in the dynamics. We prove that these algorithms converge, provide an analysis of their bias and variance, and demonstrate their utility in a number of domains, including a large network planning task.
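As a rough illustration of the idea described above (not the thesis's exact algorithm), the sketch below runs tabular TD(0) policy evaluation in a simulator whose rare-event probability has been artificially inflated, and weights each update by the likelihood ratio of the natural dynamics to the simulated dynamics. The gym-style `env_sim` interface, the `policy` callable, and the `info['rare']` flag are assumptions made for the example.

```python
import numpy as np

def is_corrected_td0(env_sim, policy, n_states, p_rare_nat, p_rare_sim,
                     episodes=1000, alpha=0.1, gamma=0.99):
    """Tabular TD(0) policy evaluation in a simulator whose rare-event
    probability has been inflated from p_rare_nat to p_rare_sim.  Each
    update is importance-weighted by the ratio of natural to simulated
    event probability, so the estimate targets the natural environment."""
    V = np.zeros(n_states)
    for _ in range(episodes):
        s = env_sim.reset()
        done = False
        while not done:
            a = policy(s)
            s_next, r, done, info = env_sim.step(a)
            # info['rare'] is an assumed flag indicating whether the rare
            # event fired on this transition.
            w = (p_rare_nat / p_rare_sim) if info['rare'] \
                else (1 - p_rare_nat) / (1 - p_rare_sim)
            V[s] += alpha * w * (r + gamma * V[s_next] * (not done) - V[s])
            s = s_next
    return V
```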
395

Self-reinforcement in the elderly as a function of feedback and modeling

Weiner, Howard R January 1976 (has links)
Typescript. / Thesis (Ph. D.)--University of Hawaii at Manoa, 1976. / Bibliography: leaves [111]-123. / Microfiche. / vi, 123 leaves ill
396

Reinforcement learning by incremental patching

Kim, Min Sub, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links)
This thesis investigates how an autonomous reinforcement learning agent can improve on an approximate solution by augmenting it with a small patch, which overrides the approximate solution at certain states of the problem. In reinforcement learning, many approximate solutions are smaller and easier to produce than "flat" solutions that maintain distinct parameters for each fully enumerated state, but the best solution within the constraints of the approximation may fall well short of global optimality. This thesis proposes that the remaining gap to global optimality can be efficiently minimised by learning a small patch over the approximate solution.

In order to improve the agent's behaviour, algorithms are presented for learning the overriding patch. The patch is grown around particular regions of the problem where the approximate solution is found to be deficient. Two heuristic strategies are proposed for concentrating resources to those areas where inaccuracies in the approximate solution are most costly, drawing a compromise between solution quality and storage requirements. Patching also handles problems with continuous state variables, by two alternative methods: Kuhn triangulation over a fixed discretisation and nearest neighbour interpolation with a variable discretisation.

As well as improving the agent's behaviour, patching is also applied to the agent's model of the environment. Inaccuracies in the agent's model of the world are detected by statistical testing, using a selective sampling strategy to limit storage requirements for collecting data.

The patching algorithms are demonstrated in several problem domains, illustrating the effectiveness of patching under a wide range of conditions. A scenario drawn from a real-time strategy game demonstrates the ability of patching to handle large complex tasks. These contributions combine to form a general framework for patching over approximate solutions in reinforcement learning. Complex problems cannot be solved by brute force alone, and some form of approximation is necessary to handle large problems. However, this does not mean that the limitations of approximate solutions must be accepted without question. Patching demonstrates one way in which an agent can leverage approximation techniques without losing the ability to handle fine yet important details.
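A minimal sketch of the overriding-patch idea, under assumed interfaces rather than the thesis's actual algorithms: a value function that consults a small per-state patch first and falls back to the approximate solution everywhere else, with the patch grown only at states flagged as deficient.

```python
class PatchedQ:
    """A value function that overrides an approximate solution with a small
    patch: state-action pairs covered by the patch use a locally learned
    value, all others fall back to the approximate solution."""

    def __init__(self, approx_q):
        self.approx_q = approx_q   # callable (state, action) -> approximate value
        self.patch = {}            # {(state, action): overriding value}

    def q(self, state, action):
        return self.patch.get((state, action), self.approx_q(state, action))

    def grow_patch(self, state, action, target, alpha=0.1):
        """Add (or refine) a patch entry at a state where the approximation
        has been flagged as deficient, moving it toward a bootstrapped target."""
        old = self.q(state, action)
        self.patch[(state, action)] = old + alpha * (target - old)
```

Greedy action selection over `q` then changes the agent's behaviour only in the patched regions, leaving the approximate solution in control elsewhere.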
397

Modelling motivation for experience-based attention focus in reinforcement learning

Merrick, Kathryn January 2007 (has links)
Doctor of Philosophy / Computational models of motivation are software reasoning processes designed to direct, activate or organise the behaviour of artificial agents. Models of motivation inspired by psychological motivation theories permit the design of agents with a key reasoning characteristic of natural systems: experience-based attention focus. The ability to focus attention is critical for agent behaviour in complex or dynamic environments where only a small amount of the available information is relevant at a particular time. Furthermore, experience-based attention focus enables adaptive behaviour that focuses on different tasks at different times in response to an agent’s experiences in its environment.

This thesis is concerned with the synthesis of motivation and reinforcement learning in artificial agents. This extends reinforcement learning to adaptive, multi-task learning in complex, dynamic environments. Reinforcement learning algorithms are computational approaches to learning characterised by the use of reward or punishment to direct learning. The focus of much existing reinforcement learning research has been on the design of the learning component. In contrast, the focus of this thesis is on the design of computational models of motivation as approaches to the reinforcement component that generates reward or punishment.

The primary aim of this thesis is to develop computational models of motivation that extend reinforcement learning with three key aspects of attention focus: rhythmic behavioural cycles, adaptive behaviour and multi-task learning in complex, dynamic environments. This is achieved by representing such environments using context-free grammars, modelling maintenance tasks as observations of these environments and modelling achievement tasks as events in these environments. Motivation is modelled by processes for task selection, the computation of experience-based reward signals for different tasks and arbitration between reward signals to produce a motivation signal. Two specific models of motivation based on the experience-oriented psychological concepts of interest and competence are designed within this framework. The first models motivation as a function of environmental experiences while the second models motivation as an introspective process.

This thesis synthesises motivation and reinforcement learning as motivated reinforcement learning agents. Three models of motivated reinforcement learning are presented to explore the combination of motivation with three existing reinforcement learning components. The first model combines motivation with flat reinforcement learning for highly adaptive learning of behaviours for performing multiple tasks. The second model facilitates the recall of learned behaviours by combining motivation with multi-option reinforcement learning. In the third model, motivation is combined with a hierarchical reinforcement learning component to allow both the recall of learned behaviours and the reuse of these behaviours as abstract actions for future learning.

Because motivated reinforcement learning agents have capabilities beyond those of existing reinforcement learning approaches, new techniques are required to measure their performance. The secondary aim of this thesis is to develop metrics for measuring the performance of different computational models of motivation with respect to the adaptive, multi-task learning they motivate.
This is achieved by analysing the behaviour of motivated reinforcement learning agents incorporating different motivation functions with different learning components. Two new metrics are introduced that evaluate the behaviour learned by motivated reinforcement learning agents in terms of the variety of tasks learned and the complexity of those tasks.

Persistent, multi-player computer game worlds are used as the primary example of complex, dynamic environments in this thesis. Motivated reinforcement learning agents are applied to control the non-player characters in games. Simulated game environments are used for evaluating and comparing motivated reinforcement learning agents using different motivation and learning components. The performance and scalability of these agents are analysed in a series of empirical studies in dynamic environments and environments of progressively increasing complexity. Game environments simulating two types of complexity increase are studied: environments with increasing numbers of potential learning tasks and environments with learning tasks that require behavioural cycles comprising more actions.

A number of key conclusions can be drawn from the empirical studies, concerning both different computational models of motivation and their combination with different reinforcement learning components. Experimental results confirm that rhythmic behavioural cycles, adaptive behaviour and multi-task learning can be achieved using computational models of motivation as an experience-based reward signal for reinforcement learning. In dynamic environments, motivated reinforcement learning agents incorporating introspective competence motivation adapt more rapidly to change than agents motivated by interest alone. Agents incorporating competence motivation also scale to environments of greater complexity than agents motivated by interest alone. Motivated reinforcement learning agents combining motivation with flat reinforcement learning are the most adaptive in dynamic environments and exhibit scalable behavioural variety and complexity as the number of potential learning tasks is increased. However, when tasks require behavioural cycles comprising more actions, motivated reinforcement learning agents using a multi-option learning component exhibit greater scalability. Motivated multi-option reinforcement learning also provides a more scalable approach to recall than motivated hierarchical reinforcement learning.

In summary, this thesis makes contributions in two key areas. Computational models of motivation and motivated reinforcement learning extend reinforcement learning to adaptive, multi-task learning in complex, dynamic environments. Motivated reinforcement learning agents allow the design of non-player characters for computer games that can progressively adapt their behaviour in response to changes in their environment.
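A hedged sketch of the general pattern (a toy stand-in, not the thesis's interest or competence models): the environment reward is replaced by a motivation signal computed from the agent's own experience, here an interest-style value that peaks for events of intermediate novelty.

```python
import math
from collections import defaultdict

class MotivatedReward:
    """Computes an experience-based motivation signal to use in place of an
    external reward: a toy interest-style value that is highest for events
    of intermediate novelty (a stand-in for the thesis's motivation models)."""

    def __init__(self, peak=0.5, width=0.2):
        self.counts = defaultdict(int)   # how often each event has been observed
        self.peak, self.width = peak, width

    def __call__(self, event):
        self.counts[event] += 1
        novelty = 1.0 / self.counts[event]   # decays as the event repeats
        return math.exp(-((novelty - self.peak) ** 2) / (2 * self.width ** 2))

# Usage: the motivation value is fed to any reinforcement learner as its reward.
motivate = MotivatedReward()
print(motivate(("door", "opened")), motivate(("door", "opened")))
```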
398

A learning classifier system approach to relational reinforcement learning

Mellor, Drew January 2008 (has links)
Research Doctorate - Doctor of Philosophy (PhD) / Machine learning methods usually represent knowledge and hypotheses using attribute-value languages, principally because of their simplicity and demonstrated utility over a broad variety of problems. However, attribute-value languages have limited expressive power and for some problems the target function can only be expressed as an exhaustive conjunction of specific cases. Such problems are handled better with inductive logic programming (ILP) or relational reinforcement learning (RRL), which employ more expressive languages, typically languages over first-order logic. Methods developed within these fields generally extend upon attribute-value algorithms; however, many attribute-value algorithms that are potentially viable for RRL, the younger of the two fields, remain to be extended. This thesis investigates an approach to RRL derived from the learning classifier system XCS. In brief, the new system, FOXCS, generates, evaluates, and evolves a population of "condition-action" rules that are definite clauses over first-order logic. The rules are typically comprehensible enough to be understood by humans and can be inspected to determine the acquired principles. Key properties of FOXCS, which are inherited from XCS, are that it is general (applies to arbitrary Markov decision processes), model-free (rewards and state transitions are "black box" functions), and "tabula rasa" (the initial policy can be unspecified). Furthermore, in contrast to decision tree learning, its rule-based approach is ideal for incrementally learning expressions over first-order logic, a valuable characteristic for an RRL system. Perhaps the most novel aspect of FOXCS is its inductive component, which synthesizes evolutionary computation and first-order logic refinement for incremental learning. New evolutionary operators were developed because previous combinations of evolutionary computation and first-order logic were non-incremental. The effectiveness of the inductive component was empirically demonstrated by benchmarking on ILP tasks, which found that FOXCS produced hypotheses of comparable accuracy to several well-known ILP algorithms. Further benchmarking on RRL tasks found that the optimality of the policies learnt was at least comparable to those of existing RRL systems. Finally, a significant advantage of its use of variables in rules was demonstrated: unlike RRL systems that did not use variables, FOXCS, with appropriate extensions, learnt scalable policies that were genuinely independent of the dimensionality of the task environment.
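To make the idea of "condition-action" rules over first-order logic concrete, here is a minimal, illustrative sketch of matching such a rule against a relational state via variable binding. FOXCS's actual rule language, covering and evolutionary operators are those described in the thesis, not this toy code; the blocks-world predicates used below are assumptions for the example.

```python
def unify(pattern, fact, bindings):
    """Try to extend variable bindings (variables are strings starting with
    '?') so that the pattern literal matches the ground fact."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    b = dict(bindings)
    for p, f in zip(pattern[1:], fact[1:]):
        if isinstance(p, str) and p.startswith('?'):
            if b.get(p, f) != f:
                return None
            b[p] = f
        elif p != f:
            return None
    return b

def matches(condition, state):
    """True if every literal in the rule's condition unifies with some fact
    in the relational state under one consistent variable binding."""
    frontier = [{}]
    for literal in condition:
        frontier = [b2 for b in frontier for fact in state
                    if (b2 := unify(literal, fact, b)) is not None]
        if not frontier:
            return False
    return True

# e.g. a rule like "if ?x is clear and ?x is on ?y, then move ?x to the table"
rule = {'condition': [('clear', '?x'), ('on', '?x', '?y')],
        'action': ('move_to_table', '?x')}
state = {('clear', 'a'), ('on', 'a', 'b'), ('on', 'b', 'table')}
print(matches(rule['condition'], state))   # True
```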
399

Training a non-match response toward a technology for determining controlling stimulus dimensions for two children with autism /

Baynham, Tanya Yvonne MacDonald. Rosales-Ruiz, Jesus, January 2007 (has links)
Thesis (M.S.)--University of North Texas, Dec., 2007. / Title from title page display. Includes bibliographical references.
400

A constructional canine aggression treatment using a negative reinforcement shaping procedure with dogs in home and community settings /

Snider, Kellie Sisson. Rosales-Ruiz, Jesus, January 2007 (has links)
Thesis (M.S.)--University of North Texas, Dec., 2007. / Title from title page display. Includes bibliographical references.
