Global ETD Search

451	Extending dynamic scripting Ludwig, Jeremy R. 12 1900 (has links) xvi, 167 p. A print copy of this thesis is available through the UO Libraries. Search the library catalog for the location and call number. / The dynamic scripting reinforcement learning algorithm can be extended to improve the speed, effectiveness, and accessibility of learning in modern computer games without sacrificing computational efficiency. This dissertation describes three specific enhancements to the dynamic scripting algorithm that improve learning behavior and flexibility while imposing a minimal computational cost: (1) a flexible, stand-alone version of dynamic scripting that allows for hierarchical dynamic scripting, (2) a method of using automatic state abstraction to increase the context sensitivity of the algorithm, and (3) an integration of this algorithm with an existing hierarchical behavior modeling architecture. The extended dynamic scripting algorithm is then examined in three different contexts. The first results reflect a preliminary investigation based on two abstract real-time strategy games. The second set of results comes from a number of abstract tactical decision games, designed to demonstrate the strengths and weaknesses of extended dynamic scripting. The third set of results is generated by a series of experiments in the context of the commercial computer role-playing game Neverwinter Nights demonstrating the capabilities of the algorithm in an actual game. To conclude, a number of future research directions for investigating the effectiveness of extended dynamic scripting are described. / Adviser: Arthur Farley Dynamic scripting Reinforcement learning Computer science
452	A reinforcement-learning approach to understanding loss-chasing behavior in rats Marshall, Andrew Thomas January 1900 (has links) Doctor of Philosophy / Psychological Sciences / Kimberly Kirkpatrick / Risky decisions are inherently characterized by the potential to receive gains and losses from these choices, and gains and losses have distinct effects on global risky choice behavior and the likelihoods of making risky choices depending on the outcome of the previous choice. One translationally-relevant phenomenon of risky choice is loss-chasing, in which individuals make risky choices following losses. However, the mechanisms of loss-chasing are poorly understood. The goal of two experiments was to illuminate the mechanisms governing individual differences in loss-chasing and risky choice behaviors. In two experiments, rats chose between a certain outcome that always delivered reward and a risky outcome that probabilistically delivered reward. In Experiment 1, loss processing and loss-chasing behavior were assessed in the context of losses-disguised-as-wins (LDWs), or loss outcomes presented along with gain-related stimuli. The rats presented with LDWs were riskier and less sensitive to differential losses. In Experiment 2, these behaviors were assessed relative to the number of risky losses that could be experienced. Here, the addition of reward omission or a small non-zero loss to the possible risky outcomes elicited substantial individual differences in risky choice, with some rats increasing, decreasing, or maintaining their previous risky choice preferences. Several reinforcement learning (RL) models were fit to individual rats’ data to elucidate the possible psychological mechanisms that best accounted for individual differences in risky choice and loss-chasing behaviors. 
The RL analyses indicated that the critical predictors of risky choice and loss-chasing behavior were the different rates at which individuals updated value estimates with newly experienced gains and losses. Thus, learning deficits may predict individual differences in maladaptive risky decision making. Accordingly, targeted interventions to alleviate learning deficits may ultimately increase the likelihood of making more optimal and informed choices. Risky choice Reinforcement learning Loss chasing Rats
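The idea of updating value estimates at different rates for gains and losses can be illustrated with a small sketch. This is not the model fit in the thesis, whose equations are not given in the abstract; it is a generic delta-rule update with two learning rates, and the names `update_value`, `run_trials`, `alpha_gain`, and `alpha_loss` are hypothetical.

```python
def update_value(value, outcome, alpha_gain, alpha_loss):
    """Return the updated value estimate after observing one outcome.

    Illustrative assumption: alpha_gain is used when the outcome is at
    least as good as the current estimate (non-negative prediction
    error), and alpha_loss is used when it is worse.
    """
    prediction_error = outcome - value
    alpha = alpha_gain if prediction_error >= 0 else alpha_loss
    return value + alpha * prediction_error


def run_trials(outcomes, alpha_gain, alpha_loss, start=0.0):
    """Apply the update to a sequence of outcomes and return each estimate."""
    value = start
    history = []
    for outcome in outcomes:
        value = update_value(value, outcome, alpha_gain, alpha_loss)
        history.append(value)
    return history
```

With `alpha_loss` much smaller than `alpha_gain`, the estimate for a risky option falls only slowly after unrewarded trials, which is one way a learner could keep choosing the risky option following losses.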
453	Applications of radiation physics in civil engineering Gray, Derrick January 1999 (has links) This thesis presents two separate applications of ionising radiation in Civil Engineering. The first is an investigation to determine the cement content of concrete using gamma-rays from the naturally occurring isotopes 238U, 232Th and their decay chains as well as 40K. Two sets of equations are derived and discussed. Spectra from cement, aggregate and concrete samples were collected and the useful full energy peaks from the above sources identified. Two concrete samples were prepared using the same cement but containing two different aggregates: a granite based aggregate and a flint based aggregate. A third concrete sample was then prepared where the cement content was not initially known. Data from the first two tests was then used to determine the mass of cement used in the blind test. A great deal of valuable information has also been accrued concerning the interaction of statistical errors in the equations for the prediction of cement content. Spectra from four different cements were collected at regular intervals over a 24 month period and the variation in the activity of each cement over this period is discussed. The second section of this work presents an imaging technique that uses pair production annihilation photons to examine the state of steel reinforcement in concrete structures. Computer simulations along with experimental work have been used. The experimental work used a 226Ra needle as a photon source as it provided a range of gamma-rays with energies over the pair production threshold of 1022 keV. A 31 mm rebar with 30 mm of concrete cover was successfully located during the experimental work. The data collected from the computer simulations has shown that the geometry and the material between the photon source, rebar and detector is of great importance. 624
454	Multi-Agent Area Coverage Control Using Reinforcement Learning Techniques Adepegba, Adekunle Akinpelu January 2016 (has links) An area coverage control law in cooperation with reinforcement learning techniques is proposed for deploying multiple autonomous agents in a two-dimensional planar area. A scalar field characterizes the risk density in the area to be covered, yielding a nonuniform distribution of agents while providing optimal coverage. This problem has traditionally been addressed in the literature using locational optimization and gradient descent techniques, as well as proportional and proportional-derivative controllers. In most cases, the actuator energy required to drive the agents to optimal configurations in the workspace is not considered. Here the maximum coverage is achieved with minimum actuator energy required by each agent. Similar to existing coverage control techniques, the proposed algorithm takes into consideration time-varying risk density. These density functions represent the probability of an event occurring (e.g., the presence of an intruding target) at a certain location or point in the workspace, indicating where the agents should be located. To this end, a coverage control algorithm using reinforcement learning is proposed that moves the team of mobile agents so as to provide optimal coverage given the density functions as they evolve over time. Area coverage is modeled using Centroidal Voronoi Tessellation (CVT) governed by agents. Based on [1,2] and [3], the application of Centroidal Voronoi tessellation is extended to a dynamically changing harbour-like environment. The proposed multi-agent area coverage control law in conjunction with reinforcement learning techniques is implemented in a distributed manner whereby the multi-agent team only needs to access information from adjacent agents while simultaneously providing dynamic target surveillance for single and multiple targets and feedback control of the environment.
This distributed approach describes how automatic flocking behaviour of a team of mobile agents can be achieved by leveraging the geometrical properties of centroidal Voronoi tessellation in area coverage control while enabling multiple-target tracking without the need for consensus between individual agents. Agent deployment using a time-varying density model is introduced, where the density is a function of the position of some unknown targets in the environment. A nonlinear derivative of the error coverage function is formulated based on the single-integrator agent dynamics. The agent, aware of its local coverage control condition, learns a value function online while leveraging the same from its neighbours. Moreover, a novel computational adaptive optimal control methodology based on work by [4] is proposed that employs the approximate dynamic programming technique online to iteratively solve the algebraic Riccati equation with completely unknown system dynamics as a solution to the linear quadratic regulator problem. Furthermore, an online tuning adaptive optimal control algorithm is implemented using an actor-critic neural network recursive least-squares solution framework. The work in this thesis illustrates that reinforcement learning-based techniques can be successfully applied to non-uniform coverage control. Research combining non-uniform coverage control with reinforcement learning techniques is still at an embryonic stage and several limitations exist. Theoretical results are benchmarked and validated with related works in area coverage control through a set of computer simulations where multiple agents are able to deploy themselves, thus paving the way for efficient distributed Voronoi coverage control problems. Area Coverage Coverage Control Reinforcement Learning
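As background for the centroidal Voronoi tessellation mentioned in this abstract, the sketch below shows one step of the classic Lloyd iteration with a density weighting. It is not the reinforcement learning controller proposed in the thesis; it only illustrates the geometric idea that each agent moves to the density-weighted centroid of the points closest to it. The function names and the discretised unit-square workspace are assumptions made for illustration.

```python
def grid_samples(n):
    """Return n*n sample points at the cell centres of the unit square."""
    return [((i + 0.5) / n, (j + 0.5) / n) for i in range(n) for j in range(n)]


def lloyd_step(agents, samples, density):
    """Move each agent to the density-weighted centroid of its Voronoi cell.

    agents  : list of (x, y) positions
    samples : list of (x, y) points discretising the workspace
    density : function (x, y) -> non-negative weight (the risk density)
    """
    # Per agent: weighted sum of x, weighted sum of y, total weight.
    totals = [[0.0, 0.0, 0.0] for _ in agents]
    for x, y in samples:
        weight = density(x, y)
        # A sample belongs to the Voronoi cell of its nearest agent.
        nearest = min(
            range(len(agents)),
            key=lambda i: (agents[i][0] - x) ** 2 + (agents[i][1] - y) ** 2,
        )
        totals[nearest][0] += weight * x
        totals[nearest][1] += weight * y
        totals[nearest][2] += weight
    moved = []
    for agent, (sx, sy, sw) in zip(agents, totals):
        # An agent whose cell carries no weight stays where it is.
        moved.append((sx / sw, sy / sw) if sw > 0 else agent)
    return moved
```

Calling `lloyd_step` repeatedly moves the agents toward a centroidal configuration; passing a density that changes between calls gives a rough picture of the time-varying case described in the abstract.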
455	The effects of positive reinforcement and a token program in a public junior high school class Main, George C. January 1972 (has links) The operant level of inappropriate behavior was obtained for six students in a non-academic grade nine class receiving an individualized program of instruction in mathematics and science. Two conditions, educational structure and praising appropriate behavior while ignoring inappropriate behavior, were introduced successively. Both procedures reduced inappropriate behavior with five subjects. When a token reinforcement program, using back-up reinforcers readily available in the school, was introduced in conjunction with the conditions of educational structure and praising and ignoring, a dramatic decline in the emission of inappropriate responses occurred with all six subjects. Withdrawal of the token program, leaving educational structure and praising and ignoring in effect, resulted in an increase of inappropriate behavior with five subjects. The token program was reintroduced in conjunction with contingency contracts. The result was a decline of inappropriate behavior below the mean of the first token phase for all subjects. Tokens were thinned during the second token phase leaving back-up reinforcers, teacher-praise and attention, and the completion of contracts in effect. Data obtained during follow-up indicated that the thinning procedure was effective with no subsequent increase in behavior for any subject. The token program, utilized during one of the four blocks in the school time table, appeared to reduce absenteeism. There was further evidence that appropriate behavior did generalize to other classes. / Education, Faculty of / Graduate Reinforcement (Psychology) Behaviorism (Psychology) Educational psychology
456	Verbal operant conditioning as a function of need for social approval and connotative meaning of the stimulus material Lee, Dong Yul January 1970 (has links) One hundred and forty-four college subjects were divided into twelve groups on the basis of the score on a measure of need for social approval (high and low) and a measure of connotative meaning of the concept 'hippie' (positive, negative, and neutral). By instituting two reinforcement conditions in a Taffel type of verbal conditioning task, these twelve groups of subjects were positively reinforced on a 100% reinforcement schedule, either congruently or incongruently with their initial meaning of hippie (2 x 2 x 3 factorial design). The reinforcing stimulus was the experimenter's saying "Good" or "Fine" for a negative or positive description of hippie, depending upon the reinforcement conditions. It was hypothesized that subjects with a high need for social approval would show a greater conditioning performance than subjects with a low need for social approval. It was also hypothesized that subjects who received reinforcement congruently with their meaning of hippie would show a greater increase in the conditioning performance than subjects who received reinforcement incongruently with their meaning of hippie. The data showed that there was no systematic difference in the conditioning performance between subjects with a high and low need for social approval as measured by the Marlowe-Crowne Social Desirability Scale. In addition, the need variable did not significantly interact either with the meaning, the reinforcement condition, or the block level of the conditioning trials. However, subjects who were reinforced congruently with their meaning of hippie showed significantly greater increase in the conditioning performance as compared to those who were reinforced incongruently with their meaning of hippie.
In fact, subjects who received incongruent reinforcement failed to demonstrate any consistent changes in the rate of response emission during the conditioning period. Subjects with a neutral meaning of hippie showed a conditioning performance greater than the incongruently reinforced groups, but less than the congruently reinforced groups in both reinforcement conditions. The results were interpreted as indicating the importance of the condition under which subjects receive reinforcement—congruent or incongruent reinforcement—in determining responsivity toward socially reinforcing stimuli. / Arts, Faculty of / Psychology, Department of / Graduate Reinforcement (Psychology) Conditioned response Social desirability
457	The Effect of partial reinforcement before and after continuous reinforcement on acquisition level and resistance to extinction. Husband, James Dalla Thorne January 1968 (has links) Sutherland, Mackintosh, and Wolfe (1965) demonstrated that continuous reinforcement given subsequent to partial reinforcement resulted in greater resistance to extinction than did continuous reinforcement administered prior to partial reinforcement. Because the experimenters did not make compensations for different acquisition asymptotes in their analysis of the extinction data, the interpretation of their results is questionable. Theios and McGinnis (1967) obtained results contrary to those obtained by the latter experimenters. This study was run in order to obtain additional information pertaining to the effects of shifts of reinforcement schedules on resistance to extinction. Extended acquisition trials were used with the expectation of bringing the six experimental groups up to the same terminal asymptotic levels. When Ss were approximately 45 days old, they were randomly assigned to one of six acquisition conditions. These conditions were: 96 partial reinforcement trials, 192 partial reinforcement trials, 96 partial reinforcement trials followed by 96 continuous reinforcement trials, 96 continuous reinforcement trials followed by 96 partial reinforcement trials, 192 continuous reinforcement trials, and 96 continuous reinforcement trials. Following training in a straight runway, all Ss received 60 nonreward trials given in blocks of 6 trials a day. Time measures were taken of runway performance on acquisition and extinction trials. The results supported the hypothesis that extended training would produce equivalent terminal asymptotic running speeds in acquisition. None of the hypotheses made by Sutherland, Mackintosh, and Wolfe (1965) or by Theios and McGinnis (1967) were supported by the results of the experiment; no significant partial reinforcement effect was obtained.
/ Arts, Faculty of / Psychology, Department of / Graduate Reinforcement (Psychology) Conditioned response Extinction (Psychology)
458	Correspondence Between Verbal Behavior About Reinforcers and Performance Under Schedules of Reinforcement. Bekker-Pace, Ruthie 08 1900 (has links) Important advancements have been made in the identification of reinforcers over the past decade. The use of preference assessments has become a systematic way to identify preferred events that may function as reinforcers for an individual's behavior. Typically, preference assessments require participants to select stimuli through verbal surveys or engagement with stimuli as preferred or non-preferred. Not all studies go on to directly test the effects of the preferred stimuli, and even fewer studies directly test for the effects of the non-preferred stimuli. The present study systematically identified preferred and non-preferred stimuli in adult human subjects by verbal report and then proceeded to test the effects of both verbally reported preferred and non-preferred events on single and concurrent schedules of reinforcement. The results are discussed in terms of contemporary concerns regarding preference and reinforcer assessments. Reinforcement (Psychology) Verbal behavior. reinforcer preference assessment
459	Bond of Reinforcing Bars to Steel Fiber Reinforced Concrete (SFRC) García Taengua, Emilio José 21 October 2013 (has links) The use of steel fiber reinforced concrete (SFRC hereafter) is becoming more and more common. Building codes and recommendations are gradually including the positive effect of fibers on mechanical properties of concrete. How to take advantage of the higher ductility and energy absorption capacity of SFRC to reduce anchorage lengths when using fibers is not a straightforward issue. Fibers improve bond performance because they confine reinforcement (playing a similar role to that of transverse reinforcement). Their impact on bond performance of concrete is really important in terms of toughness/ductility. The study of previous literature has revealed important points of ongoing discussion regarding different issues, especially the following: a) whether the effect of fibers on bond strength is negligible or not, b) whether the effect of fibers on bond strength is dependent on any other factors such as concrete compressive strength or concrete cover, c) quantifying the effect of fibers on the ductility of bond failure (bond toughness). These issues have defined the objectives of this thesis. A modified version of the Pull Out Test (POT hereafter) has been selected as the most appropriate test for the purposes of this research. The effect of a number of factors on bond stress-slip curves has been analyzed. The factors considered are: concrete compressive strength (between 30 MPa and 50 MPa), rebar diameter (between 8 mm and 20 mm), concrete cover (between 30 mm and 5 times rebar diameter), fiber content (up to 70 kg/m3), and fiber slenderness and length. The experimental program has been designed relying on the principles of statistical Design Of Experiments. This has made it possible to select a reduced number of combinations to be tested without any bias or loss of accuracy. A total of 81 POT specimens have been produced and tested.
An accurate model for predicting the mode of bond failure has been developed. It relates splitting probability to the factors considered. It has been proved that increasing fiber content restrains the risk of splitting failure. The favorable effect of fibers when preventing splitting failures has been revealed to be more important for higher concrete compressive strength values. Higher compressive strength values require higher concrete cover/diameter ratios for splitting failure to be prevented. Fiber slenderness and fiber length modify the effect of fiber content on splitting probability and therefore on minimum cover/diameter ratios required to prevent splitting failures. Two charts have been developed for estimating the minimum cover/diameter ratio required to prevent splitting. Predictive equations have been obtained for estimating bond strength and areas under the bond stress-slip curve as a function of the factors considered. Increasing fiber content has a slightly positive impact on bond strength, which is mainly determined by concrete compressive strength. On the contrary, fibers have a very important effect on the ductility of bond failure, as does concrete cover, as long as no splitting occurs. Multivariate analysis has proved that bond stress corresponding to the onset of slippage behaves independently from the rest of the bond stress-slip curve. The effect of fibers and concrete compressive strength on bond stress values corresponding to the onset of slippage is mainly attributable to their influence on the material mechanical properties. On the contrary, the effect of fibers and concrete cover on the rest of the bond stress-slip curve is due to their structural role. / García Taengua, EJ. (2013). Bond of Reinforcing Bars to Steel Fiber Reinforced Concrete (SFRC) [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/32952 / TESIS Bond Reinforcement Steel fibers SFRC INGENIERIA DE LA CONSTRUCCION
460	A Shaping Procedure for Introducing Horses to Clipping Hardaway, Alison K 12 1900 (has links) The purpose of the current study is to evaluate a procedure that can be used to introduce horses to clipping. Negative reinforcement was used in a shaping paradigm. Shaping steps were conducted by the handler, starting with touching the horse with the hand, then touching the horse with the clippers while they are off, culminating with touching the horse with the clippers while they are on. When a horse broke contact with either the hand or the clippers, the hand or the clippers were held at that point until the horse emitted an appropriate response. When the horse emitted an appropriate response, the clippers were removed, and the handler stepped away from the horse. For all eight horses, this shaping plan was effective in enabling the clipping of each horse with minimal inappropriate behavior and without additional restraint. The entire process took under an hour for each horse. Negative Reinforcement Shaping Horses Clipping Psychology, Behavioral
