541 |
The Effects of Interspersed Trials and Density of Reinforcement on Accuracy, Looking Away, and Self-Injurious Behavior of a Child with Autism. Ybarra, Rita. 05 1900.
This research examines the effects of task interspersal and density of reinforcement on several behaviors of an autistic 6-year-old boy during the performance of a visual matching task and two auditory matching tasks. Experiment 1 investigated the effects of interspersing high and low accuracy tasks on correct matching responses, positions of matching responses, looking away, and self-injurious behavior (SIB). The effects of interspersed trials were evaluated using an ABAB multiple treatments design. Results indicated that interspersed trials produced slightly more correct responses during the visual matching task; however, correct responses decreased during the other two tasks. The use of interspersed trials also decreased looking away from the stimuli and SIB. Experiment 2 evaluated the effects of reinforcement density apart from task interspersal. Two conditions, reinforce-corrects-only and reinforce-all-responses, were compared in Experiment 2. Correct responses increased slightly for all three tasks during the reinforce-all-responses condition. Looking away and SIB were very infrequent throughout Experiment 2.
|
542 |
Contingência e contigüidade no responder de ratos submetidos a esquemas de razão e intervalo variáveis / Contingencies and contiguity imposition on response by exposing rats to variable interval and variable ratio schedule. Fonseca, Cristina Moreira. 12 May 2006.
This study comprises two experiments (Experiment 1 and Experiment 2) that used an unsignalled, non-resetting delay-of-reinforcement procedure [a tandem schedule in which the second component is a fixed-time (FT) requirement]. The general aim was to manipulate contingency and contiguity relations experimentally using different reinforcement schedules (response-dependent, response-dependent with delayed reinforcement, and response-independent). Specifically, the experiments examined the effects of introducing a reinforcement delay on the rate and temporal distribution of lever-press responses of rats exposed to variable-interval (VI) and variable-ratio (VR) schedules. Experiment 1 compared the effects of introducing a 5 s delay with response-independent water delivery (VT). At baseline (contingency and contiguity both present), the VR schedule generated higher response rates than the VI schedule. Introducing the delay (contingency present, contiguity reduced) lowered response rates in both groups relative to baseline, with the larger decrease in the VI group. Under VT (no contingency, but contiguity preserved), the decrease was even more pronounced. The differential effects of delay under VI and VR were examined through the temporal distribution of responses, so that the delays that actually occurred (the interval between reinforcer delivery and the last previously emitted response) could be identified. For VI subjects, the obtained delays generally clustered near the programmed 5 s value, whereas for VR subjects the obtained intervals were shorter. This difference follows from the characteristics of the schedules themselves: obtained delays are more likely to concentrate at the programmed value under VI than under VR. Experiment 2 examined the effects of introducing delays of 2, 5, and 8 s on response rate and distribution, exposing naive rats to VI and VR schedules with and without reinforcement delay. Response rates decreased in both groups under all delay values (contingency present, contiguity parametrically manipulated). The distributions of obtained delays showed that, in both groups, obtained delays took the shortest intervals under the 2 s delay, clustered near the programmed value under the 5 s delay, and spread across different intervals under the 8 s delay; two subjects differed, with obtained delays concentrated at the programmed values. Taken together, the results of both experiments show that, despite similar response rates, the delay affects the temporal distribution of responding under VI and VR in ways that cannot be seen when the analysis is limited to response rate, the measure most often used in investigations of contingency and contiguity.
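The tandem schedule mechanics described above lend themselves to a small simulation. The sketch below is illustrative only, not the study's apparatus: responding is modeled as a Bernoulli process per 0.1 s tick, a VI-like timer arms the reinforcer, the first response after arming starts a non-resetting fixed-time delay, and the obtained delay is measured as in the study, from the last response before reinforcer delivery to the delivery itself.

```python
import random

def simulate(resp_prob, delay_s=5.0, dt=0.1, session_s=3600.0, seed=0):
    """Simulate a responder on a tandem (response requirement -> FT) schedule.

    A VI-like timer arms the reinforcer; the first response after arming
    starts a fixed-time delay that further responses do NOT reset
    (unsignalled, non-resetting delay). Returns the obtained delays:
    time from the last response before delivery to the delivery itself.
    """
    rng = random.Random(seed)
    t = 0.0
    armed_at = rng.expovariate(1 / 30.0)   # VI timer, mean 30 s
    delay_end = None
    last_resp = None
    obtained = []
    while t < session_s:
        if rng.random() < resp_prob:       # lever press this tick?
            last_resp = t
            if delay_end is None and t >= armed_at:
                delay_end = t + delay_s    # requirement met: start FT delay
        if delay_end is not None and t >= delay_end:
            obtained.append(t - last_resp) # delay actually experienced
            delay_end = None
            armed_at = t + rng.expovariate(1 / 30.0)  # re-arm the VI timer
        t += dt
    return obtained
```

Run with a high response probability (a VR-like, fast responder) and a low one (a VI-like, slow responder): the fast responder keeps responding during the delay, so its obtained delays are much shorter than the programmed value, reproducing the pattern reported above.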
|
543 |
Ligações e armaduras de lajes em vigas mistas de aço e de concreto. / Connections and slab reinforcement of concrete-steel composite beams. Fuzihara, Marisa Aparecida Leonel da Silva. 24 November 2006.
The use of steel-concrete composite beams is growing in Brazil and worldwide. Their main advantage is exploiting the best properties of each material: steel responds well to both tension and compression, while concrete resists compression. Composite beams consist essentially of the steel section, the concrete slab, the shear connectors, and the reinforcement. At the interface between these materials, phenomena such as the degree of interaction, shear at the contact surface, and uplift (vertical separation) deserve attention. The procedures normally used in the design of conventional reinforced concrete and steel structures answer many similar questions in composite structures, but in general they do not address the most relevant issue: the connection between steel and concrete. In the vicinity of the shear connectors, the slab of a composite steel-concrete beam is subjected to a combination of longitudinal shear and transverse bending, so the interface is the region that requires careful analysis. These aspects are the main subject of this research. In addition, the design procedures adopted by the Brazilian (NBR 8800-86), American (AISC 2005), and European (EUROCODE 4) standards are discussed, in particular those concerning the connection between the materials by means of connectors on steel sections under concrete slabs, crack control in sections under hogging (negative) moments, and transverse reinforcement.
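As an illustration of the connector design question discussed above, the headed-stud resistance check of EUROCODE 4 (EN 1994-1-1, clause 6.6.3.1) can be sketched as follows. The function name and the example values are ours, and this is a teaching sketch, not a design tool; the clause and partial factors should be verified before any real use.

```python
import math

def stud_resistance_ec4(d, h_sc, f_u, f_ck, E_cm, gamma_v=1.25):
    """Design shear resistance of a headed stud per EN 1994-1-1, 6.6.3.1 (sketch).

    d: shank diameter (mm); h_sc: overall stud height (mm);
    f_u: stud ultimate strength (MPa); f_ck: concrete cylinder strength (MPa);
    E_cm: concrete secant modulus (MPa). Returns P_Rd in newtons.
    """
    ratio = h_sc / d
    alpha = 1.0 if ratio > 4 else 0.2 * (ratio + 1)           # stud height factor
    steel = 0.8 * f_u * math.pi * d**2 / 4 / gamma_v          # shank shear failure
    concrete = 0.29 * alpha * d**2 * math.sqrt(f_ck * E_cm) / gamma_v  # crushing
    return min(steel, concrete)                               # governing mode
```

For a typical 19 mm stud (h = 100 mm, f_u = 450 MPa) in C25/30 concrete (E_cm about 31 GPa), the concrete term governs and P_Rd comes out near 74 kN, which is why, as the text notes, the slab side of the connection deserves the careful analysis.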
|
544 |
The Effects of Behavioral Charting, Token Reinforcement, and Social Reinforcement on the Production Rates of Sheltered Workshop Clients. Moore, Eugenia M. 12 1900.
This investigation concerned the effects of behavioral charting, token reinforcement, social reinforcement, and combinations of behavioral charting with token or social reinforcement, upon the production rates of sheltered workshop clients. The differential effects of these reinforcement methods were investigated by arranging for the application of each reinforcement mode in a sheltered workshop setting and comparing the mean production rates achieved by two groups of sheltered workshop clients under each reinforcement condition. The findings derived from this sample led to the conclusion that positive reinforcement, and specifically social reinforcement used both alone and in combination with behavioral charting, can be a very effective mode of reinforcement for sheltered workshop clients. It was suggested that more attention might be devoted in rehabilitation facilities to using the simpler and more readily available forms of reinforcement which behavioral charting and social reinforcement represent.
|
546 |
An Evaluation of Reinforcement Effects of Preferred Items During Discrete-Trial Instruction. Rorer, Lynette. 05 1900.
This study compared the relative reinforcing efficacy of high-preferred and low-preferred stimuli, as determined by two types of preference assessments, on acquisition rates in three children diagnosed with an Autism Spectrum Disorder (ASD). The study also evaluated the indirect effects of preference on students' stereotypy and problem behavior during instructional periods. Participants were presented with a task and provided high- or low-preferred stimuli contingent upon correct responding. Results showed that acquisition occurred more rapidly in the high-preferred condition for some participants. Higher rates of problem behavior occurred in the low-preferred condition for all participants. These results highlight the importance of using preference assessment procedures to identify and deliver high-preferred items in skill acquisition procedures for individuals with ASD.
|
547 |
Using Contingency Maps to Teach Requests for Information. Andrade-Plaza, Roberto. 28 June 2018.
Autism spectrum disorder is a developmental disorder characterized by social, behavioral, and communicative deficits. Although there is no known cure for autism, many research-based interventions help strengthen these areas, especially deficits associated with failures of stimulus control. One way to address such failures is to provide additional stimuli that enhance or override information provided by naturally occurring stimuli. Contingency maps (CMs) are one such example. This study used an observing response (i.e., hand-raising) to allow the subjects to request contingency maps. The purpose of this study was to determine whether contingency maps function as reinforcers and whether requests for information can be acquired using an observing-response paradigm. The major findings indicate that requests for information can be acquired and maintained by access to CMs.
|
548 |
Bayesian methods for knowledge transfer and policy search in reinforcement learning. Wilson, Aaron (Aaron Creighton). 28 July 2012.
How can an agent generalize its knowledge to new circumstances? To learn effectively, an agent acting in a sequential decision problem must make intelligent action-selection choices based on its available knowledge. This dissertation focuses on Bayesian methods of representing learned knowledge and develops novel algorithms that exploit the represented knowledge when selecting actions.
Our first contribution introduces the multi-task Reinforcement Learning setting, in which an agent solves a sequence of tasks. An agent equipped with knowledge of the relationship between tasks can transfer knowledge between them. We propose the transfer of two distinct types of knowledge: knowledge of domain models and knowledge of policies. To represent the transferable knowledge, we propose hierarchical Bayesian priors on domain models and policies respectively. To transfer domain model knowledge, we introduce a new algorithm for model-based Bayesian Reinforcement Learning in the multi-task setting which exploits the learned hierarchical Bayesian model to improve exploration in related tasks. To transfer policy knowledge, we introduce a new policy search algorithm that accepts a policy prior as input and uses the prior to bias policy search. A specific implementation of this algorithm is developed that accepts a hierarchical policy prior. The algorithm learns the hierarchical structure and reuses components of the structure in related tasks.
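The hierarchical-prior idea above can be illustrated with a deliberately simplified sketch, not the thesis's model-based algorithm: tasks share a common normal prior whose hyperparameters are estimated from previously solved tasks (an empirical-Bayes stand-in for a full hierarchical model), so an estimate for a new, data-poor task is shrunk toward what the earlier tasks suggest. All names and numbers here are illustrative.

```python
import statistics

def posterior_mean(prior_mean, prior_var, obs, obs_var):
    """Conjugate normal update: posterior mean of a scalar task parameter."""
    n = len(obs)
    if n == 0:
        return prior_mean
    precision = 1 / prior_var + n / obs_var
    return (prior_mean / prior_var + sum(obs) / obs_var) / precision

def transfer_estimate(source_task_data, target_obs, obs_var=1.0):
    """Empirical-Bayes transfer: fit a shared prior from solved source tasks,
    then use it as the prior when estimating a new, data-poor target task."""
    task_means = [statistics.fmean(d) for d in source_task_data]
    mu0 = statistics.fmean(task_means)                  # shared prior mean
    tau2 = max(statistics.pvariance(task_means), 1e-6)  # shared prior variance
    return posterior_mean(mu0, tau2, target_obs, obs_var)
```

With three source tasks whose parameters cluster near 10, a single noisy observation of 0 on a new task yields an estimate pulled strongly back toward 10: the transferred prior dominates until the new task supplies enough evidence of its own.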
Our second contribution addresses the basic problem of generalizing knowledge gained from previously executed policies. Bayesian Optimization is a method of exploiting a prior model of an objective function to quickly identify the point maximizing the modeled objective. Successful use of Bayesian Optimization in Reinforcement Learning requires a model relating policies and their performance. Given such a model, Bayesian Optimization can be applied to search for an optimal policy. Early work using Bayesian Optimization in the Reinforcement Learning setting ignored the sequential nature of the underlying decision problem. The work presented in this thesis explicitly addresses this problem. We construct new Bayesian models that take advantage of sequence information to better generalize knowledge across policies. We empirically evaluate the value of this approach in a variety of Reinforcement Learning benchmark problems. Experiments show that our method significantly reduces the amount of exploration required to identify the optimal policy.
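The surrogate-plus-acquisition loop at the heart of this approach can be sketched with a toy stand-in, not the thesis's sequence-aware models: a Gaussian-kernel smoother plays the role of the surrogate's posterior mean over a one-dimensional policy parameter, and an inverse-evidence bonus plays the role of its posterior uncertainty (a real implementation would use a Gaussian process).

```python
import math

def bo_policy_search(evaluate, iters=30, kappa=1.0):
    """Toy surrogate-based policy search on a scalar parameter in [0, 1].

    In the spirit of Bayesian Optimization: pick the point maximizing an
    acquisition value = (smoothed estimate of return) + (exploration bonus
    that is large where few policies have been evaluated), evaluate it,
    and repeat.
    """
    grid = [i / 100 for i in range(101)]
    X = [0.0, 1.0]                        # initial policy evaluations
    y = [evaluate(x) for x in X]
    for _ in range(iters):
        def acq(t):
            w = [math.exp(-((t - x) / 0.1) ** 2) for x in X]
            mean = sum(wi * yi for wi, yi in zip(w, y)) / (sum(w) + 1e-9)
            return mean + kappa / (1 + sum(w))   # exploit + explore
        t_next = max(grid, key=acq)
        X.append(t_next)
        y.append(evaluate(t_next))
    return max(zip(y, X))[1]              # best policy parameter observed
```

With a return function peaked at 0.7, the loop spends its budget covering the interval and then refining near the peak, mirroring the reduced-exploration claim above at toy scale.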
Our final contribution is a new framework for learning parametric policies from queries presented to an expert. In many domains it is difficult to provide expert demonstrations of desired policies. However, it may still be a simple matter for an expert to identify good and bad performance. To take advantage of this limited expert knowledge, our agent presents experts with pairs of demonstrations and asks which of the demonstrations best represents a latent target behavior. The goal is to use a small number of queries to elicit the latent behavior from the expert. We formulate a Bayesian model of the querying process, an inference procedure that estimates the posterior distribution over the latent policy space, and an active procedure for selecting new queries for presentation to the expert. We show, in multiple domains, that the algorithm successfully learns the target policy and that the active learning strategy generally improves the speed of learning. / Graduation date: 2013
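The querying framework can be illustrated with a minimal Bayesian preference-learning sketch: a grid stands in for a one-dimensional policy space, a Bradley-Terry likelihood models the expert's pairwise answers, and the posterior is updated after every query. For brevity the sketch draws query pairs at random rather than actively, so it omits the active-selection component described above; all names and parameters are illustrative.

```python
import math
import random

def learn_from_preferences(expert, n_queries=60, beta=8.0, seed=1):
    """Infer a latent 1-D target behavior from pairwise preference queries.

    expert(a, b) returns True if demonstration a is preferred over b.
    A Bradley-Terry likelihood scores each hypothesized target t: the
    utility of a demo is its closeness to t, and the expert is assumed
    to prefer higher-utility demos with softmax sharpness beta.
    """
    rng = random.Random(seed)
    grid = [i / 50 for i in range(51)]
    log_post = {t: 0.0 for t in grid}            # uniform prior
    for _ in range(n_queries):
        a, b = rng.sample(grid, 2)               # a pair of demonstrations
        prefers_a = expert(a, b)
        for t in grid:
            # P(prefer a | target t) under Bradley-Terry
            p_a = 1 / (1 + math.exp(-beta * (abs(b - t) - abs(a - t))))
            log_post[t] += math.log(p_a if prefers_a else 1 - p_a)
    return max(grid, key=lambda t: log_post[t])  # MAP estimate
```

Simulating an expert whose latent target is 0.3 (always preferring the closer demo), the MAP estimate converges to the target after a few dozen queries, without the expert ever demonstrating the behavior directly.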
|
549 |
Embodied Evolution of Learning Ability. Elfwing, Stefan. January 2007.
Embodied evolution is a methodology for evolutionary robotics that mimics the distributed, asynchronous, and autonomous properties of biological evolution. Evaluation, selection, and reproduction are carried out through cooperation and competition among the robots, without any need for human intervention. An embodied evolution framework is therefore well suited to studying the adaptive learning mechanisms of artificial agents that share the same fundamental constraints as biological agents: self-preservation and self-reproduction. The main goal of the research in this thesis has been to develop a framework for performing embodied evolution with a limited number of robots, by utilizing time-sharing of subpopulations of virtual agents inside each robot. The framework integrates reproduction as a directed autonomous behavior, and allows for learning of basic behaviors for survival by reinforcement learning. The purpose of the evolution is to evolve the learning ability of the agents, by optimizing meta-properties in reinforcement learning, such as the selection of basic behaviors, meta-parameters that modulate the efficiency of the learning, and additional, richer reward signals, in the form of shaping rewards, that guide the learning. The realization of the embodied evolution framework has been a cumulative research process in three steps: 1) investigation of the learning of a cooperative mating behavior for directed autonomous reproduction; 2) development of an embodied evolution framework, in which the selection of pre-learned basic behaviors and the optimization of battery recharging are evolved; and 3) development of an embodied evolution framework that includes meta-learning of basic reinforcement learning behaviors for survival, and in which the individuals are evaluated by an implicit and biologically inspired fitness function that promotes reproductive ability.
The proposed embodied evolution methods have been validated in a simulation environment of the Cyber Rodent robot, a robotic platform developed for embodied evolution purposes. The evolutionarily obtained solutions have also been transferred to the real robotic platform. The evolutionary approach to meta-learning has also been applied to the automatic design of task hierarchies in hierarchical reinforcement learning, and to co-evolving meta-parameters and potential-based shaping rewards to accelerate reinforcement learning, both in finding initial solutions and in converging to robust policies.
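Potential-based shaping, one of the evolved meta-properties above, can be shown in a minimal tabular Q-learning example on a chain of states: the shaping term F(s, s') = gamma * phi(s') - phi(s) adds dense guidance toward the goal without changing the optimal policy (the classical Ng, Harada and Russell result). The environment and parameters are illustrative, not those of the Cyber Rodent experiments.

```python
import random

def q_learning_chain(n=20, episodes=30, shaped=False, alpha=0.5, gamma=0.95,
                     eps=0.1, seed=0):
    """Tabular Q-learning on a 1-D chain (start 0, goal n-1, reward 1 at goal).

    With shaped=True the agent also receives the potential-based term
    F(s, s') = gamma * phi(s') - phi(s), with phi(s) = s, which rewards
    progress toward the goal. Returns total steps over all episodes, so a
    smaller total means faster learning.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(n) for a in (-1, 1)}
    total = 0
    for _ in range(episodes):
        s, steps = 0, 0
        while s != n - 1 and steps < 10_000:
            if rng.random() < eps or q[(s, -1)] == q[(s, 1)]:
                a = rng.choice((-1, 1))          # explore / break ties
            else:
                a = -1 if q[(s, -1)] > q[(s, 1)] else 1
            s2 = min(max(s + a, 0), n - 1)
            r = 1.0 if s2 == n - 1 else 0.0
            if shaped:
                r += gamma * s2 - s              # F(s, s'), phi(s) = s
            nxt = 0.0 if s2 == n - 1 else max(q[(s2, -1)], q[(s2, 1)])
            q[(s, a)] += alpha * (r + gamma * nxt - q[(s, a)])
            s = s2
            steps += 1
        total += steps
    return total
```

Because the unshaped learner must stumble on the distant goal by random walk before any value propagates back, the shaped learner reaches the goal in far fewer total steps, which is the sense in which the thesis's evolved shaping rewards "accelerate reinforcement learning".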
|
550 |
Adaptation-based programming. Bauer, Tim (Timothy R.). 31 January 2013.
Partial programming is a field of study where users specify an outline or skeleton of a program but leave various parts undefined. The undefined parts are then completed by an external mechanism to form a complete program. Adaptation-Based Programming (ABP) is a method of partial programming that utilizes techniques from the field of reinforcement learning (RL), a subfield of machine learning, to find good completions of those partial programs.
An ABP user writes a partial program in some host programming language. At various points where the programmer is uncertain of the best course of action, they include choices that non-deterministically select amongst several options. Additionally, users indicate program success through a reward construct somewhere in their program. The resulting non-deterministic program is completed by treating it as an equivalent RL problem and solving that problem with techniques from the RL field. Over repeated executions, the RL algorithms within the ABP system will learn to select choices at the various points that maximize the reward received.
This thesis explores various aspects of ABP, such as the semantics of different implementations, including the design trade-offs encountered with each approach. The goal of all approaches is to present a model for programs that adapt to their environment based on the points of uncertainty within the program that the programmer has indicated.
The first approach presented in this work is an implementation of ABP as a domain-specific language embedded within a functional language. This language provides constructs for common patterns and situations that arise in adaptive programs. It proves to be compositional and to foster rapid experimentation with different adaptation methods (e.g. learning algorithms). A second approach presents an implementation of ABP as an object-oriented library that models adaptive programs as formal systems from the field of RL called Markov Decision Processes (MDPs). This approach abstracts many of the details of the learning algorithm away from the casual user and uses a fixed learning algorithm to control the program adaptation rather than allowing it to vary. This abstraction results in an easier-to-use library, but limits the scenarios in which ABP can effectively be used. Moreover, treating adaptive programs as MDPs leads to some unintuitive situations where seemingly reasonable programs fail to adapt efficiently. This work addresses this problem with algorithms that analyze the adaptive program's structure and data flow to boost the rate at which these problematic adaptive programs learn, thus increasing the number of problems that ABP can effectively be used to solve. / Graduation date: 2013
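The choice-and-reward model described above can be sketched in a few lines; the class below is a hypothetical stand-in for an ABP library, not the thesis's implementation. A single choice point is treated as a bandit: epsilon-greedy selection over the options, with the reward construct crediting every choice made during the current execution.

```python
import random

class Adaptive:
    """A minimal stand-in for ABP choice/reward constructs (illustrative only).

    Each named choice point keeps running value estimates for its options and
    selects epsilon-greedily; reward() feeds back program success, crediting
    every choice made during the current execution.
    """
    def __init__(self, eps=0.15, alpha=0.5, seed=0):
        self.rng = random.Random(seed)
        self.values = {}                  # (choice point, option) -> estimate
        self.eps, self.alpha = eps, alpha
        self.trace = []                   # choices made this execution

    def choose(self, name, options):
        keys = [(name, o) for o in options]
        for k in keys:
            self.values.setdefault(k, 0.0)
        if self.rng.random() < self.eps:
            k = self.rng.choice(keys)     # explore
        else:
            k = max(keys, key=lambda k_: self.values[k_])  # exploit
        self.trace.append(k)
        return k[1]

    def reward(self, r):
        for k in self.trace:              # credit all choices this run
            self.values[k] += self.alpha * (r - self.values[k])
        self.trace = []

# A partial program: which algorithm to use is left unspecified and learned.
def run(adaptive):
    algo = adaptive.choose("sort", ["bubble", "quick", "merge"])
    # Hypothetical success signal: in this toy environment "quick" pays best.
    adaptive.reward({"bubble": 0.1, "quick": 1.0, "merge": 0.8}[algo])
    return algo
```

Executed repeatedly, the adaptive program converges on the highest-reward completion, which is the "repeated executions" learning behavior the abstract describes.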
|