81
Model-based Bayesian reinforcement learning in complex domains / Ross, Stéphane, January 1900 (has links)
Thesis (M.Sc.). / Written for the School of Computer Science. Title from title page of PDF (viewed 2008/12/09). Includes bibliographical references.
82
Extending dynamic scripting / Ludwig, Jeremy R. January 2008 (has links)
Thesis (Ph. D.)--University of Oregon, 2008. / Typescript. Includes vita and abstract. Includes bibliographical references (leaves 163-167). Also available online in Scholars' Bank; and in ProQuest, free to University of Oregon users.
83
Condensing observation of locale and agents : a state representation / Valenti, Brett A. January 1900 (has links)
Thesis (M.S.)--Oregon State University, 2010. / Printout. Includes bibliographical references (leaves 67-72). Also available on the World Wide Web.
84
The assessment of preference for qualitatively different reinforcers in persons with developmental and learning disabilities : a comparison of value using behavioral economic and standard preference assessment procedures / Bredthauer, Jennifer Lyn. Johnston, James M., January 2009 (has links)
Thesis (Ph. D.)--Auburn University. / Abstract. Includes bibliographical references (p. 148-163).
85
Limitations and extensions of the WoLF-PHC algorithm / Cook, Philip R., January 2007 (has links) (PDF)
Thesis (M.S.)--Brigham Young University. Dept. of Computer Science, 2007. / Includes bibliographical references (p. 93-101).
86
Task localization, similarity, and transfer : towards a reinforcement learning task library system / Carroll, James Lamond, January 2005 (has links) (PDF)
Thesis (M.S.)--Brigham Young University. Dept. of Computer Science, 2005. / Includes bibliographical references (p. 113-117).
87
Zpětnovazební učení v multiagentním makroekonomickém modelu / Reinforcement learning in Agent-based macroeconomic model / Vlk, Bořivoj, January 2018 (has links)
Drawing on concepts from game theory, learning automata, and reinforcement learning, the thesis presents a computational model (simulation) based on general equilibrium theory and the classical monetary model. The model consists of interacting Constructively Rational agents; Constructive Rationality has been introduced in the recent literature as a machine-learning-based concept that allows relaxing assumptions about modeled economic agents' information and expectations. The model experiences periodic endogenous crises (a fall in both production and consumption accompanied by a rise in the unemployment rate). Crises are caused by firms and households adapting to changes in price and wage levels; these price and wage adjustments are necessary for the goods and labor markets to clear in the presence of technological growth. Finally, the model has a solid theoretical background and large potential for further development. The thesis also examines general properties of games between learning entities, with special focus on sudden changes (shocks) in the game and in the players' behavior, during recovery from which rigidities can emerge. JEL Classification: D80, D83, C63, E32, C73. Keywords: Learning, Information and Knowledge, Agent-based, Reinforcement learning, Business cycle, Stochastic and Dynamic Games, Simulation, Modeling. Author's e-mail...
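The abstract does not spell out the learning rule its agents use; as a rough illustration of the kind of update a learning-automaton agent might apply, the sketch below implements a standard linear reward-inaction scheme in Python. The function name, learning rate, and the assumption of a binary success signal are illustrative choices, not details taken from the thesis.

```python
import numpy as np

def lri_update(action_probs, chosen_action, success, learning_rate=0.1):
    """Linear reward-inaction update: shift probability mass toward the
    chosen action after a success; leave probabilities unchanged otherwise."""
    probs = np.asarray(action_probs, dtype=float)
    if success:
        probs = (1.0 - learning_rate) * probs   # shrink every probability...
        probs[chosen_action] += learning_rate   # ...and give the difference to the chosen action
    return probs

# Example: action 0 succeeded, so its probability rises from 0.5 to 0.55.
print(lri_update([0.5, 0.3, 0.2], chosen_action=0, success=True))
```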
88
Motor expectancy: the modulation of the reward positivity in a reinforcement learning motor task / Trska, Robert, 30 August 2018 (has links)
An adage posits that we learn from our mistakes; however, this is not entirely true. According to reinforcement learning theory, we learn when the expectation of our actions differs from outcomes. Here, we examined whether expectancy-driven learning plays a role in motor learning. Given the extensive overlap of anatomy and circuitry in the brain between reward and motor processes, it is appropriate to examine both motor control and expectancy processes within a single task. In the current study, participants performed a line drawing task via tablet under conditions of changing expectancies. Participants received feedback in a reinforcement-learning manner, positive (✓) or negative (x), based on their performance. Modulation of expected outcomes was reflected by changes in amplitude of a human event-related potential (ERP), the reward positivity. The reward positivity is thought to reflect phasic dopamine release from the mesolimbic dopaminergic system to the basal ganglia and cingulate cortex. Because reward and motor pathways share circuitry, another ERP, the Bereitschaftspotential (BP), was also examined. The BP is implicated in motor planning and execution; however, the late aspect of the BP shares similarity with the contingent negative variation (CNV). The current evidence demonstrates a relationship between expectancy and reward positivity amplitude in a motor learning context, as well as modulation of the BP under difficult task conditions. The behavioural data support prior literature and may suggest that sensorimotor prediction errors work in concert with reward prediction errors. Further evidence supports a frontal-medial evaluation system for motor errors. Additionally, the results support prior evidence that motor plans are formed upon target observation and held in memory until execution, rather than being formed just before movement onset. / Graduate
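For readers unfamiliar with the prediction-error account the abstract invokes ("we learn when the expectation of our actions differs from outcomes"), the following minimal sketch shows the standard delta-rule update that formalizes that idea; the variable names and learning rate are illustrative and are not taken from the thesis.

```python
def update_expectation(expected_value, outcome, learning_rate=0.2):
    """Delta-rule update: the prediction error (outcome minus expectation)
    drives learning; when the error is zero, nothing changes."""
    prediction_error = outcome - expected_value   # positive = better than expected
    return expected_value + learning_rate * prediction_error

# Feedback better than expected nudges the expectation upward: 0.5 -> 0.6.
print(update_expectation(expected_value=0.5, outcome=1.0))
```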
89
Extending dynamic scripting / Ludwig, Jeremy R. 12 1900 (has links)
xvi, 167 p. A print copy of this thesis is available through the UO Libraries. Search the library catalog for the location and call number. / The dynamic scripting reinforcement learning algorithm can be extended to improve the speed, effectiveness, and accessibility of learning in modern computer games without sacrificing computational efficiency. This dissertation describes three specific enhancements to the dynamic scripting algorithm that improve learning behavior and flexibility while imposing a minimal computational cost: (1) a flexible, stand-alone version of dynamic scripting that allows for hierarchical dynamic scripting, (2) a method of using automatic state abstraction to increase the context sensitivity of the algorithm, and (3) an integration of this algorithm with an existing hierarchical behavior modeling architecture. The extended dynamic scripting algorithm is then examined in three different contexts. The first set of results reflects a preliminary investigation based on two abstract real-time strategy games. The second set of results comes from a number of abstract tactical decision games designed to demonstrate the strengths and weaknesses of extended dynamic scripting. The third set of results is generated by a series of experiments in the commercial computer role-playing game Neverwinter Nights, demonstrating the capabilities of the algorithm in an actual game. To conclude, a number of future research directions for investigating the effectiveness of extended dynamic scripting are described. / Adviser: Arthur Farley
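The abstract assumes familiarity with the base dynamic scripting mechanism: a rulebase of weighted rules from which a script is drawn before each encounter, with weights adjusted afterward according to how well the script performed. The sketch below is a simplified Python illustration of that loop, not code from the dissertation; all names and constants are invented, and it omits the published algorithm's redistribution step that keeps the total weight constant.

```python
import random

def select_script(rulebase, script_size=2):
    """Draw rules for this encounter, favoring higher-weighted rules."""
    rules = list(rulebase)
    weights = [rulebase[rule] for rule in rules]
    return random.choices(rules, weights=weights, k=script_size)

def update_weights(rulebase, script, fitness, max_adjustment=100.0,
                   min_weight=10.0, max_weight=2000.0):
    """After the encounter, reward or punish the rules that were in the
    script; fitness is assumed to lie in [-1, 1]."""
    adjustment = fitness * max_adjustment
    for rule in set(script):
        rulebase[rule] = min(max_weight,
                             max(min_weight, rulebase[rule] + adjustment))
    return rulebase

# One encounter: a script is drawn, and the encounter is (say) half-won.
rulebase = {"attack_nearest": 100.0, "defend_base": 100.0, "scout": 100.0}
script = select_script(rulebase)
update_weights(rulebase, script, fitness=0.5)
```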
90
A reinforcement-learning approach to understanding loss-chasing behavior in rats / Marshall, Andrew Thomas, January 1900 (has links)
Doctor of Philosophy / Psychological Sciences / Kimberly Kirkpatrick / Risky decisions are inherently characterized by the potential for gains and losses, and gains and losses have distinct effects both on overall risky choice behavior and on the likelihood of making a risky choice given the outcome of the previous choice. One translationally relevant phenomenon of risky choice is loss-chasing, in which individuals make risky choices following losses. However, the mechanisms of loss-chasing are poorly understood. The goal of two experiments was to illuminate the mechanisms governing individual differences in loss-chasing and risky choice behaviors. In both experiments, rats chose between a certain outcome that always delivered reward and a risky outcome that probabilistically delivered reward. In Experiment 1, loss processing and loss-chasing behavior were assessed in the context of losses-disguised-as-wins (LDWs), loss outcomes presented along with gain-related stimuli. Rats presented with LDWs were riskier and less sensitive to differential losses. In Experiment 2, these behaviors were assessed relative to the number of risky losses that could be experienced. Here, adding reward omission or a small non-zero loss to the possible risky outcomes elicited substantial individual differences in risky choice, with some rats increasing, decreasing, or maintaining their previous risky choice preferences. Several reinforcement learning (RL) models were fit to individual rats' data to elucidate the psychological mechanisms that best accounted for individual differences in risky choice and loss-chasing behaviors. The RL analyses indicated that the critical predictors of risky choice and loss-chasing behavior were the different rates at which individuals updated value estimates from newly experienced gains and losses. Thus, learning deficits may predict individual differences in maladaptive risky decision making. Accordingly, targeted interventions to alleviate learning deficits may ultimately increase the likelihood of making more optimal and informed choices.
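The abstract identifies separate update rates for gains and losses as the critical mechanism; the sketch below illustrates that model class with a dual-learning-rate value update and softmax choice between the certain and risky options. The parameter names and values are placeholders for illustration, not the fitted estimates reported in the dissertation.

```python
import math
import random

def choose(values, inverse_temperature=2.0):
    """Softmax choice between option 0 (certain) and option 1 (risky)."""
    weights = [math.exp(inverse_temperature * v) for v in values]
    return random.choices([0, 1], weights=weights, k=1)[0]

def update(values, choice, reward, alpha_gain=0.3, alpha_loss=0.1):
    """Update the chosen option's value estimate, learning at different
    rates from better- versus worse-than-expected outcomes."""
    prediction_error = reward - values[choice]
    alpha = alpha_gain if prediction_error >= 0 else alpha_loss
    values[choice] += alpha * prediction_error
    return values

# One trial: the rat picks an option and happens to receive nothing.
values = [0.5, 0.5]
choice = choose(values)
update(values, choice, reward=0.0)
```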