Dynamic decision behavior: Competitive tests of decision policies in a class of two-armed bandit problems

The two-armed bandit (TAB) problem is an individual decision-making problem that is dynamic in nature. In a dynamic task, stage-to-stage changes in the state of the system are affected by the decision maker's (DM's) previous decisions as well as by the states of the system at the preceding stages. Unlike most dynamic tasks, which are very complex, the TAB is simple both in the way it is presented to the DM and in the range of decisions available on each trial (a simple binary choice). It is therefore an excellent choice for studying behavior in dynamic tasks, and it is the focus of this thesis. Most earlier research has focused on developing theoretical solutions to the problem and its variations. Very little effort has been directed at examining the performance of naive subjects. Hence this dissertation presents subjects with a series of realistic, and consequently progressively more difficult, tasks and tries to account for their behavior. We looked at both individual and aggregate behavior. Because the problems we tackled were so ill defined, we could not develop normative solutions. Hence we took a descriptive approach, sacrificing mathematical tractability for realism. Our research focused on the classic TAB problem and three variations of it: the one-armed bandit (OAB) problem, the TAB problem with increasing probabilities, and the TAB problem in which one arm has increasing probabilities and the other arm has decreasing probabilities. Instead of capturing only aggregate behavior, we paid particular attention to individual decision makers, comparing their decisions in the classic TAB case to the optimal policy and to two degradations of the optimal policy. In the other three experiments, subjects' decisions were compared to heuristics that we developed, and actual earnings were compared to potential earnings.

Results were mixed across the experiments. None of our subjects was consistent in the policies used. None of the policies used could account for more than 30% of the decisions. In the classic TAB problem and the OAB problem, our results contradict earlier studies. Our heuristics outperformed the subjects in some of the studies, and in a couple of the studies the subjects outperformed the heuristics.

Identifier: oai:union.ndltd.org:arizona.edu/oai:arizona.openrepository.com:10150/284295
Date: January 2000
Creators: Abraham, Elizabeth Verghese
Contributors: Rapoport, Amnon
Publisher: The University of Arizona.
Source Sets: University of Arizona
Language: en_US
Detected Language: English
Type: text, Dissertation-Reproduction (electronic)
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
