
Beyond Disagreement-based Learning for Contextual Bandits

<p>While instance-dependent contextual bandits have been studied previously, their analysis<br>
has been limited exclusively to pure disagreement-based learning. This approach treats<br>
disagreement as binary and absolute, with no notion of degree.<br>
In our work, we aim to broaden the analysis of instance-dependent contextual bandits by<br>
studying them under the framework of disagreement-based learning in sub-regions. This<br>
framework allows for a more comprehensive examination of disagreement by considering its<br>
varying degrees across different sub-regions.<br>
To lay the foundation for our analysis, we introduce key ideas and measures widely<br>
studied in the contextual bandit and disagreement-based active learning literature. We<br>
then propose a novel, instance-dependent contextual bandit algorithm for the realizable<br>
case in a transductive setting. Leveraging the ability to observe contexts in advance, our<br>
algorithm employs a linear programming subroutine to identify and exploit<br>
sub-regions. Next, we provide a series of results relating the previously introduced<br>
complexity measures to one another and discuss them. Finally, we improve the<br>
existing regret bounds for contextual bandits by incorporating the sub-region disagreement<br>
coefficient, showing a significant improvement over the pure<br>
disagreement-based approach.<br>
In the concluding section of this thesis, we briefly recap the work and suggest<br>
potential future directions for improving contextual bandit algorithms within the<br>
framework of disagreement-based learning in sub-regions. These directions offer<br>
opportunities for further research aimed at making contextual bandit algorithms<br>
more effective in practical applications.<br>
</p>

  1. 10.25394/pgs.23739957.v1
Identifier: oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/23739957
Date: 26 July 2023
Creators: Pinaki Ranjan Mohanty (16522407)
Source Sets: Purdue University
Detected Language: English
Type: Text, Thesis
Rights: CC BY 4.0
Relation: https://figshare.com/articles/thesis/Beyond_Disagreement-based_Learning_for_Contextual_Bandits/23739957
