Beyond Disagreement-based Learning for Contextual Bandits

Instance-dependent contextual bandits have been studied before, but their analysis has been
limited to pure disagreement-based learning, which treats disagreement as binary and
absolute rather than as a matter of degree.
In our work, we broaden the analysis of instance-dependent contextual bandits by
studying them under the framework of disagreement-based learning in sub-regions. This
framework allows a finer examination of disagreement by considering its
varying degrees across different sub-regions.
To lay the foundation for our analysis, we introduce key ideas and measures widely
studied in the contextual bandit and disagreement-based active learning literature. We
then propose a novel, instance-dependent contextual bandit algorithm for the realizable
case in a transductive setting. Because contexts can be observed in advance, our
algorithm uses a linear programming subroutine to identify and exploit
sub-regions. Next, we present a series of results relating the previously introduced
complexity measures and discuss their implications. Finally, we improve the
existing regret bounds for contextual bandits by integrating the sub-region disagreement
coefficient, yielding a significant improvement over the pure
disagreement-based approach.
In the concluding section of this thesis, we briefly recap the work and suggest
potential future directions for improving contextual bandit algorithms within the
framework of disagreement-based learning in sub-regions. These directions offer
opportunities for further research aimed at making contextual bandit algorithms more
effective in practical applications.

10.25394/pgs.23739957.v1

Planning and decision making

Statistical theory

Contextual bandits

Disagreement based learning

Transductive learning

Identifier	oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/23739957
Date	26 July 2023
Creators	Pinaki Ranjan Mohanty (16522407)
Source Sets	Purdue University
Detected Language	English
Type	Text, Thesis
Rights	CC BY 4.0
Relation	https://figshare.com/articles/thesis/Beyond_Disagreement-based_Learning_for_Contextual_Bandits/23739957
