Global ETD Search

1	Beyond Disagreement-based Learning for Contextual Bandits
Pinaki Ranjan Mohanty (16522407)
26 July 2023

While instance-dependent contextual bandits have been previously studied, their analysis has been exclusively limited to pure disagreement-based learning. This approach lacks a nuanced understanding of disagreement and treats it in a binary and absolute manner. In our work, we aim to broaden the analysis of instance-dependent contextual bandits by studying them under the framework of disagreement-based learning in sub-regions. This framework allows for a more comprehensive examination of disagreement by considering its varying degrees across different sub-regions.

To lay the foundation for our analysis, we introduce key ideas and measures widely studied in the contextual bandit and disagreement-based active learning literature. We then propose a novel, instance-dependent contextual bandit algorithm for the realizable case in a transductive setting. Leveraging the ability to observe contexts in advance, our algorithm employs a sophisticated Linear Programming subroutine to identify and exploit sub-regions effectively. Next, we provide a series of results tying previously introduced complexity measures and offer some insightful discussion on them. Finally, we enhance the existing regret bounds for contextual bandits by integrating the sub-region disagreement coefficient, thereby showcasing significant improvement in performance against the pure disagreement-based approach.

In the concluding section of this thesis, we do a brief recap of the work done and suggest potential future directions for further improving contextual bandit algorithms within the framework of disagreement-based learning in sub-regions. These directions offer opportunities for further research and development, aiming to refine and enhance the effectiveness of contextual bandit algorithms in practical applications.

Planning and decision making; Statistical theory; Contextual bandits; Disagreement based learning; Active Learning; Interactive Learning; Data Driven ML; Linear Programming; Transductive learning
