Thompson sampling-based online decision making in network routing

Online decision making is a class of machine learning problems in which decisions are made sequentially so as to accumulate as much reward as possible.
Typical examples include multi-armed bandit (MAB) problems where an agent needs to decide which arm to pull in each round, and network routing problems where each router needs to decide the next hop for each packet.
Thompson sampling (TS) is an efficient and effective algorithm for online decision making problems. Although TS was proposed long ago, theoretical guarantees for TS in the standard MAB were established only in recent years.
In this thesis, we first analyze the performance of TS, both in theory and in practice, in a special MAB setting called combinatorial MAB with sleeping arms and long-term fairness constraints (CSMAB-F). We then apply TS to a novel reactive network routing problem, called "opportunistic routing without link metrics known a priori", and use the proof techniques we developed for CSMAB-F to analyze its performance. / Graduate
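
For readers unfamiliar with TS, the following is a minimal sketch of Thompson sampling on the standard Bernoulli MAB mentioned in the abstract. It is not the thesis's CSMAB-F or routing algorithm. The function name `thompson_sampling`, the uniform Beta(1, 1) priors, and the example arm means are all assumptions made for illustration.

```python
import random


def thompson_sampling(true_means, rounds, seed=0):
    """Illustrative Beta-Bernoulli Thompson sampling (hypothetical helper).

    true_means: success probability of each arm, unknown to the agent.
    Returns (pulls per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    total_reward = 0

    for _ in range(rounds):
        # Draw one sample per arm from its Beta posterior,
        # assuming a uniform Beta(1, 1) prior on each arm's mean.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        # Pull the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Simulate a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        total_reward += reward

    pulls = [successes[i] + failures[i] for i in range(n_arms)]
    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

In this sketch, sampling from the posterior rather than taking its mean is what drives exploration: arms with few observations have wide posteriors and are still occasionally selected.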

http://hdl.handle.net/1828/12095

Online Decision Making

Multi-armed Bandits

Thompson Sampling

Network Routing

Identifier	oai:union.ndltd.org:uvic.ca/oai:dspace.library.uvic.ca:1828/12095
Date	02 September 2020
Creators	Huang, Zhiming
Contributors	Pan, Jianping
Source Sets	University of Victoria
Language	English
Detected Language	English
Type	Thesis
Format	application/pdf
Rights	Available to the World Wide Web
