Open access
Date
2020-04
Type
- Journal Article
Abstract
We consider a setting where multiple players sequentially choose among a common set of actions (arms). Motivated by an application to cognitive radio networks, we assume that players incur a loss upon colliding, and that communication between players is not possible. Existing approaches assume that the system is stationary. Yet this assumption is often violated in practice, e.g., due to signal strength fluctuations. In this work, we design the first multi-player bandit algorithm that provably works in arbitrarily changing environments, where the losses of the arms may even be chosen by an adversary. This resolves an open problem posed by Rosenski et al. (2016).
Permanent link
https://doi.org/10.3929/ethz-b-000414972
Publication status
published
Journal / series
Journal of Machine Learning Research
Publisher
MIT Press
Subject
Multi-Armed Bandits; Multi-Player Problems; Online Learning; Sequential Decision Making; Cognitive Radio Networks
Organisational unit
03908 - Krause, Andreas / Krause, Andreas
Funding
815943 - Reliable Data-Driven Decision Making in Cyber-Physical Systems (EC)