Stochastic Linear Bandits Robust to Adversarial Attacks
METADATA ONLY
Loading...
Author / Producer
Date
2021
Publication Type
Conference Paper
ETH Bibliography
yes
Citations
Altmetric
METADATA ONLY
Data
Rights / License
Abstract
We consider a stochastic linear bandit problem in which the rewards are not only subject to random noise, but also adversarial attacks subject to a suitable budget C (i.e., an upper bound on the sum of corruption magnitudes across the time horizon). We provide two variants of a Robust Phased Elimination algorithm, one that knows C and one that does not. Both variants are shown to attain near-optimal regret in the non-corrupted case C = 0, while incurring additional additive terms respectively having a linear and quadratic dependency on C in general. We present algorithm-independent lower bounds showing that these additive terms are nearoptimal. In addition, in a contextual setting, we revisit a setup of diverse contexts, and show that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing C.
Permanent link
Publication status
published
Book title
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)
Journal / series
Volume
130
Pages / Article No.
991 - 999
Publisher
PMLR
Event
24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)
Edition / version
Methods
Software
Geographic location
Date collected
Date created
Subject
Organisational unit
03908 - Krause, Andreas / Krause, Andreas
Notes
Funding
815943 - Reliable Data-Driven Decision Making in Cyber-Physical Systems (EC)
19-2 FEL-47 - Robust Sample-Efficient Learning when Data ist Costly (ETHZ)
19-2 FEL-47 - Robust Sample-Efficient Learning when Data ist Costly (ETHZ)