Stochastic Linear Bandits Robust to Adversarial Attacks


METADATA ONLY
Loading...

Date

2021

Publication Type

Conference Paper

ETH Bibliography

yes

Citations

Altmetric
METADATA ONLY

Data

Rights / License

Abstract

We consider a stochastic linear bandit problem in which the rewards are not only subject to random noise, but also adversarial attacks subject to a suitable budget C (i.e., an upper bound on the sum of corruption magnitudes across the time horizon). We provide two variants of a Robust Phased Elimination algorithm, one that knows C and one that does not. Both variants are shown to attain near-optimal regret in the non-corrupted case C = 0, while incurring additional additive terms respectively having a linear and quadratic dependency on C in general. We present algorithm-independent lower bounds showing that these additive terms are nearoptimal. In addition, in a contextual setting, we revisit a setup of diverse contexts, and show that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing C.

Publication status

published

Book title

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)

Volume

130

Pages / Article No.

991 - 999

Publisher

PMLR

Event

24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021)

Edition / version

Methods

Software

Geographic location

Date collected

Date created

Subject

Organisational unit

03908 - Krause, Andreas / Krause, Andreas check_circle

Notes

Funding

815943 - Reliable Data-Driven Decision Making in Cyber-Physical Systems (EC)
19-2 FEL-47 - Robust Sample-Efficient Learning when Data ist Costly (ETHZ)

Related publications and datasets