Information-Directed Sampling - Frequentist Analysis and Applications
dc.contributor.author
Kirschner, Johannes
dc.contributor.supervisor
Krause, Andreas
dc.contributor.supervisor
Van Roy, Benjamin
dc.contributor.supervisor
Szepesvári, Csaba
dc.date.accessioned
2021-07-14T05:57:36Z
dc.date.available
2021-07-13T14:32:04Z
dc.date.available
2021-07-14T05:57:36Z
dc.date.issued
2021
dc.identifier.uri
http://hdl.handle.net/20.500.11850/494381
dc.identifier.doi
10.3929/ethz-b-000494381
dc.description.abstract
Sequential decision-making is an iterative process between a learning agent and an environment. We study the stochastic setting, where the learner chooses an action in each round and the environment returns a noisy feedback signal. The learner's objective is to maximize a reward function that depends on the chosen actions. This basic model has many applications, including adaptive experimental design, product recommendations, dynamic pricing and black-box optimization.
The combination of statistical uncertainty and the objective to maximize reward creates a tension between exploration and exploitation: The learner has to carefully balance between actions that provide informative feedback and actions estimated to yield a high reward. The fields of bandit algorithms and partial monitoring study methods to resolve the exploration-exploitation trade-off optimally, using various regularity assumptions on the feedback-reward structure.
Two of the most widely used methods are optimistic algorithms and Thompson sampling, which have been successfully applied in numerous settings and come with strong theoretical guarantees. More recently, however, an increasing amount of evidence shows that optimism and Thompson sampling are not universal exploration principles. In structured models with correlated feedback, clearly suboptimal actions sometimes provide informative feedback that outweighs their cost. Meanwhile, optimistic approaches and Thompson sampling discard such actions early on, which leads to inefficient exploration.
An alternative and less studied design principle is information-directed sampling (IDS), originally proposed in the Bayesian setting. The main contribution in this thesis is a frequentist interpretation of the IDS framework, complemented with frequentist performance guarantees for several settings with linear reward and feedback structure. Using the IDS approach, we resolve the long-standing challenge to find an asymptotically instance-optimal algorithm for linear bandits that is simultaneously minimax optimal. We further extend the IDS approach to the more general linear partial monitoring setting, making the method applicable to a vast range of previously studied models for sequential decision-making. Along the way, we develop the theory of information-directed sampling, uncover a connection to primal-dual methods and propose computationally faster approximations.
Lastly, we discuss extensions of the IDS framework to contextual decision-making and the kernelized setting and highlight example applications.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
ETH Zurich
en_US
dc.rights.uri
http://rightsstatements.org/page/InC-NC/1.0/
dc.subject
Sequential Decision Making
en_US
dc.subject
Partial monitoring
en_US
dc.subject
Information-Directed Sampling
en_US
dc.title
Information-Directed Sampling - Frequentist Analysis and Applications
en_US
dc.type
Doctoral Thesis
dc.rights.license
In Copyright - Non-Commercial Use Permitted
dc.date.published
2021-07-14
ethz.size
220 p.
en_US
ethz.code.ddc
DDC - DDC::0 - Computer science, information & general works::004 - Data processing, computer science
en_US
ethz.identifier.diss
27627
en_US
ethz.publication.place
Zurich
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::03908 - Krause, Andreas / Krause, Andreas
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02661 - Institut für Maschinelles Lernen / Institute for Machine Learning::03908 - Krause, Andreas / Krause, Andreas
en_US
ethz.date.deposited
2021-07-13T14:32:09Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2021-07-14T05:57:42Z
ethz.rosetta.lastUpdated
2022-03-29T10:23:57Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Information-Directed%20Sampling%20-%20Frequentist%20Analysis%20and%20Applications&rft.date=2021&rft.au=Kirschner,%20Johannes&rft.genre=unknown&rft.btitle=Information-Directed%20Sampling%20-%20Frequentist%20Analysis%20and%20Applications
Files in this item
Publication type
-
Doctoral Thesis [28988]