On the Laplace approximation for sequential model selection of Bayesian neural networks
OPEN ACCESS
Date
2022
Publication Type
Master Thesis
ETH Bibliography
yes
Abstract
The Laplace approximation yields a tractable marginal likelihood for Bayesian neural networks (BNNs). This enables empirical Bayes methods that optimize BNN hyperparameters directly on the training data, even when no validation data is available. Sequential decision-making problems are typical settings that must rely on training data alone. Hence, this thesis asks whether hyperparameter optimization using the Laplace marginal likelihood benefits Bayesian neural networks on sequential decision-making problems. The answer is mixed: while online model selection improves decision-making performance for large sample sizes, maximum marginal likelihood models in the small-data regime behave like constant predictors. Therefore, this thesis further investigates the relation between the Laplace marginal likelihood and generalization. Toward an explanation, we hypothesize that the marginal likelihood assesses generalization in observation space, whereas decision-making requires generalization in function space. In particular, fixed hyperparameters yield overfitting in function space, while optimized hyperparameters yield underfitting. Lastly, we hypothesize that implicit ensembling effects during sequential decision-making compensate for overfitting, but not for underfitting.
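As a rough illustration of the empirical Bayes step the abstract refers to, the following minimal sketch optimizes a prior precision by gradient ascent on a Laplace approximation to the log marginal likelihood. It assumes a diagonal (generalized Gauss-Newton) curvature approximation and an isotropic Gaussian prior; the helper `laplace_log_marglik` and the toy tensors are illustrative stand-ins, not code from the thesis.

```python
import torch

def laplace_log_marglik(log_lik, theta_map, h_diag, log_prior_prec):
    """Diagonal Laplace approximation to the log marginal likelihood.

    log_lik        : scalar log-likelihood log p(D | theta_map)
    theta_map      : flattened MAP parameters, shape (P,)
    h_diag         : diagonal curvature (GGN) of the negative log-likelihood, shape (P,)
    log_prior_prec : scalar log prior precision (the hyperparameter being optimized)
    """
    prec = log_prior_prec.exp()
    P = theta_map.numel()
    # log p(theta_map) under an isotropic Gaussian prior N(0, prec^{-1} I);
    # the (P/2) log(2*pi) terms of the prior and the Laplace volume cancel.
    log_prior = 0.5 * P * log_prior_prec - 0.5 * prec * theta_map.square().sum()
    # log det of the posterior precision H + prec * I under the diagonal approximation
    log_det_post = torch.log(h_diag + prec).sum()
    return log_lik + log_prior - 0.5 * log_det_post

# Toy stand-ins: in practice these come from a trained network and its curvature.
P = 50
theta_map = 0.1 * torch.randn(P)
h_diag = 10.0 * torch.rand(P)     # nonnegative diagonal GGN entries
log_lik = torch.tensor(-25.0)

# Empirical Bayes: ascend the Laplace marginal likelihood in the prior precision.
log_prec = torch.zeros((), requires_grad=True)
opt = torch.optim.Adam([log_prec], lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = -laplace_log_marglik(log_lik, theta_map, h_diag, log_prec)
    loss.backward()
    opt.step()
print(f"optimized prior precision: {log_prec.exp().item():.3f}")
```

In the online setting studied here, this optimization would be repeated as new observations arrive; libraries such as laplace-torch offer analogous marginal-likelihood-based tuning.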
Publication status
published
Publisher
ETH Zurich
Organisational unit
09568 - Rätsch, Gunnar