
Open access
Date
2021
Type
Conference Paper
ETH Bibliographie
yes
Abstract
Deep ensembles aggregate predictions of diverse neural networks to improve generalisation and quantify uncertainty. Here, we investigate their behavior when increasing the ensemble members' parameter size - a practice typically associated with better performance for single models. We show that under practical assumptions in the overparametrized regime far into the double descent curve, not only does the ensemble test loss degrade, but common out-of-distribution detection and calibration metrics suffer as well. Reminiscent of deep double descent, we observe this phenomenon not only when increasing the single member's capacity but also as we increase the training budget, suggesting that deep ensembles can benefit from early stopping. This sheds light on the success and failure modes of deep ensembles and suggests that averaging finite-width models performs better than the neural tangent kernel limit for these metrics.
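
A minimal sketch of the deep-ensemble aggregation described in the abstract: member predictions are averaged and their predictive entropy serves as a simple uncertainty signal. The toy linear "members" and all names below are illustrative assumptions, not the authors' implementation; in practice each member would be an independently trained network.

```python
# Minimal sketch (not the paper's code): a deep ensemble averages member
# predictions and uses the entropy of the averaged distribution as an
# uncertainty score. Random linear classifiers stand in for trained networks.
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

n_members, n_features, n_classes = 5, 32, 10
members = [rng.normal(size=(n_features, n_classes)) for _ in range(n_members)]  # toy "networks"

x = rng.normal(size=(4, n_features))                      # a small batch of inputs
probs = np.stack([softmax(x @ W) for W in members])        # shape: (members, batch, classes)

ensemble_probs = probs.mean(axis=0)                        # aggregated ensemble prediction
entropy = -(ensemble_probs * np.log(ensemble_probs + 1e-12)).sum(axis=-1)
print(ensemble_probs.argmax(axis=-1), entropy)             # predicted classes and predictive entropy
```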
Persistent link
https://doi.org/10.3929/ethz-b-000501624
Publication status
published
Pages / Article number
Conference
Organisational unit
02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.
09479 - Grewe, Benjamin F. / Grewe, Benjamin F.
Notes
Conference lecture held at poster session 1 on July 23, 2021