Infinite width (finite depth) neural networks benefit from multi-task learning unlike shallow Gaussian Processes - an exact quantitative macroscopic characterization

Open access
Date
2021-12-31
Type
Working Paper
ETH Bibliography
yes
Abstract
We prove in this paper that optimizing wide ReLU neural networks (NNs) with at least one hidden layer using l2-regularization on the parameters enforces multi-task learning due to representation learning -- also in the limit of infinite width. This is in contrast to multiple other results in the literature, in which idealized settings are assumed and in which wide (ReLU-)NNs lose their ability to benefit from multi-task learning in the infinite-width limit. We deduce the ability of multi-task learning by proving an exact quantitative macroscopic characterization of the learned NN in an appropriate function space.
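A minimal sketch of the setting described in the abstract, under assumptions not stated there: a single hidden ReLU layer with shared weights W and bias b, task-specific readout vectors v_t for tasks t = 1, ..., T, squared loss, and l2-regularization of strength lambda > 0 on all parameters. The notation and the choice of loss are illustrative, not the paper's exact formulation.

\[
\min_{W,\,b,\,v_1,\dots,v_T}\;
\sum_{t=1}^{T} \sum_{i=1}^{n_t}
\Big( v_t^{\top}\,\mathrm{ReLU}\!\big(W x_i^{(t)} + b\big) - y_i^{(t)} \Big)^2
\;+\;
\lambda \Big( \|W\|_F^2 + \|b\|_2^2 + \sum_{t=1}^{T} \|v_t\|_2^2 \Big)
\]

Because the hidden layer (W, b) is shared across tasks while only the readouts v_t are task-specific, the regularizer couples the tasks through the learned representation; the abstract's claim is that this coupling survives as the width of the hidden layer tends to infinity.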
Permanent link
https://doi.org/10.3929/ethz-b-000522526
Publication status
published
Publisher
ETH Zurich
Subject
Machine Learning; Deep Learning; deep neural networks; neural networks; Artificial Intelligence; Artificial neural networks; Regularization; Bayesian Neural Networks
Organisational unit
03845 - Teichmann, Josef / Teichmann, Josef
Related publications and datasets
Is previous version of: https://doi.org/10.3929/ethz-b-000550890
Is identical to: https://arxiv.org/abs/2112.15577