How Infinitely Wide Neural Networks Can Benefit from Multi-task Learning - an Exact Macroscopic Characterization

Open access
Date
2022-06-02
Type
Working Paper
ETH Bibliography
yes
Abstract
In practice, multi-task learning (through learning features shared among tasks) is an essential property of deep neural networks (NNs). While infinite-width limits of NNs can provide good intuition for their generalization behavior, the well-known infinite-width limits of NNs in the literature (e.g., neural tangent kernels) assume specific settings in which wide ReLU NNs behave like shallow Gaussian processes with a fixed kernel. Consequently, in such settings, these NNs lose their ability to benefit from multi-task learning in the infinite-width limit. In contrast, we prove that optimizing wide ReLU neural networks with at least one hidden layer using ℓ2-regularization on the parameters promotes multi-task learning due to representation learning, even in the limiting regime where the network width tends to infinity. We present an exact quantitative characterization of this infinite-width limit in an appropriate function space that neatly describes multi-task learning.
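
The setting the abstract refers to (a shared ReLU representation feeding several task-specific outputs, trained with an ℓ2-penalty on all parameters) can be sketched in code. The following is only an illustrative PyTorch sketch, not the paper's construction; the width, task count, data, and hyperparameters are assumptions made for the example.

```python
# Minimal sketch of the setup described in the abstract (illustrative only):
# a wide one-hidden-layer ReLU network whose hidden representation is shared
# across T tasks, trained with an l2-penalty on all parameters (weight decay).
import torch
import torch.nn as nn

WIDTH, DIM, T = 4096, 10, 3  # large width stands in for the wide-network regime


class SharedReLUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.shared = nn.Linear(DIM, WIDTH)  # representation shared by all tasks
        self.heads = nn.ModuleList([nn.Linear(WIDTH, 1) for _ in range(T)])  # one output per task

    def forward(self, x):
        h = torch.relu(self.shared(x))
        return torch.cat([head(h) for head in self.heads], dim=1)


# Hypothetical data: T scalar regression tasks on the same inputs.
x = torch.randn(256, DIM)
y = torch.randn(256, T)

model = SharedReLUNet()
# weight_decay implements the l2-regularization on the parameters.
opt = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=1e-4)

for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
```

Because all task heads read from the same regularized hidden layer, the penalty couples the tasks through the learned representation; this is the kind of multi-task benefit the paper characterizes in the infinite-width limit.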
Permanent link
https://doi.org/10.3929/ethz-b-000550890
Publication status
published
Publisher
ETH Zurich
Subject
Multi-task learning; Machine Learning; Machine learning (artificial intelligence); Machine Learning (stat.ML); Neural Networks; Deep Learning; Deep ReLU neural networks; Deep neural networks; Bayesian Neural Networks; Bayesian neural network; artificial intelligence; machine learning; Artificial neural networks; Regularization; Tikhonov regularization; Generalization; Overparametrized Neural Networks; Large width limit; Infinite width limit
Organisational unit
03845 - Teichmann, Josef / Teichmann, Josef
Related publications and datasets
Is new version of: https://doi.org/10.3929/ethz-b-000522526
Is identical to: https://arxiv.org/abs/2112.15577