Simple item record

dc.contributor.author: Heiss, Jakob
dc.contributor.author: Teichmann, Josef
dc.contributor.author: Wutte, Hanna
dc.date.accessioned: 2022-10-24T15:07:26Z
dc.date.available: 2022-06-03T13:12:31Z
dc.date.available: 2022-06-03T13:25:19Z
dc.date.available: 2022-06-03T13:47:45Z
dc.date.available: 2022-10-20T15:35:20Z
dc.date.available: 2022-10-24T15:07:26Z
dc.date.issued: 2022-06-02
dc.identifier.uri: http://hdl.handle.net/20.500.11850/550890
dc.identifier.doi: 10.3929/ethz-b-000550890
dc.description.abstract: In practice, multi-task learning (through learning features shared among tasks) is an essential property of deep neural networks (NNs). While infinite-width limits of NNs can provide good intuition for their generalization behavior, the well-known infinite-width limits in the literature (e.g., neural tangent kernels) assume specific settings in which wide ReLU NNs behave like shallow Gaussian processes with a fixed kernel. Consequently, in such settings these NNs lose their ability to benefit from multi-task learning in the infinite-width limit. In contrast, we prove that optimizing wide ReLU neural networks with at least one hidden layer using ℓ2-regularization on the parameters promotes multi-task learning through representation learning, also in the limiting regime where the network width tends to infinity. We present an exact quantitative characterization of this infinite-width limit in an appropriate function space that neatly describes multi-task learning. [en_US]
(See the code sketch after the record for an illustration of this setting.)
dc.format: application/pdf [en_US]
dc.language.iso: en [en_US]
dc.publisher: ETH Zurich [en_US]
dc.rights.uri: http://rightsstatements.org/page/InC-NC/1.0/
dc.subject: Multi-task learning [en_US]
dc.subject: Machine Learning [en_US]
dc.subject: Machine learning (artificial intelligence) [en_US]
dc.subject: Machine Learning (stat.ML) [en_US]
dc.subject: Neural Networks [en_US]
dc.subject: Deep Learning [en_US]
dc.subject: Deep ReLU neural networks [en_US]
dc.subject: Deep neural networks [en_US]
dc.subject: Bayesian Neural Networks [en_US]
dc.subject: Bayesian neural network [en_US]
dc.subject: artificial intelligence [en_US]
dc.subject: machine learning [en_US]
dc.subject: Artificial neural networks [en_US]
dc.subject: Regularization [en_US]
dc.subject: Tikhonov regularization [en_US]
dc.subject: Generalization [en_US]
dc.subject: Overparametrized Neural Networks [en_US]
dc.subject: Large width limit [en_US]
dc.subject: Infinite width limit [en_US]
dc.title: How Infinitely Wide Neural Networks Can Benefit from Multi-task Learning - an Exact Macroscopic Characterization [en_US]
dc.title.alternative: How Infinitely Wide Neural Networks Benefit from Multi-task Learning – an Exact Macroscopic Characterization [en_US]
dc.type: Working Paper
dc.rights.license: In Copyright - Non-Commercial Use Permitted
ethz.size: 26 p. (version 3.2); 52 p. (version 4) [en_US]
ethz.code.ddc: DDC - DDC::0 - Computer science, information & general works::004 - Data processing, computer science [en_US]
ethz.code.ddc: DDC - DDC::5 - Science::510 - Mathematics [en_US]
ethz.code.jel: JEL - JEL::C - Mathematical and Quantitative Methods [en_US]
ethz.publication.place: Zurich [en_US]
ethz.publication.status: published [en_US]
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02000 - Dep. Mathematik / Dep. of Mathematics::02003 - Mathematik Selbständige Professuren::03845 - Teichmann, Josef / Teichmann, Josef [en_US]
ethz.leitzahl.certified: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02000 - Dep. Mathematik / Dep. of Mathematics::02003 - Mathematik Selbständige Professuren::03845 - Teichmann, Josef / Teichmann, Josef [en_US]
ethz.relation.isNewVersionOf: 10.3929/ethz-b-000522526
ethz.relation.isIdenticalTo: https://arxiv.org/abs/2112.15577
ethz.date.deposited: 2022-06-03T13:12:37Z
ethz.source: FORM
ethz.eth: yes [en_US]
ethz.availability: Open access [en_US]
ethz.rosetta.installDate: 2022-06-03T13:25:54Z
ethz.rosetta.lastUpdated: 2023-02-07T07:19:17Z
ethz.rosetta.versionExported: true
ethz.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=How%20Infinitely%20Wide%20Neural%20Networks%20Can%20Benefit%20from%20Multi-task%20Learning%20-%20an%20Exact%20Macroscopic%20Characterization&rft.date=2022-06-02&rft.au=Heiss,%20Jakob&Teichmann,%20Josef&Wutte,%20Hanna&rft.genre=preprint&rft.btitle=How%20Infinitely%20Wide%20Neural%20Networks%20Can%20Benefit%20from%20Multi-task%20Learning%20-%20an%20Exact%20Macroscopic%20Characterization
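
To make the setting described in the abstract concrete, here is a minimal training sketch. It assumes PyTorch; the class MultiTaskReLUNet, the helper l2_penalty, the toy data, and all hyperparameters (input dimension, width, lam, learning rate, step count) are hypothetical illustrations, not taken from the paper or this record.

```python
# A minimal sketch (assumption: PyTorch; not code from the paper) of the
# abstract's setting: a wide one-hidden-layer ReLU network with a
# representation shared among tasks, trained with an explicit l2 penalty
# on all parameters. Names, sizes, and hyperparameters are illustrative.
import torch
import torch.nn as nn

class MultiTaskReLUNet(nn.Module):
    """One shared ReLU hidden layer, one linear read-out head per task."""

    def __init__(self, d_in: int, width: int, n_tasks: int):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(d_in, width), nn.ReLU())
        self.heads = nn.ModuleList([nn.Linear(width, 1) for _ in range(n_tasks)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.shared(x)  # features shared among all tasks
        return torch.cat([head(h) for head in self.heads], dim=1)  # (batch, n_tasks)

def l2_penalty(model: nn.Module) -> torch.Tensor:
    # l2-regularization on *all* parameters, as in the abstract's setting
    return sum((p ** 2).sum() for p in model.parameters())

# Toy data: two regression tasks whose targets share structure in the inputs.
torch.manual_seed(0)
X = torch.randn(64, 3)
Y = torch.stack([X.sum(dim=1), X[:, 0] - X[:, 1]], dim=1)

model = MultiTaskReLUNet(d_in=3, width=1000, n_tasks=2)  # "wide" hidden layer
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 1e-4  # regularization strength (illustrative choice)

for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), Y) + lam * l2_penalty(model)
    loss.backward()
    opt.step()
```

As the abstract notes, in NTK-type limits the learned features of such a network effectively stay fixed as the width grows, whereas the paper proves that with the ℓ2 penalty the shared representation keeps adapting to the tasks even as the width tends to infinity.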