Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation



Date

2023-10-15

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

Despite the accomplishments of deep learning based algorithms in numerous applications and the very broad research interest in them, there is at the moment still no rigorous understanding of why such algorithms produce useful results in certain situations. A thorough mathematical analysis of deep learning based algorithms seems to be crucial in order to improve our understanding and to make their implementation more effective and efficient. In this article we provide a mathematically rigorous full error analysis of deep learning based empirical risk minimisation with quadratic loss function in the probabilistically strong sense, where the underlying deep neural networks are trained using stochastic gradient descent with random initialisation. The convergence speed we obtain suffers from the curse of dimensionality. However, it is presumably close to optimal in the generality of the framework we consider. To the best of our knowledge, we establish the first full error analysis in the scientific literature for a deep learning based algorithm in the probabilistically strong sense, as well as the first full error analysis in the scientific literature for a deep learning based algorithm where stochastic gradient descent with random initialisation is the employed optimisation method.
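
To make the setting of the abstract concrete, the following is a minimal sketch of empirical risk minimisation with quadratic loss, where a small ReLU network is trained by stochastic gradient descent from several random initialisations and the candidate with the smallest empirical risk is kept. It is an illustration only and not the algorithm analysed in the article: the one-hidden-layer architecture, the uniform initialisation on [-1, 1], the step size, the batch size, and all function names (`init_params`, `sgd`, `train_best_of`, and so on) are assumptions made for this example.

```python
import numpy as np


def init_params(rng, d_in, width):
    # Random initialisation; the uniform law on [-1, 1] is an assumption.
    return {
        "W1": rng.uniform(-1.0, 1.0, size=(width, d_in)),
        "b1": rng.uniform(-1.0, 1.0, size=width),
        "W2": rng.uniform(-1.0, 1.0, size=width),
        "b2": np.array(rng.uniform(-1.0, 1.0)),
    }


def forward(p, X):
    # One hidden ReLU layer followed by a linear output.
    H = np.maximum(X @ p["W1"].T + p["b1"], 0.0)  # shape (n, width)
    return H @ p["W2"] + p["b2"], H


def empirical_risk(p, X, y):
    # Quadratic loss averaged over the sample.
    pred, _ = forward(p, X)
    return float(np.mean((pred - y) ** 2))


def sgd(p, X, y, rng, steps=500, batch=16, lr=0.005):
    # Plain mini-batch SGD on the quadratic loss (hypothetical hyperparameters).
    p = {k: np.copy(v) for k, v in p.items()}
    for _ in range(steps):
        idx = rng.integers(0, len(X), size=batch)
        Xb, yb = X[idx], y[idx]
        pred, H = forward(p, Xb)
        g = 2.0 * (pred - yb) / batch          # d(loss)/d(pred), shape (batch,)
        gH = np.outer(g, p["W2"]) * (H > 0)    # back-propagated to hidden layer
        p["W2"] -= lr * (H.T @ g)
        p["b2"] -= lr * g.sum()
        p["W1"] -= lr * (gH.T @ Xb)
        p["b1"] -= lr * gH.sum(axis=0)
    return p


def train_best_of(X, y, n_inits=5, width=8, seed=0):
    # Run SGD from several independent random initialisations and keep
    # the parameters with the smallest empirical risk on the sample.
    rng = np.random.default_rng(seed)
    best, best_risk = None, np.inf
    for _ in range(n_inits):
        p = sgd(init_params(rng, X.shape[1], width), X, y, rng)
        risk = empirical_risk(p, X, y)
        if risk < best_risk:
            best, best_risk = p, risk
    return best, best_risk
```

Calling `train_best_of(X, y)` with an input array `X` of shape `(n, d)` and targets `y` of shape `(n,)` returns the selected parameters and their empirical risk. In terms of the subject keywords, the network size relates to the approximation error, the sample size to the generalisation error, and the number of SGD steps and random initialisations to the optimisation error.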

Publication status

published

Volume

455

Pages / Article No.

127907

Publisher

Elsevier

Subject

Deep learning; Deep neural networks; Empirical risk minimisation; Full error analysis; Approximation; Generalisation; Optimisation; Strong convergence; Stochastic gradient descent; Random initialisation

Funding

ETH-47 15-2 - Mild stochastic calculus and numerical approximations for nonlinear stochastic evolution equations with Levy noise (ETHZ)