Strong error analysis for stochastic gradient descent optimization algorithms
Metadata only
Date
2021-01
Type
Journal Article
Abstract
Stochastic gradient descent (SGD) optimization algorithms are key ingredients in a series of machine learning applications. In this article we perform a rigorous strong error analysis for SGD optimization algorithms. In particular, we prove for every arbitrarily small ε∈(0,∞) and every arbitrarily large p∈(0,∞) that the considered SGD optimization algorithm converges in the strong Lp-sense with order 1/2−ε to the global minimum of the objective function of the considered stochastic optimization problem under standard convexity-type assumptions on the objective function and relaxed assumptions on the moments of the stochastic errors appearing in the employed SGD optimization algorithm. The key ideas in our convergence proof are, first, to employ techniques from the theory of Lyapunov-type functions for dynamical systems to develop a general convergence machinery for SGD optimization algorithms based on such functions, then, to apply this general machinery to concrete Lyapunov-type functions with polynomial structures and, thereafter, to perform an induction argument along the powers appearing in the Lyapunov-type functions in order to achieve for every arbitrarily large p∈(0,∞) strong Lp-convergence rates.
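The strong Lp error studied in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's exact setting: it runs SGD with the classical Robbins-Monro step sizes γ_n = c/n on the strongly convex toy objective f(x) = ½‖x‖², whose global minimum is x* = 0, and estimates the strong L² error E[‖X_n − x*‖²]^(1/2) by Monte Carlo over independent paths. All function names and parameter choices here are illustrative assumptions.

```python
import numpy as np

# Hedged sketch (toy setting, not the paper's general framework):
# SGD on f(x) = 0.5 * ||x||^2 with unbiased Gaussian gradient noise.
rng = np.random.default_rng(0)

def sgd_paths(num_paths=200, num_steps=1000, c=1.0, noise_std=1.0, dim=2):
    """Run independent SGD paths; return the final iterates X_n."""
    x = np.full((num_paths, dim), 5.0)  # common initial value
    for n in range(1, num_steps + 1):
        grad = x  # exact gradient of f(x) = 0.5 * ||x||^2
        noise = noise_std * rng.standard_normal((num_paths, dim))
        x = x - (c / n) * (grad + noise)  # noisy gradient step
    return x

def strong_lp_error(x, p=2.0):
    """Monte Carlo estimate of the strong L^p error E[||X_n - x*||^p]^(1/p)."""
    return np.mean(np.linalg.norm(x, axis=1) ** p) ** (1.0 / p)

err_1000 = strong_lp_error(sgd_paths(num_steps=1000))
err_4000 = strong_lp_error(sgd_paths(num_steps=4000))
# With a strong convergence rate close to n^{-1/2}, quadrupling the number
# of steps should roughly halve the estimated error.
print(err_1000, err_4000)
```

Under the abstract's result, the rate 1/2 − ε is essentially what this experiment exhibits: the estimated L² error shrinks by a factor of about 2 when the step count grows by a factor of 4.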
Publication status
published
Journal / series
IMA Journal of Numerical Analysis
Volume
Pages / Article No.
Publisher
Oxford University Press
Subject
Stochastic gradient descent; Stochastic approximation algorithms; Strong error analysis
Organisational unit
02501 - Seminar für Angewandte Mathematik / Seminar for Applied Mathematics
02204 - RiskLab / RiskLab
09557 - Cheridito, Patrick
Funding
175699 - Higher order numerical approximation methods for stochastic partial differential equations (SNF)