Sharp Analysis of Stochastic Optimization under Global Kurdyka-Łojasiewicz Inequality
METADATA ONLY
Author / Producer
Fatkhullin, Ilyas; Etesami, Jalal; He, Niao; Kiyavash, Negar
Date
2022
Publication Type
Conference Paper
ETH Bibliography
yes
Abstract
We study the complexity of finding the global solution to stochastic nonconvex optimization when the objective function satisfies a global Kurdyka-Łojasiewicz (KL) inequality and the queries from stochastic gradient oracles satisfy a mild expected smoothness assumption. We first introduce a general framework to analyze Stochastic Gradient Descent (SGD) and its associated nonlinear dynamics in this setting. As a byproduct of our analysis, we obtain a sample complexity of O(ϵ^{−(4−α)/α}) for SGD when the objective satisfies the so-called α-PŁ condition, where α is the degree of gradient domination. Furthermore, we show that a modified SGD with variance reduction and restarting (PAGER) achieves an improved sample complexity of O(ϵ^{−2/α}) when the objective satisfies the average smoothness assumption. This leads to the first optimal algorithm for the important case of α = 1, which appears in applications such as policy optimization in reinforcement learning.
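In a common parameterization of gradient domination (the paper's exact constants and range for α may differ), the α-PŁ condition referenced above says that, up to constants,

    ‖∇f(x)‖ ≥ μ (f(x) − f*)^{1/α}   for all x,

so α = 2 recovers the classical Polyak-Łojasiewicz inequality ‖∇f(x)‖² ≥ μ² (f(x) − f*), while α = 1 is the weak gradient domination that arises in policy optimization. As a sanity check on the stated rates: α = 2 gives O(ϵ^{−1}) for both SGD and PAGER, whereas α = 1 gives O(ϵ^{−3}) for SGD and the optimal O(ϵ^{−2}) for PAGER.

PAGER combines a PAGE-style probabilistic variance-reduced gradient estimator with restarts. Below is a minimal Python sketch of that estimator pattern, assuming a user-supplied stochastic gradient oracle stoch_grad(x, n, seed); the function names, parameters, and restart schedule are illustrative assumptions, not the paper's exact method.

    import numpy as np

    def page_with_restarts(stoch_grad, x0, rounds=3, T=500, p=0.1,
                           big=256, small=8, eta=0.01, seed=0):
        """Sketch of variance-reduced SGD (PAGE-style estimator) with restarts.

        stoch_grad(x, n, seed) is assumed to return the average of n stochastic
        gradients at x, drawn from the randomness indexed by `seed`.
        """
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        for _ in range(rounds):                       # restart: re-anchor estimator
            g = stoch_grad(x, big, rng.integers(2**32))
            for _ in range(T):
                x_next = x - eta * g                  # SGD step with estimator g
                if rng.random() < p:                  # w.p. p: large-batch refresh
                    g = stoch_grad(x_next, big, rng.integers(2**32))
                else:                                 # w.p. 1-p: recursive update
                    s = rng.integers(2**32)           # shared seed -> same samples
                    g = g + stoch_grad(x_next, small, s) - stoch_grad(x, small, s)
                x = x_next
            # a restart would typically also shrink eta or grow the batch sizes
        return x

Reusing one seed for the two oracle calls in the correction step correlates the samples, which is what makes the variance reduction work.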
Publication status
published
Book title
Advances in Neural Information Processing Systems 35
Pages / Article No.
15836 - 15848
Publisher
Curran
Event
36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)
Organisational unit
09729 - He, Niao
02219 - ETH AI Center
Notes
Poster presentation on November 30, 2022.
Related publications and datasets
Is new version of: 10.48550/arXiv.2210.01748