Parameter-Agnostic Optimization under Relaxed Smoothness


METADATA ONLY

Date

2024

Publication Type

Conference Paper

ETH Bibliography

yes

Abstract

Tuning hyperparameters, such as the stepsize, presents a major challenge in training machine learning models. To address this challenge, numerous adaptive optimization algorithms have been developed that achieve near-optimal complexities, even when stepsizes are independent of problem-specific parameters, provided that the loss function is $L$-smooth. However, when this assumption is relaxed to the more realistic $(L_0, L_1)$-smoothness, all existing convergence results still necessitate tuning of the stepsize. In this study, we demonstrate that Normalized Stochastic Gradient Descent with Momentum (NSGD-M) can achieve a (nearly) rate-optimal complexity without prior knowledge of any problem parameter, though this comes at the cost of an exponential term dependent on $L_1$ in the complexity. We further establish that this exponential term is unavoidable for such schemes by introducing a theoretical framework of lower bounds tailored explicitly to parameter-agnostic algorithms. Interestingly, in deterministic settings, the exponential factor can be neutralized by employing Gradient Descent with a Backtracking Line Search. To the best of our knowledge, these findings constitute the first parameter-agnostic convergence results under the generalized smoothness condition. Our empirical experiments further corroborate our theoretical insights.
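For orientation, the two schemes named in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the paper's: the function names, default constants (stepsize, momentum, Armijo parameters), and the toy quadratic objective are all assumptions chosen for readability, not values from the paper.

```python
import numpy as np

def nsgd_m(grad_fn, x0, stepsize=0.05, beta=0.9, n_steps=300, eps=1e-12):
    """Sketch of Normalized SGD with Momentum (NSGD-M).

    Each step moves a fixed distance (the stepsize) along the
    normalized momentum direction, so the step length is independent
    of the gradient scale. Defaults here are illustrative only.
    """
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)
    for _ in range(n_steps):
        g = grad_fn(x)
        m = beta * m + (1.0 - beta) * g          # momentum buffer
        x = x - stepsize * m / (np.linalg.norm(m) + eps)  # unit-length step
    return x

def gd_backtracking(f, grad_fn, x0, n_steps=50, t0=1.0, c=1e-4, rho=0.5):
    """Sketch of Gradient Descent with a Backtracking Line Search.

    Starting from trial stepsize t0, the stepsize is shrunk by rho
    until the Armijo sufficient-decrease condition holds.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        g = grad_fn(x)
        t = t0
        while f(x - t * g) > f(x) - c * t * np.dot(g, g):
            t *= rho
        x = x - t * g
    return x
```

On a simple quadratic ($f(x) = \|x\|^2$), NSGD-M reaches a neighborhood of the minimizer whose radius scales with the stepsize (the normalized step cannot shrink near the optimum), while backtracking GD converges to the minimizer without any stepsize tuning.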

Publication status

published

Book title

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics

Volume

238

Pages / Article No.

4861 - 4869

Publisher

PMLR

Event

27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

Organisational unit

09729 - He, Niao

Funding

207343 - RING: Robust Intelligence with Nonconvex Games (SNF)
