Metadata only
Date
2023
Type
Conference Paper
ETH Bibliography
yes
Abstract
We investigate the process of neural network training using gradient descent-based optimizers from a dynamical systems point of view. To this end, we model the iterative parameter updates as a time-discrete switched linear system and analyze its stability behavior over the course of training. Accordingly, we develop a regularization scheme that encourages stable training dynamics by penalizing divergent parameter updates. Our experiments show promising stabilization and convergence effects on regression tasks, density-based crowd counting, and generative adversarial networks (GANs). Our results indicate that stable network training reduces the variance of performance across different parameter initializations and increases robustness to the choice of learning rate. Particularly in the GAN setup, the stability regularization enables faster convergence and lower FID with more consistency across runs. Our source code is available at: https://github.com/fangzl123/stableTrain.git.
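The abstract only sketches the regularizer, so the following is a minimal illustrative sketch of the general idea in PyTorch: treat consecutive SGD updates delta_k = theta_{k+1} - theta_k as the state of a switched linear system (delta_{k+1} ≈ (I - eta * H_k) delta_k after linearization) and add a hinge penalty whenever the update norm grows. The hinge form, the names `lam` and `update_norm`, and the plain-SGD setup are assumptions for illustration only; the paper's actual regularizer is in the linked repository.

```python
# Illustrative sketch only; not the paper's exact regularizer.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(10, 1)
lr, lam = 0.1, 0.1          # learning rate and penalty weight (assumed values)
loss_fn = nn.MSELoss()

x = torch.randn(256, 10)
y = x.sum(dim=1, keepdim=True)

prev_norm = None            # ||delta_{k-1}||, detached from the graph
for step in range(200):
    loss = loss_fn(model(x), y)

    # Differentiable gradients (create_graph=True) so the penalty can
    # shape the update itself; for plain SGD, delta_k = -lr * grad.
    params = list(model.parameters())
    grads = torch.autograd.grad(loss, params, create_graph=True)
    update_norm = torch.sqrt(sum((lr * g).pow(2).sum() for g in grads))

    penalty = torch.zeros(())
    if prev_norm is not None:
        # Stable switched dynamics require ||delta_k|| <= ||delta_{k-1}||;
        # hinge-penalize the divergent case.
        penalty = torch.relu(update_norm - prev_norm)

    total = loss + lam * penalty
    final_grads = torch.autograd.grad(total, params)
    with torch.no_grad():
        for p, g in zip(params, final_grads):
            p -= lr * g     # plain SGD step on the regularized objective
    prev_norm = update_norm.detach()
```

Penalizing the growth of consecutive update norms is one simple way to discourage a divergent trajectory of the parameter-update dynamics; it leaves contracting (stable) steps untouched because the hinge is zero there.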
Publication status
published
Book title
Computer Vision – ACCV 2022
Publisher
Springer
Organisational unit
03514 - Van Gool, Luc