UnICORNN: A recurrent model for learning very long time dependencies



Date

2021-03

Publication Type

Report

ETH Bibliography

yes

Abstract

The design of recurrent neural networks (RNNs) to accurately process sequential inputs with long-time dependencies is very challenging on account of the exploding and vanishing gradient problem. To overcome this, we propose a novel RNN architecture based on a structure-preserving discretization of a Hamiltonian system of second-order ordinary differential equations that models networks of oscillators. The resulting RNN is fast, invertible (in time), and memory efficient, and we derive rigorous bounds on the hidden state gradients to prove the mitigation of the exploding and vanishing gradient problem. A suite of experiments is presented to demonstrate that the proposed RNN provides state-of-the-art performance on a variety of learning tasks with (very) long-time dependencies.
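For concreteness, the following is a minimal NumPy sketch of the recurrence the abstract describes: a symplectic-Euler (structure-preserving) discretization of a network of second-order oscillators, following the update rule given in the published paper. The function name `unicornn_layer` and the parameter names `w`, `V`, `b`, `c`, `dt`, and `alpha` are illustrative, not the authors' reference implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def unicornn_layer(u, w, V, b, c, dt=0.05, alpha=0.9):
    """One oscillator layer applied to a sequence u of shape (T, input_dim).

    Sketch of the symplectic-Euler discretization described in the paper
    (names here are illustrative):
        z_n = z_{n-1} - dt * sigmoid(c) * (tanh(w * y_{n-1} + V @ u_n + b)
                                           + alpha * y_{n-1})
        y_n = y_{n-1} + dt * sigmoid(c) * z_n
    """
    hidden = w.shape[0]
    y = np.zeros(hidden)      # position-like hidden state
    z = np.zeros(hidden)      # velocity-like hidden state
    c_hat = sigmoid(c)        # trainable time scale, constrained to (0, 1)
    states = []
    for u_n in u:
        # Update z first, then y with the *new* z: this ordering is what
        # makes the step structure preserving and exactly invertible.
        z = z - dt * c_hat * (np.tanh(w * y + V @ u_n + b) + alpha * y)
        y = y + dt * c_hat * z
        states.append(y.copy())
    return np.stack(states)   # (T, hidden)
```

Because each step can be undone exactly (recover y_{n-1} from y_n and z_n, then z_{n-1}), hidden states need not be stored during the forward pass for backpropagation, which is the source of the memory efficiency claimed in the abstract.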

Publication status

published

Volume

2021-10

Publisher

Seminar for Applied Mathematics, ETH Zurich

Subject

Recurrent neural network; Hamiltonian systems; Gradient stability; Long-term dependencies

Organisational unit

03851 - Mishra, Siddhartha

Funding

770880 - Computation and analysis of statistical solutions of fluid flow (EC)
