Bio-Inspired Continual Learning and Transfer Learning on Temporal Sequences
OPEN ACCESS
Date
2024
Publication Type
Doctoral Thesis
ETH Bibliography
yes
Abstract
Many machine learning models suffer from the well-known catastrophic forgetting problem: they perform worse on previous tasks after being trained on a new task. In the field of continual learning, researchers investigate training algorithms and architectures that produce models which work well on both old and new tasks, mirroring the human capability to continuously acquire new knowledge and skills while improving existing ones.
So far, continual learning models have predominantly targeted non-sequential data such as images, while little attention has been paid to temporal sequence inputs, e.g., video or motion. Furthermore, the ability of generative models to learn and generate temporal sequences within a continual learning setup is relatively unexplored, since continual learning research focuses predominantly on classification tasks.
This thesis focuses on two aspects of generative learning in the domain of continual learning. The first is the investigation of biologically inspired incremental learning models that can produce temporal sequences given a conditional input, specifically, generating different human motion actions in a continual learning setup. The second is transfer learning between multiple modalities (text and visual sensor events) in a generative model. The visual sensor events come from a biologically inspired Dynamic Vision Sensor (DVS) event camera that outputs sparse spatio-temporal events.
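The sparse spatio-temporal events a DVS produces can be made concrete with a short sketch. The following is a minimal, hypothetical example (not the thesis's actual preprocessing): it bins an event stream of (t, x, y, polarity) tuples into a stack of two-channel event frames, with the function name, bin count, and frame layout chosen purely for illustration:

```python
import numpy as np

def events_to_frames(events, sensor_size=(128, 128), n_bins=5):
    """Accumulate a DVS event stream into a stack of sparse event frames.

    events: float array of shape (N, 4) with columns (t, x, y, p),
            where p in {-1, +1} encodes OFF/ON polarity.
    Returns an array of shape (n_bins, 2, H, W) holding per-bin
    OFF (channel 0) and ON (channel 1) event counts.
    """
    H, W = sensor_size
    frames = np.zeros((n_bins, 2, H, W), dtype=np.float32)
    if len(events) == 0:
        return frames
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    # Normalize timestamps into [0, n_bins) and clip the last event
    # into the final bin so the maximum timestamp stays in range.
    t0, t1 = t.min(), t.max()
    if t1 == t0:
        bins = np.zeros(len(t), dtype=int)
    else:
        bins = np.minimum(((t - t0) / (t1 - t0) * n_bins).astype(int),
                          n_bins - 1)
    chan = (p > 0).astype(int)  # 1 for ON events, 0 for OFF events
    # Scatter-add each event into its (time bin, polarity, y, x) cell.
    np.add.at(frames, (bins, chan, y, x), 1.0)
    return frames
```

Because real event streams are sparse, almost all cells of the resulting tensor are zero, which is the property the sparse autoencoder described below is designed to exploit.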
Recent latent diffusion architectures have proven to be powerful generative models: by operating in a latent space, they can leverage pretrained components and reduce the computational resources required for training. This thesis presents a new sparse autoencoder that encodes sensor events into an informative latent representation, which a new latent diffusion model then uses for a text-to-events objective.
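The idea of diffusing in a latent space rather than pixel/event space can be sketched as follows. The encoder here is only a stand-in (a fixed random linear projection, not the thesis's sparse autoencoder), and the forward-diffusion step uses a standard linear beta schedule; all names and parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(frames, latent_dim=32):
    # Stand-in for a trained encoder: a fixed random linear projection
    # of the flattened event frames to a compact latent vector.
    flat = frames.reshape(-1)
    W = rng.standard_normal((latent_dim, flat.size)) / np.sqrt(flat.size)
    return W @ flat

def diffuse(z0, t, n_steps=1000):
    # Forward diffusion q(z_t | z_0) applied in latent space:
    #   z_t = sqrt(alpha_bar_t) * z_0 + sqrt(1 - alpha_bar_t) * eps
    # with alpha_bar_t the cumulative product of (1 - beta) over a
    # linear beta schedule, as in standard DDPM formulations.
    betas = np.linspace(1e-4, 2e-2, n_steps)
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.standard_normal(z0.shape)
    return np.sqrt(alpha_bar) * z0 + np.sqrt(1.0 - alpha_bar) * eps
```

A denoising network trained to reverse this process then only ever sees the compact latent vectors, which is what makes latent diffusion cheaper than diffusing over full-resolution event frames.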
This work advances the state-of-the-art by demonstrating the first pipeline for generating event sequences from a text prompt describing a dynamic scene, concretely, a person performing a gesture.
The thesis contributions include training algorithms on a brain-inspired generative model for generating human motions in a continual learning scenario, investigations into motion curriculum training over a set of tasks, a model and training technique for an autoencoder for spatially and temporally sparse event frames, and a novel text-to-events model for synthetic event stream generation.
Publication status
published
Contributors
Examiner: Liu, Shih-Chii
Examiner: Delbrück, Tobias
Examiner: Yanik, Mehmet Fatih
Examiner: Siegelmann, Hava
Publisher
ETH Zurich
Subject
Machine learning; Artificial intelligence; Continual learning; Generative AI; Sequential model; Event-based vision; Human motion model; Variational autoencoder; Diffusion model
Organisational unit
08836 - Delbrück, Tobias (Tit.-Prof.)