Telling BERT's Full Story: from Local Attention to Global Aggregation
(2021) Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
We take a deep look into the behaviour of self-attention heads in the transformer architecture. In light of recent work discouraging the use of attention distributions for explaining a model’s behaviour, we show that attention distributions can nevertheless provide insights into the local behaviour of attention heads. This way, we propose a distinction between local patterns revealed by attention and global patterns that refer back to the ...
Conference Paper
Of Non-Linearity and Commutativity in BERT
(2021) 2021 International Joint Conference on Neural Networks (IJCNN)
In this work we provide new insights into the transformer architecture, and in particular, its best-known variant, BERT. First, we propose a method to measure the degree of non-linearity of different elements of transformers. Next, we focus our investigation on the feed-forward networks (FFN) inside transformers, which contain two thirds of the model parameters and have so far not received much attention. We find that FFNs are an inefficient ...
Conference Paper
Attentive Multi-Task Deep Reinforcement Learning
(2020) Lecture Notes in Computer Science ~ Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019
Sharing knowledge between tasks is vital for efficient learning in a multi-task setting. However, most research so far has focused on the easier case where knowledge transfer is not harmful, i.e., where knowledge from one task cannot negatively impact the performance on another task. In contrast, we present an approach to multi-task deep reinforcement learning based on attention that does not require any a priori assumptions about the ...
Conference Paper
The Urban Last Mile Problem: Autonomous Drone Delivery to Your Balcony
(2019) International Conference on Unmanned Aircraft Systems (ICUAS 2019)
Conference Paper
Disentangling the Latent Space of (Variational) Autoencoders for NLP
(2018) Advances in Intelligent Systems and Computing ~ Proceedings of the 18th UK Workshop on Computational Intelligence 2018
Conference Paper
MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer
(2018) Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR 2018)
We introduce MIDI-VAE, a neural network model based on Variational Autoencoders that is capable of handling polyphonic music with multiple instrument tracks, as well as modeling the dynamics of music by incorporating note durations and velocities. We show that MIDI-VAE can perform style transfer on symbolic music by automatically changing pitches, dynamics and instruments of a music piece from, e.g., a Classical to a ...
Conference Paper
Monaural Music Source Separation using a ResNet Latent Separator Network
(2020) 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)
Conference Paper
Using State Predictions for Value Regularization in Curiosity Driven Deep Reinforcement Learning
(2018) 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)
Conference Paper
Medley2K: A Dataset of Medley Transitions
(2020) Proceedings MML 2020: 13th International Workshop on Machine Learning and Music at ECML/PKDD 2020
Conference Paper
Natural Language Multitasking - Analyzing and Improving Syntactic Saliency of Hidden Representations
(2018) arXiv
Conference Paper