Search Results
- Telling BERT's Full Story: from Local Attention to Global Aggregation
  (2021) Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Conference paper.
  We take a deep look into the behaviour of self-attention heads in the transformer architecture. In light of recent work discouraging the use of attention distributions for explaining a model’s behaviour, we show that attention distributions can nevertheless provide insights into the local behaviour of attention heads. On this basis, we propose a distinction between local patterns revealed by attention and global patterns that refer back to the ...
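The entry above works directly with per-head attention distributions. As a generic illustration (not the paper's own code), such distributions can be pulled out of a pretrained BERT with the Hugging Face transformers library; the model name and sample sentence below are arbitrary choices:

```python
# Generic recipe for inspecting per-head attention distributions in BERT
# using the Hugging Face transformers library (not the paper's code).
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tok("Attention is not explanation.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions is a tuple with one (batch, heads, seq, seq) tensor per layer
layer, head = 0, 0
dist = out.attentions[layer][0, head]   # attention distribution of one head
print(dist.sum(dim=-1))                 # each row sums to 1 (softmax over keys)
```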
- Of Non-Linearity and Commutativity in BERT
  (2021) 2021 International Joint Conference on Neural Networks (IJCNN). Conference paper.
  In this work we provide new insights into the transformer architecture, and in particular, its best-known variant, BERT. First, we propose a method to measure the degree of non-linearity of different elements of transformers. Next, we focus our investigation on the feed-forward networks (FFN) inside transformers, which contain two thirds of the model parameters and have so far not received much attention. We find that FFNs are an inefficient ...
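The snippet above does not spell out the proposed measure, so here is one hypothetical way to quantify how non-linear a module is: report the relative residual of the best least-squares linear fit to its input/output behaviour. The function name, the toy FFN, and all shapes are illustrative assumptions, not the paper's method:

```python
# Hypothetical non-linearity measure: relative error of the best linear
# approximation of a module's input/output behaviour (0 = perfectly linear).
import torch

def nonlinearity_score(f, inputs):
    """Relative residual of the best linear fit to f on the given inputs."""
    with torch.no_grad():
        X = inputs.flatten(1)                     # (batch, d_in)
        Y = f(inputs).flatten(1)                  # (batch, d_out)
        ones = torch.ones(X.shape[0], 1)
        Xb = torch.cat([X, ones], dim=1)          # allow a bias term
        W = torch.linalg.lstsq(Xb, Y).solution    # best linear map X -> Y
        residual = Y - Xb @ W
        return (residual.norm() / Y.norm()).item()

ffn = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.GELU(),
                          torch.nn.Linear(64, 16))
x = torch.randn(512, 16)
print(nonlinearity_score(ffn, x))                     # > 0 because of the GELU
print(nonlinearity_score(torch.nn.Linear(16, 16), x))  # ~ 0, purely linear
```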
- Medley2K: A Dataset of Medley Transitions
  (2020) Proceedings MML 2020: 13th International Workshop on Machine Learning and Music at ECML/PKDD 2020. Conference paper.
- Neural Symbolic Music Genre Transfer Insights
  (2020) Communications in Computer and Information Science ~ Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Conference paper.
  Transferring a song from one genre to another is most difficult if no instrumentation information is provided and genre is only defined by the timing and pitch of the played notes. Inspired by the CycleGAN music genre transfer presented in [2], we investigate whether recent additions to GAN training, like spectral normalization and self-attention, can improve transfer. Our preliminary results show that spectral normalization improves audible ...
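For readers unfamiliar with the GAN additions named above, the sketch below shows standard PyTorch usage of spectral normalization in a small discriminator. The architecture is an arbitrary toy, not the model from the paper; a self-attention layer would be inserted into the same stack analogously:

```python
# Standard usage of spectral normalization in a toy GAN discriminator.
# spectral_norm constrains each layer's Lipschitz constant, which is known
# to stabilize GAN training (Miyato et al., 2018).
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class Discriminator(nn.Module):
    def __init__(self, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            spectral_norm(nn.Conv2d(channels, 64, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            spectral_norm(nn.Conv2d(64, 128, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            spectral_norm(nn.Conv2d(128, 1, 4)),
        )

    def forward(self, x):
        return self.net(x)

d = Discriminator()
print(d(torch.randn(8, 1, 64, 64)).shape)  # torch.Size([8, 1, 13, 13])
```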
- Attentive Multi-Task Deep Reinforcement Learning
  (2020) Lecture Notes in Computer Science ~ Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Conference paper.
  Sharing knowledge between tasks is vital for efficient learning in a multi-task setting. However, most research so far has focused on the easier case where knowledge transfer is not harmful, i.e., where knowledge from one task cannot negatively impact the performance on another task. In contrast, we present an approach to multi-task deep reinforcement learning based on attention that does not require any a priori assumptions about the ...
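The snippet above does not describe the architecture, so the following is a purely hypothetical sketch of attention-based sharing in a multi-task policy: a learned task embedding attends over a set of shared sub-networks and mixes their features. All names and dimensions are made up for illustration and should not be read as the paper's design:

```python
# Hypothetical attention-based multi-task policy: a task embedding attends
# over shared expert sub-networks, so tasks choose what to share.
import torch
import torch.nn as nn

class AttentiveMultiTaskPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, n_tasks, n_experts=4, hidden=64):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
            for _ in range(n_experts))
        self.task_emb = nn.Embedding(n_tasks, hidden)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs, task_id):
        feats = torch.stack([e(obs) for e in self.experts], dim=1)  # (B, E, H)
        query = self.task_emb(task_id).unsqueeze(1)                 # (B, 1, H)
        scores = (query * feats).sum(-1) / feats.shape[-1] ** 0.5   # (B, E)
        attn = scores.softmax(-1).unsqueeze(-1)                     # (B, E, 1)
        mixed = (attn * feats).sum(1)                               # (B, H)
        return self.head(mixed)                                     # action logits

policy = AttentiveMultiTaskPolicy(obs_dim=8, act_dim=4, n_tasks=3)
obs = torch.randn(5, 8)
task = torch.randint(0, 3, (5,))
print(policy(obs, task).shape)  # torch.Size([5, 4])
```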
- Monaural Music Source Separation using a ResNet Latent Separator Network
  (2020) 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI). Conference paper.
- On Identifiability in Transformers
  (2020) Conference paper.
  In this paper we delve deep into the Transformer architecture by investigating two of its core components: self-attention and contextual embeddings. In particular, we study the identifiability of attention weights and token embeddings, and the aggregation of context into hidden tokens. We show that, for sequences longer than the attention head dimension, attention weights are not identifiable. We propose effective attention as a complementary ...
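The non-identifiability claim admits a short linear-algebra illustration: a head's output is a convex combination a·V of n value vectors in d dimensions, so once n > d + 1 one can perturb the weights inside the left null space of V (while keeping them a valid probability distribution) without changing the output. A minimal numpy sketch of that argument, with sizes chosen arbitrarily:

```python
# Toy illustration (not the paper's code): for n tokens and head dimension d
# with n > d + 1, two different attention distributions give the exact same
# attention output a @ V, because V has a non-trivial left null space.
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 2                       # sequence length > head dimension
V = rng.normal(size=(n, d))       # value vectors, one per token

# Find delta with V^T delta = 0 and sum(delta) = 0, so that a + eps*delta
# is still a valid probability distribution with the same output.
A = np.vstack([V.T, np.ones((1, n))])  # delta must lie in A's null space
_, _, vh = np.linalg.svd(A)
delta = vh[-1]                         # a null-space basis vector

a = np.full(n, 1.0 / n)                # uniform attention weights
a2 = a + 0.1 * delta / np.abs(delta).max()  # a different valid distribution

assert np.all(a2 >= 0) and np.isclose(a2.sum(), 1.0)
print(np.allclose(a @ V, a2 @ V))      # True: same output, different weights
```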
- Tunnel Vision Attack on IMPALA - Questioning the Robustness of Reinforcement Learning Agents
  (2019) Conference paper.
- Swimming Style Recognition and Lap Counting Using a Smartwatch and Deep Learning
  (2019) ISWC '19: Proceedings of the 23rd International Symposium on Wearable Computers. Conference paper.
- The Urban Last Mile Problem: Autonomous Drone Delivery to Your Balcony
  (2019) International Conference on Unmanned Aircraft Systems (ICUAS 2019). Conference paper.