Search Results
- Telling BERT's Full Story: from Local Attention to Global Aggregation
  (2021) Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. Conference paper.
  We take a deep look into the behaviour of self-attention heads in the transformer architecture. In light of recent work discouraging the use of attention distributions for explaining a model’s behaviour, we show that attention distributions can nevertheless provide insights into the local behaviour of attention heads. On this basis, we propose a distinction between local patterns revealed by attention and global patterns that refer back to the ...
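The entry above works directly with per-head attention distributions. As a generic illustration (not the paper's own code), such distributions can be pulled out of a pretrained BERT with the Hugging Face transformers library; the model name and sample sentence below are arbitrary choices:

```python
# Generic recipe for inspecting per-head attention distributions in BERT
# using the Hugging Face transformers library (not the paper's code).
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tok("Attention is not explanation.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions is a tuple with one (batch, heads, seq, seq) tensor per layer
layer, head = 0, 0
dist = out.attentions[layer][0, head]   # attention distribution of one head
print(dist.sum(dim=-1))                 # each row sums to 1 (softmax over keys)
```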
- Of Non-Linearity and Commutativity in BERT
  (2021) 2021 International Joint Conference on Neural Networks (IJCNN). Conference paper.
  In this work we provide new insights into the transformer architecture, and in particular, its best-known variant, BERT. First, we propose a method to measure the degree of non-linearity of different elements of transformers. Next, we focus our investigation on the feed-forward networks (FFN) inside transformers, which contain two thirds of the model parameters and have so far not received much attention. We find that FFNs are an inefficient ...
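The snippet above does not spell out the proposed measure, so here is one hypothetical way to quantify how non-linear a module is: report the relative residual of the best least-squares linear fit to its input/output behaviour. The function name, the toy FFN, and all shapes are illustrative assumptions, not the paper's method:

```python
# Hypothetical non-linearity measure: relative error of the best linear
# approximation of a module's input/output behaviour (0 = perfectly linear).
import torch

def nonlinearity_score(f, inputs):
    """Relative residual of the best linear fit to f on the given inputs."""
    with torch.no_grad():
        X = inputs.flatten(1)                     # (batch, d_in)
        Y = f(inputs).flatten(1)                  # (batch, d_out)
        ones = torch.ones(X.shape[0], 1)
        Xb = torch.cat([X, ones], dim=1)          # allow a bias term
        W = torch.linalg.lstsq(Xb, Y).solution    # best linear map X -> Y
        residual = Y - Xb @ W
        return (residual.norm() / Y.norm()).item()

ffn = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.GELU(),
                          torch.nn.Linear(64, 16))
x = torch.randn(512, 16)
print(nonlinearity_score(ffn, x))                     # > 0 because of the GELU
print(nonlinearity_score(torch.nn.Linear(16, 16), x))  # ~ 0, purely linear
```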
- Medley2K: A Dataset of Medley Transitions
  (2020) Proceedings MML 2020: 13th International Workshop on Machine Learning and Music at ECML/PKDD 2020. Conference paper.
- Neural Symbolic Music Genre Transfer Insights
  (2020) Communications in Computer and Information Science ~ Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Conference paper.
  Transferring a song from one genre to another is most difficult if no instrumentation information is provided and genre is only defined by the timing and pitch of the played notes. Inspired by the CycleGAN music genre transfer presented in [2], we investigate whether recent additions to GAN training, like spectral normalization and self-attention, can improve transfer. Our preliminary results show that spectral normalization improves audible ...
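For readers unfamiliar with the GAN additions named above, the sketch below shows standard PyTorch usage of spectral normalization in a small discriminator. The architecture is an arbitrary toy, not the model from the paper; a self-attention layer would be inserted into the same stack analogously:

```python
# Standard usage of spectral normalization in a toy GAN discriminator.
# spectral_norm constrains each layer's Lipschitz constant, which is known
# to stabilize GAN training (Miyato et al., 2018).
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class Discriminator(nn.Module):
    def __init__(self, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            spectral_norm(nn.Conv2d(channels, 64, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            spectral_norm(nn.Conv2d(64, 128, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2),
            spectral_norm(nn.Conv2d(128, 1, 4)),
        )

    def forward(self, x):
        return self.net(x)

d = Discriminator()
print(d(torch.randn(8, 1, 64, 64)).shape)  # torch.Size([8, 1, 13, 13])
```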
- Attentive Multi-Task Deep Reinforcement Learning
  (2020) Lecture Notes in Computer Science ~ Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Conference paper.
  Sharing knowledge between tasks is vital for efficient learning in a multi-task setting. However, most research so far has focused on the easier case where knowledge transfer is not harmful, i.e., where knowledge from one task cannot negatively impact the performance on another task. In contrast, we present an approach to multi-task deep reinforcement learning based on attention that does not require any a priori assumptions about the ...
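The snippet above does not describe the architecture, so the following is a purely hypothetical sketch of attention-based sharing in a multi-task policy: a learned task embedding attends over a set of shared sub-networks and mixes their features. All names and dimensions are made up for illustration and should not be read as the paper's design:

```python
# Hypothetical attention-based multi-task policy: a task embedding attends
# over shared expert sub-networks, so tasks choose what to share.
import torch
import torch.nn as nn

class AttentiveMultiTaskPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, n_tasks, n_experts=4, hidden=64):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
            for _ in range(n_experts))
        self.task_emb = nn.Embedding(n_tasks, hidden)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs, task_id):
        feats = torch.stack([e(obs) for e in self.experts], dim=1)  # (B, E, H)
        query = self.task_emb(task_id).unsqueeze(1)                 # (B, 1, H)
        scores = (query * feats).sum(-1) / feats.shape[-1] ** 0.5   # (B, E)
        attn = scores.softmax(-1).unsqueeze(-1)                     # (B, E, 1)
        mixed = (attn * feats).sum(1)                               # (B, H)
        return self.head(mixed)                                     # action logits

policy = AttentiveMultiTaskPolicy(obs_dim=8, act_dim=4, n_tasks=3)
obs = torch.randn(5, 8)
task = torch.randint(0, 3, (5,))
print(policy(obs, task).shape)  # torch.Size([5, 4])
```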
- Monaural Music Source Separation using a ResNet Latent Separator Network
  (2020) 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI). Conference paper.
- On Identifiability in Transformers
  (2020) Conference paper.
  In this paper we delve deep into the Transformer architecture by investigating two of its core components: self-attention and contextual embeddings. In particular, we study the identifiability of attention weights and token embeddings, and the aggregation of context into hidden tokens. We show that, for sequences longer than the attention head dimension, attention weights are not identifiable. We propose effective attention as a complementary ...
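The non-identifiability claim admits a short linear-algebra illustration: a head's output is a convex combination a·V of n value vectors in d dimensions, so once n > d + 1 one can perturb the weights inside the left null space of V (while keeping them a valid probability distribution) without changing the output. A minimal numpy sketch of that argument, with sizes chosen arbitrarily:

```python
# Toy illustration (not the paper's code): for n tokens and head dimension d
# with n > d + 1, two different attention distributions give the exact same
# attention output a @ V, because V has a non-trivial left null space.
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 2                       # sequence length > head dimension
V = rng.normal(size=(n, d))       # value vectors, one per token

# Find delta with V^T delta = 0 and sum(delta) = 0, so that a + eps*delta
# is still a valid probability distribution with the same output.
A = np.vstack([V.T, np.ones((1, n))])  # delta must lie in A's null space
_, _, vh = np.linalg.svd(A)
delta = vh[-1]                         # a null-space basis vector

a = np.full(n, 1.0 / n)                # uniform attention weights
a2 = a + 0.1 * delta / np.abs(delta).max()  # a different valid distribution

assert np.all(a2 >= 0) and np.isclose(a2.sum(), 1.0)
print(np.allclose(a @ V, a2 @ V))      # True: same output, different weights
```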
- Tunnel Vision Attack on IMPALA - Questioning the Robustness of Reinforcement Learning Agents
  (2019) Conference paper.
- Swimming Style Recognition and Lap Counting Using a Smartwatch and Deep Learning
  (2019) ISWC '19: Proceedings of the 23rd International Symposium on Wearable Computers. Conference paper.
- The Urban Last Mile Problem: Autonomous Drone Delivery to Your Balcony
  (2019) International Conference on Unmanned Aircraft Systems (ICUAS 2019). Conference paper.