Telling BERT's Full Story: from Local Attention to Global Aggregation
(2021) Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
We take a deep look into the behaviour of self-attention heads in the transformer architecture. In light of recent work discouraging the use of attention distributions for explaining a model’s behaviour, we show that attention distributions can nevertheless provide insights into the local behaviour of attention heads. This way, we propose a distinction between local patterns revealed by attention and global patterns that refer back to the ...
Conference Paper
Of Non-Linearity and Commutativity in BERT
(2021) 2021 International Joint Conference on Neural Networks (IJCNN)
In this work we provide new insights into the transformer architecture, and in particular, its best-known variant, BERT. First, we propose a method to measure the degree of non-linearity of different elements of transformers. Next, we focus our investigation on the feed-forward networks (FFN) inside transformers, which contain two thirds of the model parameters and have so far not received much attention. We find that FFNs are an inefficient ...
Conference Paper
Attentive Multi-Task Deep Reinforcement Learning
(2020) Lecture Notes in Computer Science ~ Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019
Sharing knowledge between tasks is vital for efficient learning in a multi-task setting. However, most research so far has focused on the easier case where knowledge transfer is not harmful, i.e., where knowledge from one task cannot negatively impact the performance on another task. In contrast, we present an approach to multi-task deep reinforcement learning based on attention that does not require any a priori assumptions about the ...
Conference Paper
The Urban Last Mile Problem: Autonomous Drone Delivery to Your Balcony
(2019) International Conference on Unmanned Aircraft Systems (ICUAS 2019)
Conference Paper
Disentangling the Latent Space of (Variational) Autoencoders for NLP
(2018) Advances in Intelligent Systems and Computing ~ Proceedings of the 18th UK Workshop on Computational Intelligence 2018
Conference Paper
MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer
(2018) Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR 2018)
We introduce MIDI-VAE, a neural network model based on Variational Autoencoders that is capable of handling polyphonic music with multiple instrument tracks, as well as modeling the dynamics of music by incorporating note durations and velocities. We show that MIDI-VAE can perform style transfer on symbolic music by automatically changing pitches, dynamics and instruments of a music piece from, e.g., a Classical to a ...
Conference Paper
Monaural Music Source Separation using a ResNet Latent Separator Network
(2020) 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)
Conference Paper
Using State Predictions for Value Regularization in Curiosity Driven Deep Reinforcement Learning
(2018) 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)
Conference Paper
Medley2K: A Dataset of Medley Transitions
(2020) Proceedings MML 2020: 13th International Workshop on Machine Learning and Music at ECML/PKDD 2020
Conference Paper
Natural Language Multitasking - Analyzing and Improving Syntactic Saliency of Hidden Representations
(2018) arXiv
Conference Paper