dc.contributor.author
Brunner, Gino
dc.contributor.author
Liu, Yang
dc.contributor.author
Pascual, Damian
dc.contributor.author
Richter, Oliver
dc.contributor.author
Ciaramita, Massimiliano
dc.contributor.author
Wattenhofer, Roger
dc.date.accessioned
2020-08-28T12:50:00Z
dc.date.available
2020-08-25T10:24:40Z
dc.date.available
2020-08-28T12:50:00Z
dc.date.issued
2020
dc.identifier.uri
http://hdl.handle.net/20.500.11850/432547
dc.description.abstract
In this paper we delve deep into the Transformer architecture by investigating two of its core components: self-attention and contextual embeddings. In particular, we study the identifiability of attention weights and token embeddings, and the aggregation of context into hidden tokens. We show that, for sequences longer than the attention head dimension, attention weights are not identifiable. We propose effective attention as a complementary tool for improving explanatory interpretations based on attention. Furthermore, we show that input tokens retain their identity to a large degree across the model. We also find evidence suggesting that identity information is mainly encoded in the angle of the embeddings and gradually decreases with depth. Finally, we demonstrate strong mixing of input information in the generation of contextual embeddings by means of a novel quantification method based on gradient attribution. Overall, we show that self-attention distributions are not directly interpretable and present tools to better understand and further investigate Transformer models.
en_US
dc.language.iso
en
en_US
dc.publisher
International Conference on Learning Representations
en_US
dc.subject
Attention
en_US
dc.subject
Generation
en_US
dc.subject
Interpretability
en_US
dc.subject
NLP
en_US
dc.subject
Self attention
en_US
dc.subject
Transformer
en_US
dc.title
On Identifiability in Transformers
en_US
dc.type
Conference Paper
ethz.size
35 p.
en_US
ethz.event
8th International Conference on Learning Representations (ICLR 2020) (virtual)
en_US
ethz.event.location
Addis Ababa, Ethiopia
en_US
ethz.event.date
April 26-30, 2020
en_US
ethz.notes
Due to the coronavirus (COVID-19) pandemic, the conference was conducted virtually.
en_US
ethz.publication.place
s.l.
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02640 - Inst. f. Technische Informatik und Komm. / Computer Eng. and Networks Lab.::03604 - Wattenhofer, Roger / Wattenhofer, Roger
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02640 - Inst. f. Technische Informatik und Komm. / Computer Eng. and Networks Lab.::03604 - Wattenhofer, Roger / Wattenhofer, Roger
en_US
ethz.identifier.url
https://iclr.cc/virtual_2020/poster_BJg1f6EFDB.html#details
ethz.date.deposited
2020-08-25T10:24:59Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Metadata only
en_US
ethz.rosetta.installDate
2020-08-28T12:50:11Z
ethz.rosetta.lastUpdated
2020-08-28T12:50:11Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=On%20Identifiability%20in%20Transformers&rft.date=2020&rft.au=Brunner,%20Gino&Liu,%20Yang&Pascual,%20Damian&Richter,%20Oliver&Ciaramita,%20Massimiliano&rft.genre=proceeding&rft.btitle=On%20Identifiability%20in%20Transformers
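
The abstract's non-identifiability claim rests on a rank argument: in a single attention head, the output is the product of the row-stochastic attention matrix A (d_s x d_s) and the value matrix V (d_s x d_v). When the sequence length d_s exceeds the head dimension d_v, V has a non-trivial left null space, so distinct attention matrices can produce exactly the same head output. The NumPy sketch below is an editor-added illustration with hypothetical dimensions, not code from the paper; it ignores the simplex constraints on attention rows (which the paper treats explicitly) and only demonstrates that the map from attention weights to head outputs is many-to-one.

import numpy as np

rng = np.random.default_rng(0)
d_s, d_v = 10, 4                          # sequence length > head dimension (hypothetical sizes)

V = rng.normal(size=(d_s, d_v))           # value vectors, one row per token
A = rng.random(size=(d_s, d_s))
A /= A.sum(axis=1, keepdims=True)         # row-stochastic attention weights

# Basis of the left null space of V: vectors n with n @ V = 0.
_, _, vt = np.linalg.svd(V.T)
null_basis = vt[np.linalg.matrix_rank(V):]    # d_s - d_v null directions

# Perturb every row of A along one null direction; the head output A @ V is unchanged.
# Note: A_alt is generally no longer a valid probability distribution per row.
delta = np.outer(rng.normal(size=d_s), null_basis[0])
A_alt = A + 0.1 * delta

print(np.allclose(A @ V, A_alt @ V))      # True: different attention weights, same output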

Files in this item

There are no files associated with this item.
