Open access
Date
2022-07
Type
Journal Article
ETH Bibliography
yes
Abstract
In this work we introduce KL-TRANSFORMER, a generic, scalable, data-driven framework for learning the kernel function in Transformers. Our framework approximates the Transformer kernel as a dot product between spectral feature maps and learns the kernel by learning the spectral distribution. This not only enables learning a generic kernel end-to-end, but also reduces the time and space complexity of Transformers from quadratic to linear in the sequence length. We show that KL-TRANSFORMERs achieve performance comparable to existing efficient Transformer architectures, both in terms of accuracy and computational efficiency. Our study also demonstrates that the choice of kernel has a substantial impact on performance, and that kernel-learning variants are competitive alternatives to fixed-kernel Transformers on both long- and short-sequence tasks.
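To make the mechanism concrete, below is a minimal PyTorch sketch of the idea described in the abstract, not the authors' implementation: the softmax attention kernel is replaced by a dot product of random Fourier feature maps whose frequency matrix is a learnable parameter, so training the frequencies amounts to learning the kernel's spectral distribution (via Bochner's theorem), and factoring the computation through the feature maps gives linear rather than quadratic cost in sequence length. The names LearnedSpectralFeatures and linear_kernel_attention, and all hyperparameters, are illustrative assumptions.

import torch
import torch.nn as nn

class LearnedSpectralFeatures(nn.Module):
    # Random Fourier feature map phi(x) whose spectral samples W are
    # trained end-to-end, so the implied shift-invariant kernel
    # k(q, k) ~ phi(q) . phi(k) is learned from data.
    # (Illustrative sketch; the paper's exact parameterisation may differ.)
    def __init__(self, dim: int, num_features: int):
        super().__init__()
        self.W = nn.Parameter(torch.randn(dim, num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        proj = x @ self.W  # (batch, seq, num_features)
        # cos/sin pairs make phi(q) . phi(k) a Monte Carlo estimate of
        # E_w[cos(w . (q - k))], the kernel given by Bochner's theorem.
        return torch.cat([proj.cos(), proj.sin()], dim=-1) / (self.W.shape[1] ** 0.5)

def linear_kernel_attention(q, k, v, feature_map, eps=1e-6):
    # Softmax-free attention: contracting phi(K) with V before touching
    # phi(Q) costs O(n * m * d) instead of the O(n^2 * d) of full attention.
    q_f, k_f = feature_map(q), feature_map(k)      # (batch, n, m)
    kv = torch.einsum('bnm,bnd->bmd', k_f, v)      # sum over keys first
    # Row-wise normaliser; practical implementations typically use positive
    # feature maps so this denominator cannot vanish.
    z = 1.0 / (torch.einsum('bnm,bm->bn', q_f, k_f.sum(dim=1)) + eps)
    return torch.einsum('bnm,bmd,bn->bnd', q_f, kv, z)

# Example: attention over a sequence of 1024 tokens in O(n) time.
feat = LearnedSpectralFeatures(dim=64, num_features=128)
q = k = v = torch.randn(2, 1024, 64)
out = linear_kernel_attention(q, k, v, feat)  # (2, 1024, 64)

Because the key-value contraction is computed once and reused for every query, time and memory grow linearly with sequence length, which is the complexity reduction the abstract refers to.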
Permanent link
https://doi.org/10.3929/ethz-b-000592500
Publication status
published
Journal / series
Transactions on Machine Learning Research
Publisher
OpenReview
Organisational unit
09684 - Sachan, Mrinmaya / Sachan, Mrinmaya