GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

Frantar, Elias; Ashkboos, Saleh; Hoefler, Torsten; Alistarh, Dan

Metadata only

Date

2023

Type

Conference Paper

ETH Bibliography

yes

Abstract

Generative Pre-trained Transformer models, known as GPT or OPT, set themselves apart through breakthrough performance across complex language modelling tasks, but also by their extremely high computational and storage costs. Specifically, due to their massive size, even inference for large, highly-…

Publication status

published

External links

https://openreview.net/forum?id=tcbBPnfwxS
https://iclr.cc/virtual/2023/poster/10855

Book title

The Eleventh International Conference on Learning Representations

Publisher

OpenReview

Event

11th International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, May 1-5, 2023

Subject

compression; quantization; generative pre-trained transformers; GPT; second-order methods

Organisational unit

03950 - Hoefler, Torsten / Hoefler, Torsten

Related publications and datasets

Is new version of: https://doi.org/10.48550/arXiv.2210.17323
