Foundational Challenges in Assuring Alignment and Safety of Large Language Models


Date

2024

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose 200+ concrete research questions.

Publication status

published

Publisher

OpenReview

Organisational unit

09764 - Tramèr, Florian / Tramèr, Florian
