Meta learning addresses noisy and under-labeled data in machine learning-guided antibody engineering



Date

2024-01-17

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

Machine learning-guided protein engineering is rapidly progressing; however, collecting high-quality, large datasets remains a bottleneck. Directed evolution and protein engineering studies often require extensive experimental processes to eliminate noise and label protein sequence-function data. Meta learning has proven effective in other fields at learning from noisy data via bi-level optimization, given the availability of a small dataset with trusted labels. Here, we leverage meta learning approaches to overcome noisy and under-labeled data and expedite workflows in antibody engineering. We generate yeast display antibody mutagenesis libraries and screen them for target antigen binding, followed by deep sequencing. We then create representative learning tasks, including learning from noisy training data, positive and unlabeled learning, and learning out-of-distribution properties. We demonstrate that meta learning has the potential to reduce experimental screening time and improve the robustness of machine learning models by training with noisy and under-labeled training data.
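
The bi-level idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain logistic regression model and a gradient-alignment example-reweighting scheme (the "learning to reweight" family), and every name in it (meta_weights, reweighted_fit, the toy data) is hypothetical. The inner level takes a weighted gradient step on the noisy data; the outer level picks the weights so that this step lowers the loss on the small trusted set. For this model the outer gradient has a closed form: an example's weight is proportional to the dot product of its gradient with the trusted-set gradient, clipped at zero.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def example_grad(theta, x, y):
    """Gradient of the logistic loss for one (x, y) pair: (p - y) * x."""
    p = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
    return [(p - y) * xi for xi in x]

def mean_grad(theta, X, Y):
    """Average logistic-loss gradient over a dataset."""
    grads = [example_grad(theta, x, y) for x, y in zip(X, Y)]
    return [sum(col) / len(grads) for col in zip(*grads)]

def meta_weights(theta, X_noisy, Y_noisy, X_trusted, Y_trusted):
    """Outer level: weight each noisy example by how well its gradient
    aligns with the trusted-set gradient (negative alignment -> 0)."""
    g_trusted = mean_grad(theta, X_trusted, Y_trusted)
    raw = []
    for x, y in zip(X_noisy, Y_noisy):
        g = example_grad(theta, x, y)
        raw.append(max(0.0, sum(a * b for a, b in zip(g, g_trusted))))
    total = sum(raw)
    # Normalize so the weights sum to 1; leave all-zero weights untouched.
    return [r / total for r in raw] if total > 0 else raw

def reweighted_fit(X_noisy, Y_noisy, X_trusted, Y_trusted, lr=0.5, steps=20):
    """Inner level: gradient descent on the reweighted noisy loss."""
    theta = [0.0] * len(X_noisy[0])
    for _ in range(steps):
        w = meta_weights(theta, X_noisy, Y_noisy, X_trusted, Y_trusted)
        grads = [example_grad(theta, x, y) for x, y in zip(X_noisy, Y_noisy)]
        step = [sum(wi * g[j] for wi, g in zip(w, grads))
                for j in range(len(theta))]
        theta = [t - lr * s for t, s in zip(theta, step)]
    return theta

# Toy data (made up for illustration): the second noisy example is a
# binder-like point whose label was flipped from 1 to 0.
X_noisy = [[2.0, 2.0], [2.0, 2.0], [-2.0, -2.0]]
Y_noisy = [1, 0, 0]
X_trusted = [[2.0, 2.0], [-2.0, -2.0]]
Y_trusted = [1, 0]

weights = meta_weights([0.0, 0.0], X_noisy, Y_noisy, X_trusted, Y_trusted)
theta = reweighted_fit(X_noisy, Y_noisy, X_trusted, Y_trusted)
```

On this toy data the flipped example receives weight 0 at the starting point, because its gradient points against the trusted-set gradient, so it does not influence the update. Real sequence-function models would replace the closed-form alignment with automatic differentiation through the inner update.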

Publication status

published

Journal / series

Volume

15 (1)

Pages / Article No.

4 - 180000

Publisher

Cell Press

Subject

machine learning; meta learning; protein engineering; antibody engineering; deep sequencing

Funding

197941 - Single-cell profiling of antibody repertoires and transcriptomes from B cells to determine the relationship with antigen-specificity and aging (SNF)