Meta learning addresses noisy and under-labeled data in machine learning-guided antibody engineering
METADATA ONLY
Date
2024-01-17
Publication Type
Journal Article
ETH Bibliography
yes
Abstract
Machine learning-guided protein engineering is rapidly progressing; however, collecting high-quality, large datasets remains a bottleneck. Directed evolution and protein engineering studies often require extensive experimental processes to eliminate noise and label protein sequence-function data. Meta learning has proven effective in other fields in learning from noisy data via bi-level optimization, given the availability of a small dataset with trusted labels. Here, we leverage meta learning approaches to overcome noisy and under-labeled data and expedite workflows in antibody engineering. We generate yeast display antibody mutagenesis libraries and screen them for target antigen binding, followed by deep sequencing. We then create representative learning tasks, including learning from noisy training data, positive and unlabeled learning, and learning out-of-distribution properties. We demonstrate that meta learning has the potential to reduce experimental screening time and improve the robustness of machine learning models by training with noisy and under-labeled training data.
Publication status
published
Volume
15 (1)
Pages / Article No.
4-18.e4
Publisher
Cell Press
Subject
machine learning; meta learning; protein engineering; antibody engineering; deep sequencing
Funding
197941 - Single-cell profiling of antibody repertoires and transcriptomes from B cells to determine the relationship with antigen-specificity and aging (SNF)