Antibody engineering by combining genome editing, deep sequencing, and deep learning
Open access
Author
Date
2020
Type
Doctoral Thesis
ETH Bibliography
yes
Abstract
Monoclonal antibodies are one of the fastest-growing classes of therapeutic drugs in today’s pharmaceutical market because of their use across a wide range of disease indications. Despite their unparalleled success, therapeutic antibody engineering and optimization remains a slow and disjointed process that relies on a multitude of experimental approaches that are both labor- and resource-intensive. By combining the three most impactful techniques in the life sciences of the past 20 years, namely 1) genome editing, 2) deep sequencing, and 3) deep learning, we move beyond traditional experimental screening approaches and have developed a fundamentally new approach to augment the therapeutic antibody engineering and optimization process.
One of the major issues associated with antibody engineering is that it relies extensively on directed evolution approaches, which are typically constrained to phage or yeast display screening systems. These microbial expression hosts are used in large part because of their ability to stably replicate plasmid DNA and therefore accommodate large recombinant libraries of antibody variants. However, phage and yeast lack the capacity to express full-length antibody molecules or to provide post-translational modifications (e.g., glycosylation). Since nearly all therapeutic antibodies are ultimately expressed in full-length format in mammalian cells, phage- and yeast-derived antibodies often have different biophysical properties that require additional optimization when transferred to mammalian cells.
In this thesis, I describe a novel approach to engineering antibodies directly in mammalian cells by taking advantage of recent advances in genome editing, namely CRISPR-Cas9. We developed homology-directed mutagenesis, a novel method for the targeted mutagenesis of genes directly in the genome of mammalian cells. By applying this technique to a previously developed, mammalian-based antibody expression and display platform, we are now able to engineer and optimize antibody molecules in their final therapeutic format.
Although it is now possible to build libraries directly in mammalian cells with homology-directed mutagenesis, there still exists a vast discrepancy between the theoretical protein sequence space of an antibody and what is experimentally achievable, which severely limits antibody optimization efforts when multi-parameter optimization is desired. To bridge this gap, we have devised a workflow that combines homology-directed mutagenesis, mammalian display screening, and deep sequencing in order to train highly accurate deep learning models capable of predicting the antigen-binding status of an antibody sequence. Deep learning models allowed us to interrogate nearly the entire mutational space of an antibody region and identify millions of variants that retained binding to the target antigen. These antigen-binding variants predicted by deep learning were subsequently optimized across multiple parameters important for drug developability.
Intrigued by the predictive power of deep learning to identify antigen-binding sequences and its potentially greater impact on the field of protein engineering, we sought to further explore the limitations and influences imposed on model development by the size and quality of training data. By systematically evaluating a diverse set of commonly applied machine learning approaches, we revealed that only a small fraction of the sequences covering the available sequence space is needed to train highly predictive models, and we highlighted the computational costs and tradeoffs of each model. As the throughput of linking genotype-phenotype information in directed evolution experiments continues to increase, we anticipate the inevitable adoption of machine learning into many protein engineering workflows. We hope the insights highlighted in our studies provide the foundation to assist with this next step in the field.
Permanent link
https://doi.org/10.3929/ethz-b-000440885
Publication status
published
Publisher
ETH Zurich
Subject
Antibody engineering; Genome editing; Machine learning
Organisational unit
03952 - Reddy, Sai / Reddy, Sai