Antibody engineering by combining genome editing, deep sequencing, and deep learning
dc.contributor.author
Mason, Derek
dc.contributor.supervisor
Reddy, Sai
dc.contributor.supervisor
Panke, Sven
dc.contributor.supervisor
Platt, Randall
dc.date.accessioned
2020-09-17T12:38:58Z
dc.date.available
2020-09-17T08:03:06Z
dc.date.available
2020-09-17T12:38:58Z
dc.date.issued
2020
dc.identifier.uri
http://hdl.handle.net/20.500.11850/440885
dc.identifier.doi
10.3929/ethz-b-000440885
dc.description.abstract
Monoclonal antibodies are one of the fastest-growing classes of therapeutic drugs in today’s pharmaceutical market because of their use across a wide range of disease indications. Even with their unparalleled success, therapeutic antibody engineering and optimization is still a slow and disjointed process that relies on a multitude of experimental approaches that are both labor- and resource-intensive. By combining the three most impactful techniques in the life sciences of the past 20 years, 1) genome editing, 2) deep sequencing, and 3) deep learning, we move beyond traditional experimental screening approaches and have developed a fundamentally new approach to augment the therapeutic antibody engineering and optimization process.
One of the major issues associated with antibody engineering is that it relies extensively on directed evolution approaches, which are typically constrained to the screening systems of phage or yeast display. These microbial expression hosts are used in large part because of their ability to stably replicate plasmid DNA and therefore accommodate large recombinant libraries of antibody variants. However, the drawbacks of phage and yeast are that they lack the capacity to express full-length antibody molecules or to provide post-translational modifications (e.g., glycosylation). Since nearly all therapeutic antibodies are ultimately expressed in full-length format in mammalian cells, phage- and yeast-derived antibodies often have different biophysical properties that require additional optimization when transferred to mammalian cells.
In this thesis, I describe a novel approach to engineering antibodies directly in mammalian cells by taking advantage of recent advances in genome editing, namely CRISPR-Cas9. We developed homology-directed mutagenesis, a novel method for the targeted mutagenesis of genes directly in the genome of mammalian cells. By applying this technique to a previously developed, mammalian-based antibody expression and display platform, we are now able to engineer and optimize antibody molecules in their final therapeutic format.
Although it is now possible to build libraries directly in mammalian cells with homology-directed mutagenesis, there still exists a vast discrepancy between the theoretical protein sequence space of an antibody and what is experimentally achievable, which severely limits antibody optimization efforts when multi-parameter optimization is desired. To bridge this gap, we have devised a workflow that combines homology-directed mutagenesis, mammalian display screening, and deep sequencing in order to train highly accurate deep learning models capable of predicting the antigen-binding status of an antibody sequence. These deep learning models allowed us to interrogate nearly the entire mutational space of an antibody region and identify millions of variants that retained binding to the target antigen. The antigen-binding variants predicted by deep learning were subsequently optimized across multiple parameters important for drug developability.
Intrigued by the predictive power of deep learning to identify antigen-binding sequences and its potentially greater impact on the field of protein engineering, we sought to further explore the limitations and influences imposed on model development by the size and quality of training data. By systematically evaluating a diverse set of commonly applied machine learning approaches, we revealed that only a small fraction of the sequences covering the available sequence space is needed to train highly predictive models, and we highlighted the computational costs and tradeoffs of each model. As the throughput of linking genotype-phenotype information in directed evolution experiments continues to increase, we anticipate the inevitable adoption of machine learning into many protein engineering workflows. We hope the insights highlighted in our studies provide the foundation to assist with this next step in the field.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
ETH Zurich
en_US
dc.rights.uri
http://rightsstatements.org/page/InC-NC/1.0/
dc.subject
Antibody engineering
en_US
dc.subject
Genome editing
en_US
dc.subject
Machine learning
en_US
dc.title
Antibody engineering by combining genome editing, deep sequencing, and deep learning
en_US
dc.type
Doctoral Thesis
dc.rights.license
In Copyright - Non-Commercial Use Permitted
dc.date.published
2020-09-17
ethz.size
161 p.
en_US
ethz.code.ddc
DDC - DDC::6 - Technology, medicine and applied sciences::610 - Medical sciences, medicine
en_US
ethz.identifier.diss
26628
en_US
ethz.publication.place
Zurich
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02060 - Dep. Biosysteme / Dep. of Biosystems Science and Eng.::03952 - Reddy, Sai / Reddy, Sai
en_US
ethz.date.deposited
2020-09-17T08:03:16Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2020-09-17T12:39:20Z
ethz.rosetta.lastUpdated
2021-02-15T17:18:30Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Antibody%20engineering%20by%20combining%20genome%20editing,%20deep%20sequencing,%20and%20deep%20learning&rft.date=2020&rft.au=Mason,%20Derek&rft.genre=unknown&rft.btitle=Antibody%20engineering%20by%20combining%20genome%20editing,%20deep%20sequencing,%20and%20deep%20learning