dc.contributor.author
Born, Jannis
dc.contributor.author
Manica, Matteo
dc.contributor.author
Cadow, Joris
dc.contributor.author
Markert, Greta
dc.contributor.author
Mill, Nil Adell
dc.contributor.author
Filipavicius, Modestas
dc.contributor.author
Janakarajan, Nikita
dc.contributor.author
Cardinale, Antonio
dc.contributor.author
Laino, Teodoro
dc.contributor.author
Rodríguez Martínez, María
dc.date.accessioned
2021-05-06T12:21:44Z
dc.date.available
2021-05-05T04:02:22Z
dc.date.available
2021-05-06T12:21:44Z
dc.date.issued
2021-06
dc.identifier.other
10.1088/2632-2153/abe808
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/482459
dc.identifier.doi
10.3929/ethz-b-000482459
dc.description.abstract
Bridging systems biology and drug design, we propose a deep learning framework for de novo discovery of molecules tailored to bind with given protein targets. Our methodology is exemplified by the task of designing antiviral candidates to target SARS-CoV-2 related proteins. Crucially, our framework does not require fine-tuning for specific proteins but is demonstrated to generalize in proposing ligands with high predicted binding affinities against unseen targets. Coupling our framework with the automatic retrosynthesis prediction of IBM RXN for Chemistry, we demonstrate the feasibility of swift chemical synthesis of molecules with potential antiviral properties that were designed against a specific protein target. In particular, we synthesize an antiviral candidate designed against the host protein angiotensin-converting enzyme 2 (ACE2), a surface receptor on human respiratory epithelial cells that facilitates SARS-CoV-2 cell entry through its spike glycoprotein. This is achieved as follows. First, we train a multimodal ligand–protein binding affinity model on predicting affinities of bioactive compounds to target proteins and couple this model with pharmacological toxicity predictors. Exploiting this multi-objective as a reward function of a conditional molecular generator that consists of two variational autoencoders (VAEs), our framework steers the generation toward regions of the chemical space with high-reward molecules. Specifically, we explore a challenging setting of generating ligands against unseen protein targets by performing a leave-one-out cross-validation on 41 SARS-CoV-2-related target proteins. Using deep reinforcement learning, it is demonstrated that in 35 out of 41 cases, the generation is biased towards sampling binding ligands, with an average increase of 83% compared to an unbiased VAE. The generated molecules exhibit favorable properties in terms of target binding affinity, selectivity and drug-likeness.
We use molecular retrosynthetic models to provide a synthetic accessibility assessment of the best generated hit molecules. Finally, with this end-to-end framework, we synthesize 3-Bromobenzylamine, a potential inhibitor of the host ACE2 protein, solely based on the recommendations of a molecular retrosynthesis model and a synthesis protocol prediction model. We hope that our framework can contribute towards swift discovery of de novo molecules with desired pharmacological properties.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
Institute of Physics
en_US
dc.rights.uri
http://creativecommons.org/licenses/by/4.0/
dc.subject
Drug discovery
en_US
dc.subject
Deep learning
en_US
dc.subject
COVID-19
en_US
dc.subject
Generative chemistry
en_US
dc.subject
SARS-CoV-2
en_US
dc.subject
Machine learning
en_US
dc.subject
Compound protein
en_US
dc.subject
Interaction
en_US
dc.title
Data-driven molecular design for discovery and synthesis of novel ligands: a case study on SARS-CoV-2
en_US
dc.type
Journal Article
dc.rights.license
Creative Commons Attribution 4.0 International
dc.date.published
2021-03-25
ethz.journal.title
Machine Learning: Science and Technology
ethz.journal.volume
2
en_US
ethz.journal.issue
2
en_US
ethz.pages.start
025024
en_US
ethz.size
15 p.
en_US
ethz.version.deposit
publishedVersion
en_US
ethz.identifier.scopus
ethz.publication.place
Bristol
en_US
ethz.publication.status
published
en_US
ethz.date.deposited
2021-05-05T04:02:36Z
ethz.source
SCOPUS
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2021-05-06T12:21:55Z
ethz.rosetta.lastUpdated
2022-03-29T07:08:20Z
ethz.rosetta.versionExported
true