Text-based representations with interpretable machine learning reveal structure-property relationships of polybenzenoid hydrocarbons
Metadata only
Date
2023-01
Type
- Journal Article
Abstract
New tools are developed and applied to enable the use of interpretable machine learning for the investigation of structure-property relationships in polybenzenoid hydrocarbons (PBHs). A textual molecular representation, based on the annulation sequence of PBHs, is shown to be useful either in its textual form or as a basis for a curated feature vector. Both forms display interpretability exceeding that achievable with the standard SMILES representation, and the former also has increased predictive accuracy. A recently developed model, CUSTODI, was applied for the first time as an interpretable model, identifying important structural features that impact various electronic molecular properties. The resulting insights not only validate several well-known "rules of thumb" of organic chemistry but also reveal new behaviors and influential structural motifs, thus providing guiding principles for the rational design and fine-tuning of PBHs.
Publication status
published
Journal / series
Journal of Physical Organic Chemistry
Publisher
Wiley
Subject
machine learning; molecular design; molecular representation; polycyclic aromatic hydrocarbons; structure-property relationships
Organisational unit
03425 - Chen, Peter / Chen, Peter
Related publications and datasets
Is supplemented by: https://gitlab.com/porannegroup/lalas
Is supplemented by: https://gitlab.com/porannegroup/compas