Active learning for computational chemogenomics


Date

2017-03

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

Aim: Computational chemogenomics models the compound–protein interaction space, typically for drug discovery, where existing methods predominantly either incorporate increasing numbers of bioactivity samples or focus on specific subfamilies of proteins and ligands. As an alternative to modeling entire large datasets at once, active learning adaptively incorporates a minimum of informative examples for modeling, yielding compact but high quality models.

Results/methodology: We assessed active learning for protein/target family-wide chemogenomic modeling by replicate experiment. Results demonstrate that small yet highly predictive models can be extracted from only 10–25% of large bioactivity datasets, irrespective of molecule descriptors used.

Conclusion: Chemogenomic active learning identifies small subsets of ligand–target interactions in a large screening database that lead to knowledge discovery and highly predictive models.
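
The adaptive selection idea in the abstract can be illustrated with a minimal pool-based active learning loop. This is only a sketch under stated assumptions and not the authors' implementation: the abstract does not name the model or the selection strategy, so a nearest-centroid classifier and uncertainty sampling are swapped in here for brevity. The 2-D synthetic points stand in for descriptor vectors of compound–protein pairs, and the 20% labeling budget is a hypothetical value chosen from within the 10–25% range reported above. All function names are illustrative.

```python
import random

def make_pairs(n, seed):
    """Synthetic stand-in for descriptor vectors of compound-protein pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = active, 0 = inactive (assumed encoding)
        center = 2.0 if label else -2.0
        features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        pairs.append((features, label))
    return pairs

def fit_centroids(labeled):
    """Toy model (an assumption): the mean feature vector of each class."""
    model = {}
    for cls in (0, 1):
        rows = [x for x, y in labeled if y == cls]
        model[cls] = tuple(sum(col) / len(rows) for col in zip(*rows))
    return model

def margin(model, x):
    """Signed score: positive favors class 1, near zero means uncertain."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, model[0]))
    d1 = sum((a - b) ** 2 for a, b in zip(x, model[1]))
    return d0 - d1

def active_learning(pool, fraction=0.2):
    """Label the most uncertain pool example each round until the budget is spent."""
    budget = max(2, int(fraction * len(pool)))
    unlabeled = list(pool)
    labeled = []
    # Seed with one example per class so the centroid model can be fit.
    for cls in (0, 1):
        idx = next(i for i, (_, y) in enumerate(unlabeled) if y == cls)
        labeled.append(unlabeled.pop(idx))
    while len(labeled) < budget and unlabeled:
        model = fit_centroids(labeled)
        # Uncertainty sampling: pick the example with the smallest absolute margin.
        idx = min(range(len(unlabeled)),
                  key=lambda i: abs(margin(model, unlabeled[i][0])))
        labeled.append(unlabeled.pop(idx))  # "query" its label
    return fit_centroids(labeled), labeled

def accuracy(model, data):
    """Fraction of examples whose predicted class matches the label."""
    hits = sum((margin(model, x) > 0) == bool(y) for x, y in data)
    return hits / len(data)

pool = make_pairs(200, seed=0)
holdout = make_pairs(200, seed=1)
model, picked = active_learning(pool, fraction=0.2)
print(len(picked), "of", len(pool), "pool examples labeled")
print("hold-out accuracy:", round(accuracy(model, holdout), 3))
```

The loop retrains after every queried example, which keeps the sketch short but would be costly on a real screening database; batch selection is a common alternative.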

Publication status

published

Volume

9 (4)

Pages / Article No.

381 - 402

Publisher

Future Science

Subject

chemogenomics; computational chemistry and modeling; virtual screening

Organisational unit

03852 - Schneider, Gisbert / Schneider, Gisbert
