Automated preparation of nanoscopic structures: Graph-based sequence analysis, mismatch detection, and pH-consistent protonation with uncertainty estimates


Date

2024-04-30

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

Structure and function in nanoscale atomistic assemblies are tightly coupled, and every atom with its specific position and even every electron will have a decisive effect on the electronic structure, and hence, on the molecular properties. Molecular simulations of nanoscopic atomistic structures therefore require accurately resolved three-dimensional input structures. If extracted from experiment, these structures often suffer from severe uncertainties, of which the lack of information on hydrogen atoms is a prominent example. Hence, experimental structures require careful review and curation, which is a time-consuming and error-prone process. Here, we present a fast and robust protocol for the automated structure analysis and pH-consistent protonation, in short, ASAP. For biomolecules as a target, the ASAP protocol integrates sequence analysis and error assessment of a given input structure. ASAP allows for pKₐ prediction from reference data through Gaussian process regression including uncertainty estimation and connects to system-focused atomistic modeling described in Brunken and Reiher (J. Chem. Theory Comput. 16, 2020, 1646). Although focused on biomolecules, ASAP can be extended to other nanoscopic objects, because most of its design elements rely on a general graph-based foundation guaranteeing transferability. The modular character of the underlying pipeline supports different degrees of automation, which allows for (i) efficient feedback loops for human-machine interaction with a low entrance barrier and for (ii) integration into autonomous procedures such as automated force field parametrizations. This facilitates fast switching of the pH-state through on-the-fly system-focused reparametrization during a molecular simulation at virtually no extra computational cost.
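
To illustrate how a pKₐ prediction with an uncertainty estimate could feed a pH-consistent protonation decision, the following minimal sketch applies the standard Henderson-Hasselbalch relation and first-order error propagation. It is not the ASAP implementation: the function names `protonated_fraction` and `assign_state`, the `n_sigma` threshold, and all numerical values are hypothetical, and the pKₐ mean and standard deviation are assumed to come from some regression model, such as a Gaussian process.

```python
import math


def protonated_fraction(ph, pka_mean, pka_std=0.0):
    """Return (fraction, std) of the protonated form of a site at a given pH.

    Uses the Henderson-Hasselbalch relation f = 1 / (1 + 10**(pH - pKa)).
    The uncertainty is propagated to first order from the pKa uncertainty:
    df/dpKa = ln(10) * f * (1 - f).
    """
    frac = 1.0 / (1.0 + 10.0 ** (ph - pka_mean))
    std = math.log(10.0) * frac * (1.0 - frac) * pka_std
    return frac, std


def assign_state(ph, pka_mean, pka_std, n_sigma=2.0):
    """Label a site; flag it for review if pH lies within n_sigma of the pKa.

    The n_sigma threshold is an illustrative assumption, not a value
    taken from the paper.
    """
    if abs(ph - pka_mean) <= n_sigma * pka_std:
        return "uncertain"
    return "protonated" if ph < pka_mean else "deprotonated"


if __name__ == "__main__":
    # Hypothetical predicted pKa values (mean, standard deviation) per site.
    sites = {"site_A": (4.0, 0.5), "site_B": (6.8, 0.5), "site_C": (10.5, 0.5)}
    for name, (mean, std) in sites.items():
        frac, frac_std = protonated_fraction(7.4, mean, std)
        state = assign_state(7.4, mean, std)
        print(f"{name}: f = {frac:.3f} +/- {frac_std:.3f} -> {state}")
```

Sites flagged as "uncertain" in such a scheme are natural candidates for the human-machine feedback loop mentioned in the abstract, whereas clearly assigned sites could be handled fully automatically.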

Publication status

published

Volume

45 (11)

Pages / Article No.

761 - 776

Publisher

Wiley

Subject

atomistic simulation; Gaussian process; machine learning; protein structure

Organisational unit

03736 - Reiher, Markus / Reiher, Markus

Funding

182400 - Exhaustive First-Principles Exploration of Chemical Reaction Networks for Catalysis Design (SNF)
