Systematic investigation of synthetic operon designs enables prediction and control of expression levels of multiple proteins
OPEN ACCESS
Date
2022-06-10
Publication Type
Working Paper
ETH Bibliography
yes
Abstract
Controlling the expression levels of multiple recombinant proteins for optimal performance is crucial for synthetic biosystems but remains difficult given the large number of DNA-encoded factors that influence the process of gene expression from transcription to translation. In bacterial hosts, biosystems can be economically encoded as operons, but the sequence requirements for exact tuning of expression levels in an operon remain unclear. Here, we demonstrate the extent and predictability of protein-level variation using diverse arrangements of twelve genes to generate 88 synthetic operons with up to seven genes at varying inducer concentrations. The resulting 2772 protein expression measurements allowed the training of a sequence-based machine learning model that explains 83% of the variation in the data with a mean absolute error of 9% relative to reference constructs, making it a useful tool for protein expression prediction. Feature importance analysis indicates that operon length, gene position and gene junction structure are of major importance for protein expression.
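The two summary figures in the abstract (share of variation explained, and mean absolute error) can be illustrated with a minimal sketch. It assumes that "explains 83% of the variation" refers to a coefficient-of-determination-style metric, which the abstract does not state explicitly. The function names and the sample values below are hypothetical and are not taken from the paper's data.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is one common way to quantify "variation explained";
    the working paper may define its metric differently.
    """
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


# Hypothetical expression levels, expressed relative to a reference
# construct (1.0 = same level as the reference). Illustrative only.
observed = [1.0, 0.8, 0.5, 0.3, 0.1]
predicted = [0.9, 0.7, 0.6, 0.3, 0.2]

print(f"R^2: {r_squared(observed, predicted):.3f}")
print(f"MAE: {mean_absolute_error(observed, predicted):.3f}")
```

With expression levels normalised to a reference construct, an MAE of 0.09 would correspond to the 9% figure quoted above, under the same assumption about how the metric is defined.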
Publication status
published
Publisher
Cold Spring Harbor Laboratory
Edition / version
v1
Subject
Synthetic biology; Biotechnology; Synthetic operons; Combinatorial DNA assembly; Machine learning; Protein expression
Organisational unit
03602 - Panke, Sven / Panke, Sven
Funding
289326 - Standarization and orthogonalization of the gene expression flow for robust engineering of NTN (new-to-nature) biological properties. (EC)