Novel Methods to Engineer the Substrate Specificity of Proteases


Date

2024

Publication Type

Doctoral Thesis

ETH Bibliography

yes

Abstract

Highly specific proteases enable the targeted cleavage of polypeptides and have found various applications in fields such as biotechnology, biological research, and medicine. However, commercially used proteases still rely mainly on naturally occurring substrate specificities. The ability to engineer proteases to selectively cleave new targets would substantially broaden their utility. For example, in the field of proteomics, engineered proteases that cleave sites containing post-translational modifications have already led to new discoveries. A particularly exciting prospect is the application of designer proteases in therapeutic interventions, where they could one day be used to cleave disease-related targets. In recent years, directed evolution has led to remarkable achievements in equipping proteins with improved or novel functions. Key to such engineering efforts is testing a diverse pool of variants for the desired activity and isolating improved candidates. In Chapter 1, we review currently available high-throughput platforms that are particularly suitable for developing proteases with reprogrammed specificity. We describe two major limitations of those methods and develop two novel high-throughput platforms, each addressing one of these shortcomings. First, standard screening assays do not allow testing for cleavage of multiple undesired substrates to be integrated into the screening process. This often leads to variants with expanded rather than truly shifted specificity. To integrate such controls, we set out to reinvent how protease activity is assessed and developed a novel in vivo assay in Chapter 2. We present a DNA recorder for protease activity that enables the testing of thousands of protease variants on a plethora of substrate variants in a single-vessel pool of Escherichia coli cells.
Relying on next-generation sequencing as a readout, we identify each individual protease-substrate combination along with the resulting cleavage efficiency in parallel. We demonstrate the suitability of the method by assessing the activity of around 30’000 variants of Tobacco Etch Virus protease (TEVp) as a model on up to 134 substrates concomitantly, representing ~600’000 protease-substrate combinations. First, we performed deep mutational scanning of TEVp with all single mutants of its wild-type substrate peptide. This allowed us to identify TEVp positions critical for the acceptance of certain variations in the substrate. Furthermore, we combinatorially mutated three selected positions to change the specificity at the substrate position succeeding the scissile bond (P1’ position). We identified several TEVp variants with distinct amino acid preferences at P1’, including a variant exhibiting a specificity shift away from the amino acid residues preferred by wild-type TEVp as well as promiscuous variants that may serve as starting points for further re-engineering efforts. Importantly, the amount and quality of data that can be acquired with the presented DNA recorder were previously inaccessible, opening up exciting possibilities for data-driven protease engineering. We explore this potential by leveraging the unique sequence-activity data through machine learning and provide a demonstration of the in silico prediction and design of proteolytic properties. Another limitation of current methods for protease engineering concerns the development of variants that cleave substrates containing non-canonical amino acids, for instance those with post-translational modifications. The difficulty of expressing such substrates in vivo or importing them into cells renders methods relying on intracellular protease testing impractical.
Cell surface display (CSD) of proteases bypasses this problem, but existing methods primarily rely on substrate retention on the cell surface to maintain the genotype-phenotype linkage and allow for screening via fluorescence-activated cell sorting. Alternatively, modern in vitro approaches enable enzyme-substrate testing within microcompartments but involve complex and laborious workflows due to the necessity of cell-free transcription and translation. In Chapter 3, we present a platform that combines cell surface display and microfluidic droplet technologies to circumvent these individual limitations and unify the advantages of both methodologies. First, we successfully displayed functional TEVp on the surface of E. coli. We further encapsulated single cells together with a fluorogenic substrate, specifically choosing microfluidic double emulsions for compartmentalization to allow for screening with standard flow-cytometric devices. We validated the feasibility of our method by enriching droplets containing cells that display active TEVp from a mixture with inactive variants. Finally, we screened a library of 250’000 TEVp variants for activity on a substrate containing D-aspartate in its recognition sequence. We conducted four consecutive rounds of encapsulation and sorting and employed next-generation sequencing to accurately assess the enrichment in every round and to select promising variants. While we were unsuccessful in identifying a TEVp candidate that could cleave a D-aspartate-containing substrate, this work provides valuable methodological insights into the challenges of developing a high-throughput, microfluidic droplet-based method for protease engineering. We detail the complex interplay of the identified factors critical to the success of such screenings, and thus provide a valuable resource for future efforts to engineer proteases or other enzymes for substrates that cannot be readily supplied within the cell.

Publication status

published

Contributors

Examiner: Panke, Sven
Examiner: Jeschek, Markus
Examiner: Hauer, Bernhard
Examiner: Reddy, Sai

Publisher

ETH Zurich

Organisational unit

03602 - Panke, Sven / Panke, Sven
