Lukas Huber



Last Name

Huber

First Name

Lukas



Publications 1 - 2 of 2
  • Huber, Lukas (2024)
    Highly specific proteases enable the targeted cleavage of polypeptides and have found various applications in fields such as biotechnology, biological research, and medicine. However, commercially used proteases still mainly rely on natively available substrate specificities. The ability to engineer proteases to selectively cleave new targets would substantially broaden their utility. For example, in the field of proteomics, engineered proteases cleaving sites that contain post-translational modifications have already led to new discoveries. A particularly exciting prospect is the application of designer proteases for therapeutic interventions, where they could one day be used to cleave disease-related targets. In recent years, directed evolution has led to remarkable achievements in equipping proteins with improved or novel functions. Key to such engineering efforts is testing a diverse pool of variants for the desired activity and isolating improved candidates. In Chapter 1, we review currently available high-throughput platforms particularly suitable for the development of proteases with reprogrammed specificity. We describe two major limitations of those methods and develop two novel high-throughput platforms, each addressing one of the shortcomings. Firstly, standard screening assays do not allow testing for cleavage of multiple undesired substrates to be integrated into the screening process. This often leads to variants with expanded rather than truly shifted specificity. To integrate such controls, we aimed to reinvent how protease activity is assessed, and developed a novel in vivo assay in Chapter 2. We present a DNA recorder for protease activity that enables testing of thousands of protease variants on a plethora of substrate variants in a single-vessel pool of Escherichia coli cells.
Relying on next-generation sequencing as a readout, we identify each individual protease-substrate combination along with the resulting cleavage efficiency in parallel. We demonstrate the suitability of the method by assessing the activity of around 30’000 variants of Tobacco Etch Virus protease (TEVp) as a model on up to 134 substrates concomitantly, representing ~600’000 protease-substrate combinations. First, we performed a deep mutational scan of TEVp against all single mutants of its wildtype substrate peptide. This allowed us to identify TEVp positions critical for the acceptance of certain variations in the substrate. Furthermore, we combinatorially mutated three selected positions to change the specificity at the substrate position succeeding the scissile bond (P1’ position). We identified several TEVp variants with distinct amino acid preferences at P1’, including a variant exhibiting a specificity shift away from the amino acid residues preferred by the wildtype TEVp as well as promiscuous variants that may serve as starting points for further re-engineering efforts. Importantly, the amount and quality of data that can be acquired by the presented DNA recorder has so far been inaccessible, opening up exciting possibilities for data-driven protease engineering. We explore this potential by leveraging the unique sequence-activity data through machine learning and provide a demonstration of the in silico prediction and design of proteolytic properties. Another limitation of current methods for protease engineering concerns the development of variants that cleave substrates containing non-canonical amino acids, for instance those with post-translational modifications. The difficulty of expressing such substrates in vivo or importing them into cells renders methods relying on intracellular protease testing impractical.
Cell surface display (CSD) of proteases bypasses this problem, but existing methods primarily rely on substrate retention on the cell surface to keep the genotype-phenotype linkage and allow for screening via fluorescence-activated cell sorting. Alternatively, modern in vitro approaches enable enzyme-substrate testing within microcompartments but involve complex and laborious workflows due to the necessity of cell-free transcription and translation. In Chapter 3, we present a platform that combines cell surface display and microfluidic droplet technologies to circumvent these individual limitations and unify the advantages of both methodologies. First, we successfully displayed functional TEVp on the surface of E. coli. We further encapsulated single cells together with a fluorogenic substrate, specifically choosing microfluidic double emulsions for compartmentalization to allow for screening with standard flow-cytometric devices. We validated the feasibility of our method by enriching droplets containing cells that display active TEVp from a mixture with inactive variants. Finally, we screened a library of 250’000 TEVp variants for activity on a substrate containing D-aspartate in its recognition sequence. We conducted four consecutive rounds of encapsulation and sorting and employed next-generation sequencing to accurately assess the enrichment in every round and to select promising variants. While we were unsuccessful in identifying a TEVp candidate that could cleave a D-aspartate-containing substrate, this work provides valuable methodological insights into the challenges of developing a high-throughput, microfluidic droplet-based method for protease engineering. We detail the complex interplay of the identified factors critical to the success of such screenings, and thus provide a valuable resource for future efforts to engineer proteases or other enzymes for substrates that cannot be readily supplied within the cell.
  • Huber, Lukas; Kucera, Tim; Höllerer, Simon; et al. (2025)
    Nature Communications
    Protein engineering has recently seen tremendous transformation due to machine learning (ML) tools that predict structure from sequence with unprecedented precision. Predicting catalytic activity, however, remains challenging, restricting our capabilities to design protein sequences with desired catalytic function in silico. This predicament is mainly rooted in a lack of experimental methods capable of recording sequence-activity data in quantities sufficient for data-intensive ML techniques, and in the inefficiency of searches in the enormous sequence spaces inherent to proteins. Herein, we address both limitations in the context of engineering proteases with tailored substrate specificity. We introduce a DNA recorder for deep specificity profiling of proteases in Escherichia coli and demonstrate the testing of 29,716 candidate proteases against up to 134 substrates in parallel. The resulting sequence-activity data on approximately 600,000 protease-substrate pairs not only reveal key sequence determinants governing protease specificity, but also allow us to build a data-efficient deep learning model that accurately predicts protease sequences with desired on- and off-target activities. Moreover, we present epistasis-aware training set design as a generalizable strategy to streamline searches within enormous sequence spaces, which strongly increases model accuracy for a given experimental effort and is thus likely to have implications for protein engineering far beyond proteases.