1 |
Combinatorial protein engineering to identify improved CRISPR activators
Giddins, Marla Jane January 2025
Laboratory-engineered proteins such as high-fidelity DNA polymerases, CRISPR base and prime editors, and chimeric antigen receptors have transformed our ability to probe and manipulate biological systems. To craft these powerful tools, researchers fuse multiple domains into novel chimeras intended to retain the functional properties of their constituent parts. Although this approach has produced a number of important technologies, its low-throughput nature and high cost thwart efforts to explore complex combinatorial landscapes and limit our grasp of the “rules” governing synthetic protein assembly (e.g., which domains work best together, which domain orders are optimal, and whether fusing multiple copies of the same domain is beneficial). Previous state-of-the-art CRISPR activators, including the tripartite activator VP64-P65-RTA (VPR) and the Synergistic Activation Mediator (SAM), have established the benefit of combining multiple activation domains (ADs) into a single complex for improved transcriptional modulation. While VPR and SAM have proven relatively successful in both in vitro and in vivo applications, neither activator shows uniform activity across targets and cell types. Furthermore, reports that these tools are toxic to cells limit their utility in broad-ranging applications.
To probe a vast combinatorial landscape of multi-domain CRISPR activators while bypassing the arduous task of generating each construct one by one, we developed a strategy for constructing large combinatorial libraries of protein variants en masse and used this method to functionally evaluate a library of >15,000 CRISPR activators. Importantly, we conducted our screen on multiple target genes to identify tools with consistent performance across the genome. Our findings bring to light a critical yet often overlooked feature of CRISPR activators: toxicity.
This work not only highlights the prevalence of this problem but also elucidates several biological factors that contribute to it. Our observation that many high-performing activators elicited minimal effects on cell fitness challenges the notion that toxicity is an inevitable byproduct of potent activation, and suggests that this model greatly oversimplifies the nuanced relationship between these traits. We also explored how the biochemical properties of ADs (e.g., hydrophobicity and intrinsic disorder) and their combinatorial interactions drive activator performance. Finally, we identified two potent activators, MHV and MMH, that show enhanced activity across diverse targets and cell types compared with SAM, one of the gold-standard CRISPR activators. Our results underscore the power of high-throughput techniques for both improving our understanding of complex protein assemblies and identifying more powerful tools.
|
2 |
High-throughput screening enabled advances in protein engineering
Kratz, Alexander Franz January 2024
Nature has produced a dazzling array of proteins that perform useful and interesting functions. Over the last 50 years, biologists have begun to re-engineer these tiny machines, either to perform new functions or to perform their existing functions more efficiently. However, protein engineering suffers from the massive scale of the search space. To improve the field’s ability to understand and engineer proteins, we present improvements to both generating and interpreting large protein-function data sets. We apply these approaches to two tasks, generating data sets that measure the activity of tens of thousands of protein variants and producing two novel CRISPR activators as well as an improved machine learning model that designs proteins using deep mutational scanning (DMS) data as input.
In the first task, we engineer proteins at the level of domains, recombining trans-activation domains to generate improved CRISPR activators. By analyzing proteins at the level of domains, we reduce the protein engineering task to a smaller combinatorial problem. CRISPRa tools enable biologists to activate transcription at arbitrary locations using an easily retargetable CRISPR guide. In addition to producing two novel CRISPRa tools that outperform the current state of the art, we perform what we believe to be the first systematic evaluation of the toxicity of CRISPRa tools in cells. We also perform a detailed analysis of the ways in which trans-activation domains interact in a multi-domain tool, and of the impact of these interactions on both gene activation strength and toxicity.
Our second protein engineering project approaches the problem at the level of individual amino acids. One target is a chaperone protein, DNAJB6, which our lab previously identified as a rescuer of toxicity for multiple neurodegenerative proteins [28]. We use error-prone PCR to generate a library of over 30,000 compound mutants, which we screen in a yeast-based assay for their ability to rescue the toxicity associated with an aggregation-prone protein, FUS. We also engineer GFP, for which many deep mutational scanning (DMS) datasets already exist.
To engineer these proteins, we develop a machine learning model, OptiProt. OptiProt is trained on DMS data to approximate the sequence-to-score landscape and then performs machine-learning-directed evolution (MLDE), designing proteins that meet user-defined criteria. We test OptiProt’s ability to learn and generalize from our DNAJB6 and GFP DMS datasets by imposing increasingly difficult constraints and asking it to satisfy them. OptiProt was able to design proteins that improved on the best variants within the DMS data and to integrate up to 50 mutations without breaking the functionality of the wild-type protein. Finally, we task OptiProt with difficult challenges such as compensating for a loss-of-function mutation or replacing every instance of a certain amino acid.
|
3 |
Activation of endogenous full-length active LINE-1 RNA using CRISPR activation to study its role during somatic cell reprogramming
Alsolami, Amjad 11 1900
Repetitive sequences compose nearly half of the human and mouse genomes, and most of them are scattered repeats of transposable elements (TEs). Non-LTR retrotransposons are the most abundant TEs in the mammalian genome, and LINE-1 elements (L1s) are the most active and abundant autonomous retrotransposons. L1s are highly activated during the epigenetic reprogramming of early mammalian embryos and have the highest expression level among all retrotransposons throughout the preimplantation stage. Moreover, the reprogramming of somatic cells into induced pluripotent stem cells (iPSCs) is associated with an increase in L1 expression. L1 transcription during early embryogenesis is necessary to regulate developmental genes and to prevent heterochromatin formation, which maintains the pluripotent state and ensures appropriate future differentiation. However, the role of L1 reactivation during somatic cell reprogramming remains unclear. Therefore, the aim of this work is to study the impact of L1 transcription during the reprogramming of somatic cells into iPSCs. We used a CRISPR-mediated gene activation (CRISPRa) system that fuses a deactivated Cas9 (dCas9) to transactivation domains (VPR). We confirmed that this system can overexpress L1 in human embryonic kidney cells (HEK293) and human dermal fibroblasts (HDFs), which provides an opportunity to study the role of L1 transcripts during the reprogramming of HDFs into iPSCs. Furthermore, we established stable HDF lines that are able to express combinations of the “Yamanaka” reprogramming factors. This model system will allow us to investigate the effect of overexpressing L1 together with reprogramming factors, and to address whether L1 can trigger or facilitate reprogramming and through what underlying mechanism.
|