11 |
Identification of novel active Cas9 orthologs from metagenomic data
Demozzi, Michele 12 April 2022
CRISPR-Cas is the state-of-the-art biological tool that allows precise and fast manipulation of the genetic information of cellular genomes. The translation of CRISPR-Cas technology from in vitro studies into clinical applications has highlighted several limitations: the currently available systems are constrained by their off-target activity, by the need for a Cas-specific PAM sequence next to the target, and by the size of the Cas protein. In particular, despite its high activity, the CRISPR-SpCas9 editing machinery is too large for an all-in-one AAV delivery system, and the genomic sequences that can be targeted are limited by the 3'-NGG PAM dependency of the SpCas9 protein. To further expand the repertoire of CRISPR tools, we turned to metagenomic data of the human microbiome to search for uncharacterized CRISPR-Cas9 systems and identified a set of novel small Cas9 orthologs derived from the analysis of reconstructed bacterial metagenomes. In this thesis, ten candidates were chosen according to their size (less than 1100 aa). The PAM preference of all ten orthologs was established using a bacterial-based and an in vitro platform. We demonstrated that three of them are active nucleases in human cells, and two of the three showed robust editing levels at endogenous loci, outperforming SpCas9 at particular targets. We expect these new variants to be very useful in expanding the available genome editing tools both in vitro and in vivo. Knock-out-based Cas9 applications are very efficient, but precise control of the repair outcome through HDR-mediated gene targeting is often required. To address this issue, we also developed an MS2-based reporter platform to measure the frequency of HDR events and evaluate novel HDR-modulating factors. The platform was validated and could allow the screening of protein libraries to assess their influence on the HDR pathway.
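The PAM constraint described in this abstract can be made concrete with a small sketch. It assumes the commonly described SpCas9 target geometry (a 20-nt protospacer immediately followed on its 3' side by an NGG PAM) and scans the forward strand only. The helper name `find_ngg_sites` and the toy sequence are hypothetical illustrations, not material from the thesis.

```python
import re

def find_ngg_sites(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for every forward-strand site
    where a protospacer is immediately followed by an NGG PAM."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Hypothetical toy sequence: 20 nt, then the PAM "AGG", then two more bases.
toy = "ATGCATGCATGCATGCATGCAGGTT"
for start, protospacer, pam in find_ngg_sites(toy):
    print(start, protospacer, pam)
```

An ortholog with a different PAM preference would change only the second group of the pattern, which is one way to see why new PAM specificities widen the set of targetable sequences.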
|
12 |
Cloning and characterization of neuropeptide Y receptors of the Y<sub>1</sub> subfamily in mammals and fish
Starbäck, Paula January 2000
<p>Neuropeptide Y (NPY) is an abundant neurotransmitter in the nervous system and forms a family of evolutionarily related peptides together with peptide YY (PYY), pancreatic polypeptide (PP) and polypeptide Y (PY). These peptides are ligands to a family of receptors that mediate a wide range of physiological effects including stimulation of appetite. This work describes the molecular cloning of four novel NPY receptors.</p><p>In rat a receptor called PP1, later renamed Y<sub>4</sub>, was cloned and characterized. It displays the highest amino acid sequence identity to the Y<sub>1</sub> receptor. Rat Y<sub>4</sub> differs extensively from human Y<sub>4</sub>, cloned subsequently, in pharmacological properties, tissue distribution, and amino acid sequence, with only 75% identity. Rat and human Y<sub>4</sub> are the most diverged orthologues in the NPY receptor family.</p><p>In guinea pig, the y<sub>6</sub> receptor gene was found to be a pseudogene with several frameshift mutations. The gene is a pseudogene in human and pig too, but seems to give rise to a functional receptor in mouse and rabbit. This unusual evolutionary situation may be due to inactivation of the gene in a mammalian ancestor and then restoration of expression in mouse and rabbit, but is perhaps more likely due to independent inactivations in guinea pig, human and pig.</p><p>In zebrafish, two new intronless receptor genes were cloned. Sequence comparisons suggest that both receptors are distinct from the mammalian receptors Y<sub>1</sub>, Y<sub>4</sub> and y<sub>6</sub>, hence they were named Ya and Yb. Chromosomal localization provides further support that Ya and Yb may be distinct subtypes.</p><p>The discoveries of the rat Y<sub>4</sub> and zebrafish Ya and Yb receptors were unexpected and show that the NPY receptor family is larger than previously thought.</p>
|
13 |
Evolutionary Analysis of the Insulin-Relaxin Gene Family from the Perspective of Gene and Genome Duplication Events / Ewolucyjna Analiza Rodziny Genów Insulin-Relaksyn z Perspektywy Duplikacji Genu i Genomu
Olinski, Robert Piotr January 2007
<p>Paralogs arise by duplications and belong to families. Ten paralogs (insulin; <i>IGF-1</i> and <i>-2</i>; <i>INSL3-6</i> and three relaxins) constitute the human insulin-relaxin family. The aim of this study was to outline the duplications that gave rise to the vertebrate insulin-relaxin genes and the chromosomal regions in which they reside. The neurotrophin and Trk-receptor families, together with more than 300 otherwise unrelated families, had paralogs in the regions hosting insulin/relaxin genes, defining two quadruplicate paralogy regions, namely the insulin/IGF and INSL/relaxin paralogons. The localization of insulin/relaxins in human thereby shows that these regions were formed during two genome duplications at the stem of the vertebrates.</p><p>We characterized insulin-like genes (<i>INS-L1</i>, <i>-L2</i> and <i>-L3</i>) in the <i>Ciona intestinalis</i> genome, a species that split from the chordate lineage before the genome duplications. Conserved synteny between the Ciona region hosting the <i>INS-Ls</i> and two human paralogons, as well as linkage of the actual paralogons, suggests that a segmental duplication gave rise to the entire region prior to the genome duplications. Synteny together with gene and protein structures demonstrates that <i>INS-L1</i> is orthologous to the vertebrate <i>INSLs</i>/relaxins, <i>INS-L2</i> to insulins and <i>INS-L3</i> to <i>IGFs</i>. This indicates that pro-orthologs of the insulin-relaxin family were formed before Ciona. Our analysis also implies that the INSL/relaxin ancestor switched receptor from tyrosine kinase- to GPCR-type. This probably occurred after the Ciona stage, but before the genome duplications.</p><p>Using genes residing within the analyzed human paralogons that were present in a chromosomal region in the Ciona-human ancestor, we identified 37 segments with conserved synteny between the <i>Drosophila melanogaster</i> and human genomes. Orthologs residing in the Ciona, sea urchin and fly syntenic segments imply that such segments approximate an ancestral region from which the human paralogons originated.</p><p>To conclude, the human paralogons are remnants of genome duplications that, in addition to segmental and single duplications, shaped the extant vertebrate genomes. Using the quadruplicate paralogy regions we were able to deduce duplication events of the insulin-relaxin genes and their chromosomal regions.</p>
|
14 |
The evolution of LOL, the secondary metabolite gene cluster for insecticidal loline alkaloids in fungal endophytes of grasses
Kutil, Brandi Lynn 15 May 2009
LOL is a novel secondary metabolite gene cluster associated with the production of loline alkaloids (saturated 1-aminopyrrolizidine alkaloids with an oxygen bridge) exclusively in closely related grass-endophyte species in the genera Epichloë and Neotyphodium. In this study I characterize the LOL cluster in E. festucae, presenting sequence corresponding to 10 individual lol genes, defining the boundaries of the cluster, and evaluating the genomic DNA region flanking LOL in E. festucae. In addition to characterizing the LOL cluster in E. festucae, I present LOL sequence from two additional species, Neotyphodium coenophialum and Neotyphodium sp. PauTG-1. Together with two recently published LOL clusters from N. uncinatum, these data allow for a powerful phylogenetic comparison of five clusters from four closely related species. There is a high degree of microsynteny (conserved gene order and orientation) among the five LOL clusters, allowing us to predict potential transcriptional co-regulatory binding motifs in lol promoter regions. The relatedness of LOL clusters is especially interesting in light of the history of interspecific hybridizations that generated the asexual Neotyphodium lineages. In fact, three of the clusters appear to have been introduced to different Neotyphodium species by the same ancestral Epichloë species, for which present-day isolates are no longer able to produce lolines. To address the evolutionary origins of the cluster we have investigated the phylogenetic relationships of particular lol ORFs to their paralogous primary metabolism genes (and gene families) from endophytes, other fungi and even other kingdoms. I present extensive evidence that at least two individual lol genes have evolved from primary metabolism genes within the fungal ancestors of endophytes, rather than being introduced via horizontal gene transfer. I also present complementation studies in Neurospora crassa exploring the functional divergence of one lol gene from its primary metabolism paralog. While it is clear that these insecticidal compounds should convey a selective advantage to the fungus and its host, thus explaining preservation of the trait, this analysis provides an exploration into the evolutionary origin and maintenance of the genes that comprise LOL and of the cluster itself.
|
18 |
A computational framework for transcriptome assembly and annotation in non-model organisms: the case of Venturia inaequalis
Kimbung, Stanley Mbandi January 2014
Philosophiae Doctor - PhD / In this dissertation, three computational approaches are presented that enable optimization of reference-free transcriptome reconstruction. The first addresses the selection of bona fide reconstructed transcribed fragments (transfrags) from de novo transcriptome assemblies and their annotation with a multiple domain co-occurrence framework. We showed that the selected transfrags are functionally relevant and represent over 94% of the information derived from annotation by transference. The second approach relates to quality-score-based RNA-Seq sub-sampling and the description of a novel sequence-similarity-derived metric for quality assessment of de novo transcriptome assemblies. A detailed systematic analysis of the side effects of quality-score-based trimming and/or filtering on artefact removal and transcriptome quality is described. Aggressive trimming produced incompletely reconstructed and missing transfrags. This approach was applied to generate an optimal transcriptome assembly for a South African isolate of V. inaequalis. The third approach deals with the computational partitioning of transfrags assembled from RNA-Seq of mixed host and pathogen reads. We used this strategy to correct a publicly available transcriptome assembly for V. inaequalis (Indian isolate). We binned 50% of the latter to apple transfrags and identified putative immunity transcript models. Comparative transcriptomic analysis of fungal transfrags from the Indian and South African isolates revealed effectors or transcripts that may be expressed in planta upon morphogenic differentiation.
These studies have successfully identified V. inaequalis-specific transfrags that can facilitate gene discovery. Unique access to an in-house draft genome assembly allowed us to provide a preliminary description of genes that are implicated in pathogenesis. Gene prediction with bona fide transfrags produced 11,692 protein-coding genes. We identified two hydrophobin-like genes and six accessory genes of the melanin biosynthetic pathway that are implicated in the invasive action of the appressorium. The cazyome reveals an impressive repertoire of carbohydrate-degrading enzymes and carbohydrate-binding modules, among which are six polysaccharide lyases and the largest number of carbohydrate esterases (twenty-eight) known in any fungus sequenced to date.
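The effect of trimming stringency mentioned in this abstract can be illustrated with a minimal sketch. It assumes Phred+33 quality encoding and a deliberately simple rule (cut bases from the 3' end until the last base reaches the threshold). The function names `phred_scores` and `trim_3prime` and the toy read are hypothetical, and this is not the trimming procedure used in the dissertation.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]

def trim_3prime(read, quality_string, min_q):
    """Remove trailing bases until the last remaining base has quality >= min_q."""
    scores = phred_scores(quality_string)
    end = len(read)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return read[:end]

# Hypothetical toy read whose quality decays towards the 3' end.
read = "ACGTACGTAC"
qual = "IIIIIH?5+#"  # Phred scores: 40, 40, 40, 40, 40, 39, 30, 20, 10, 2
for threshold in (5, 20, 35):
    print(threshold, trim_3prime(read, qual, threshold))
```

Raising the threshold shortens the read from nine to six bases in this toy case. Shorter reads give an assembler less overlap to work with, which is consistent with the abstract's observation that aggressive trimming leaves transfrags incomplete or missing.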
|
19 |
An investigation of human protein interactions using the comparative method
Ur-Rehman, Saif January 2012
There is currently a large increase in the rate of production of DNA sequence data as next-generation sequencing technologies become more widespread. As such, there is a need for rapid computational techniques to functionally annotate data as it is generated. One computational method for the functional annotation of protein-coding genes is the detection of interaction partners. If the putative partner has a functional annotation, then this annotation can be extended to the initial protein via the established principle of “guilt by association”. This work presents a method for rapid detection of functional interaction partners for proteins through the use of the comparative method. Functional links are sought between proteins through analysis of their patterns of presence and absence amongst a set of 54 eukaryotic organisms. These links can be either direct or indirect protein interactions. These patterns are analysed in the context of a phylogenetic tree. The method is a heuristic combination of an established, accurate methodology, which compares models of evolution whose parameters are estimated by maximum likelihood, with a novel technique that reconstructs ancestral states using Dollo parsimony and analyses these reconstructions by logistic regression. The methodology achieves specificity comparable to that of gene coexpression as a means of predicting functional linkage between proteins. The application of this method permitted a genome-wide analysis of the human genome, which would otherwise have demanded a potentially prohibitive amount of computational resources. Proteins within the human genome were clustered into orthologous groups. Ten of these proteins, which were ubiquitous across all 54 eukaryotes, were used to reconstruct a phylogeny. An application of the heuristic predicted a set of 1,142 functional protein interactions in human cells. Of these predictions, 1,131 were not present in current protein-protein interaction databases.
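The ancestral-state step named in this abstract can be sketched as follows. Under Dollo parsimony a character is gained once and may then be lost repeatedly, so the gain is placed at the most recent common ancestor (MRCA) of all species that carry the gene, and every node on a path from that ancestor to a carrier is inferred to carry it too. The tree, node names and function names below are hypothetical, and the logistic-regression step of the thesis is not covered.

```python
def dollo_reconstruct(children, root, present_leaves):
    """Return the set of nodes inferred to carry the gene under Dollo parsimony."""
    present_leaves = set(present_leaves)
    count = {}  # number of gene-carrying leaves below (or at) each node

    def fill(node):
        kids = children.get(node, [])
        if not kids:
            count[node] = 1 if node in present_leaves else 0
        else:
            count[node] = sum(fill(k) for k in kids)
        return count[node]

    total = fill(root)
    if total == 0:
        return set()
    # Walk down from the root to the MRCA of all carriers: the single gain goes there.
    mrca = root
    while True:
        below = [k for k in children.get(mrca, []) if count[k] == total]
        if not below:
            break
        mrca = below[0]
    # Within the MRCA's subtree, any node with a carrier below it keeps the gene.
    state, stack = set(), [mrca]
    while stack:
        node = stack.pop()
        if count[node] > 0:
            state.add(node)
            stack.extend(children.get(node, []))
    return state

def count_losses(children, state):
    """Count branches where the parent carries the gene and the child does not."""
    return sum(1 for p in state for k in children.get(p, []) if k not in state)

# Hypothetical four-species tree: root -> (A, n1), n1 -> (n2, D), n2 -> (B, C).
children = {"root": ["A", "n1"], "n1": ["n2", "D"], "n2": ["B", "C"]}
state = dollo_reconstruct(children, "root", {"B", "D"})
print(sorted(state), count_losses(children, state))
```

In the toy tree the gene is present in B and D, so the gain is placed at `n1` and a single loss is inferred on the branch leading to C. Species A never had the gene under this reconstruction, which is information that a plain presence/absence comparison across leaves does not provide.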
|
20 |
Nouvelles approches bioinformatiques pour l'étude à grande échelle de l'évolution des activités enzymatiques / New bioinformatic approaches for the large-scale study of the evolution of the enzymatic activities
Pereira, Cécile 11 May 2015
This thesis proposes new methods for studying the evolution of metabolism. To that end, we address the problem of comparing the metabolism of hundreds of microorganisms. Comparing the metabolism of different species first requires knowing the metabolism of each of them. The proteomes of the microorganisms we work with come from different databases and were sequenced and annotated by different teams using different methods, so the quality of their functional annotation can be heterogeneous. A standardized functional re-annotation of the proteomes to be compared is therefore necessary.
Protein sequences can be annotated by transferring annotations between orthologous sequences. More than 39 databases list orthologs predicted by various methods, and these methods are known to yield partly different predictions. To take current predictions into account while adding relevant information, we developed the meta-approach MARIO. It combines the intersections of the results of several methods for detecting groups of orthologs and enriches them using HMM profiles. We show that this meta-approach predicts a larger number of orthologs while improving the functional similarity of the predicted ortholog pairs. It allowed us to predict the enzymatic repertoire of 178 microbial proteomes (174 of them fungal).
We then analyze these enzymatic repertoires to learn more about the evolution of metabolism. To this end, we look for combinations of presence/absence of enzymatic activities that characterize a given taxonomic group. It thus becomes possible to infer whether the emergence of a particular taxonomic group can be explained by (or led to) the appearance of specific features of its metabolism. For that purpose, we applied interpretable supervised learning methods (rules and decision trees) to the enzymatic profiles, using enzymatic activities as attributes, taxonomic groups as classes and fungi as examples. The results, consistent with current knowledge of these organisms, show that supervised learning is effective for extracting information from phylogenetic profiles. Metabolism therefore retains traces of the evolution of species. Furthermore, when the predicted classifiers make few errors, this approach can highlight probable horizontal transfers. One example is the transfer of the gene encoding EC:3.1.6.6 from an ancestor of the Pezizomycotina to an ancestor of Ustilago maydis.
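The idea of combining the intersections of several orthology predictions can be sketched in a few lines. The example assumes each method returns a list of ortholog groups (sets of protein identifiers) and keeps only the sub-groups on which all methods agree. The function name `intersect_groupings` and the identifiers are hypothetical, the HMM-profile enrichment step is omitted, and this is not the MARIO implementation itself.

```python
def intersect_groupings(first, *others):
    """Keep the sub-groups (size >= 2) that every method places together."""
    current = [frozenset(group) for group in first]
    for grouping in others:
        # Intersect every surviving group with every group of the next method.
        current = [g & frozenset(h) for g in current for h in grouping]
        current = [g for g in current if g]  # drop empty intersections
    return sorted(sorted(g) for g in current if len(g) >= 2)

# Hypothetical predictions from two orthology-detection methods.
method_a = [{"sp1_p1", "sp2_p7", "sp3_p4"}, {"sp1_p2", "sp2_p9"}]
method_b = [{"sp1_p1", "sp2_p7"}, {"sp3_p4", "sp1_p2", "sp2_p9"}]

for group in intersect_groupings(method_a, method_b):
    print(group)
```

The intersection is conservative: `sp3_p4` drops out because the two methods disagree about where it belongs. A later enrichment step, as the abstract describes, is then needed to recover such sequences.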
|