21

Chromosome and Genome Evolution in Culicinae Mosquitoes

Masri, Reem Abed 14 July 2021
The Culicinae is the largest subfamily of the mosquito family Culicidae. Two genera from this subfamily, Culex and Aedes, are distributed worldwide and are responsible for transmitting several deadly diseases, including Zika, West Nile fever, chikungunya, dengue, and Rift Valley fever. Developing high-quality genome assemblies for mosquitoes and studying their population structure and evolution can facilitate the development of new strategies for vector control. Studies on Aedes albopictus and on species from the Culex pipiens complex, which are widespread in the United States, provide excellent models for these topics. Ae. albopictus is one of the most dangerous invasive mosquito species in the world and transmits more than 20 arboviruses. This species has a highly repetitive genome that is the largest among the mosquito genomes sequenced so far. Thus, sequencing and assembling such a genome is extremely challenging. As a result, the lack of a high-quality Ae. albopictus genome assembly has delayed progress in understanding its biology. To produce a high-quality genome assembly, it was important to anchor genomic scaffolds to the cytogenetic map, creating a physical map of the genome assembly. We first developed a new gene-based approach for the physical mapping of repeat-rich mosquito genomes. The approach used PCR amplification of DNA probes based on complementary DNA (cDNA), which does not include repetitive DNA sequences. This method was then used to develop a physical map for Ae. albopictus based on the in situ hybridization of fifty cDNA fragments or gene exons from twenty-four scaffolds to the mitotic chromosomes from imaginal discs. This study resulted in the construction of the first physical map of the Ae. albopictus genome as well as the mapping of viral integration and polyphenol oxidase genes. Moreover, comparing our Ae. albopictus physical map to the current Ae. aegypti assembly indicated the presence of multiple chromosomal inversions between them.
To better understand population structure and chromosome evolution in Culicinae mosquitoes, especially in the Culex pipiens complex, we studied genomic and chromosomal differentiation between two subspecies, Cx. pipiens pipiens and Cx. pipiens molestus. For species responsible for the spread of human diseases, understanding population dynamics and the processes of taxa diversification is important for effective mosquito control. Two vectors of West Nile virus, Cx. p. pipiens and Cx. p. molestus, exhibit epidemiologically important behavioral and physiological differences, but the whole-genome divergence between them was unexplored. The first goal of this study was to better understand the level of genomic differentiation and the population structures of Cx. p. pipiens and Cx. p. molestus from different continents. We sequenced and compared whole genomes of 40 individual mosquitoes from two locations in Eurasia and two in North America. Principal Component, ADMIXTURE, and neighbor-joining analyses of the nuclear genomes identified two major intercontinental, monophyletic clusters of Cx. p. pipiens and Cx. p. molestus. The level of genomic differentiation between the subspecies was uniform along chromosomes. The ADMIXTURE analysis detected signatures of admixture in Cx. p. pipiens populations, but not in Cx. p. molestus populations. Thus, our study found that Cx. p. molestus and Cx. p. pipiens represent different evolutionary units of monophyletic origin that have undergone incipient ecological speciation. The second goal was to study differences at the chromosome level between these two organisms. We first measured whole-chromosome and chromosome-arm length differences between Cx. p. molestus and Cx. p. pipiens as a basic cytogenetic approach.
In addition, we used the novel Hi-C approach to detect chromosomal rearrangements between them, since Hi-C was successful in detecting a known inversion in Cx. quinquefasciatus. Cx. p. molestus and Cx. p. pipiens embryos were used to perform the Hi-C technique. Analysis of the Hi-C data showed the presence of two different inversions in the Cx. p. pipiens and Cx. p. molestus heatmaps, which could explain their different physiology and adaptation in nature. Developing modern genomic and cytogenetic tools is important to enhance the quality of genome assemblies, improve gene annotation, and provide a better framework for comparative and population genomics of mosquitoes; it is also the foundation for the development of novel genome-based approaches for vector control. / Doctor of Philosophy / Mosquitoes are medically important insects because they vector a range of diseases that infect humans. The subfamily Culicinae is responsible for transmitting diseases such as Zika, dengue, and West Nile fever, which have triggered fatal infections and epidemics in multiple parts of the world. Between 2010 and 2016, studies reported excessive levels of insecticide resistance, which slows the disease elimination process. Novel transgenic techniques have tremendous potential for minimizing mosquito-borne diseases and their transmission more efficiently. The availability of high-quality genome assemblies for mosquitoes may help to better understand their population structure and to develop the effective and safe vector-control approaches that we urgently need. To develop high-quality genome assemblies, we need to construct a physical genome map that shows the physical locations of genes or other DNA sequences of interest along the chromosomes. For this reason, we developed a new gene-based approach for the physical mapping of mosquito genomes. This method was then used to develop a physical map for Ae. albopictus.
This study resulted in the generation of the first physical map of the Ae. albopictus genome. To understand population structure in Culicinae mosquitoes, we used mosquitoes from the Culex pipiens complex. Species in this complex transmit different arthropod-borne viruses, or arboviruses. Notable among them is West Nile virus, which has triggered fatal infections and epidemics in Eastern and Central Europe and North America, and is also known in Asia, Australia, Africa, and the Caribbean. We specifically focused on two subspecies in this complex, Cx. pipiens pipiens and Cx. pipiens molestus, which are morphologically identical but differ physiologically and behaviorally. Although they are spread globally in temperate regions, their population structure and taxonomic status remain unclear. The first goal of this study was to better understand the level of genomic differentiation of Cx. p. pipiens and Cx. p. molestus from different continents. We sequenced and compared the whole genomes of 40 individual mosquitoes from two locations in Eurasia and two in North America. Our study found that Cx. p. molestus and Cx. p. pipiens represent different evolutionary units that are currently undergoing ecological speciation. The second goal was to study differences at the chromosome level between them. Using the Hi-C approach, we detected the presence of two different inversions in Cx. p. pipiens and Cx. p. molestus, which could potentially explain their different physiology and adaptation.
22

Expanding Genetic and Genomic Resources for Sex Separation and Mosquito Control Strategies

Compton, Austin 26 October 2021
Mosquitoes belonging to the genus Anopheles transmit malaria parasites, which cause the highest mortality of any vector-borne disease worldwide. Mosquitoes belonging to the genus Aedes transmit arboviruses including dengue, which has become the most important vector-borne virus due to a drastic surge in disease incidence. The scope of the studies in this dissertation is broad, with investigations bringing together elements of classical genetics, recent advances in sequencing and genome-editing technologies, and the use of modern forward genetics approaches. Chapter 2 of this dissertation explores the use of Oxford Nanopore sequencing technology for the first time in mosquitoes. This new technology provides long reads that were used to piece together the AabS3 chromosomal assembly for Anopheles albimanus. The utility of this genomic resource is demonstrated by the discovery of novel telomeric repeats at the ends of the chromosomes that could have important implications for mosquito biology and control. Chapter 3 describes a forward genetics strategy called 'Marker-Assisted Mapping' (MAM) that enables high-resolution mapping of the causal gene locus of a mutant phenotype. The principle and effectiveness of MAM are first demonstrated by mapping a known transgene insertion. MAM is then used to identify cardinal as a candidate causal gene for the spontaneous red-eye (re) mutation. Genetic crosses between the re mutant and cardinal knockout individuals generated using CRISPR/Cas9 confirmed that cardinal is indeed the causal gene for the re mutation. Chapter 4 explores three innovative strategies for mosquito sex separation by exploiting several sex-linked marker lines. We show that by linking a transgenic marker to the male-determining locus (M locus), or by linking the male-determining Nix gene to a marker, males can be precisely separated from females.
We also produce a two-marker transgenic line that allows for both non-transgenic male separation and efficient line maintenance. Finally, we discuss further applications of the resources generated and future directions stemming from these findings. Altogether, the studies described in this dissertation contribute to the overall goal of understanding mosquito biology and controlling mosquito-borne infectious diseases. / Doctor of Philosophy / Female mosquitoes bite and transmit deadly pathogens, including the malaria parasite and viruses such as dengue, Zika, and West Nile. Control programs that attempt to limit the spread of these deadly diseases rely on the control of mosquitoes themselves. These mosquito control methods have relied heavily on indoor and outdoor insecticidal spraying. However, the efficacy of these methods has been jeopardized by the increasing prevalence of insecticide resistance. Thus, it is necessary to implement other methods for effective mosquito control. Genetic control strategies such as the Sterile Insect Technique (SIT) and the Wolbachia-based Incompatible Insect Technique (IIT) are excellent solutions to overcome the limitations of current control strategies. Because female mosquitoes bite and transmit disease-causing pathogens, only males are released, which necessitates separating the non-biting males from females before release. The aim of this work was to use recent technological advancements to better understand the genome and basic genetics of vector mosquito species, and to identify possible approaches to improve current sex separation practices. To develop a deep understanding of mosquito biology and genetics, it is crucial that a high-quality and accurate genome assembly is available. However, many mosquito genome assemblies remain fragmented. To address this limitation, we used recent advances in sequencing technologies to produce a high-quality genome assembly for the New World malaria mosquito, Anopheles albimanus.
These sequencing and assembly efforts led to the discovery of novel telomere sequences at the ends of chromosomes, which could have implications for mosquito control. Forward genetics, which identifies the gene(s) responsible for a given phenotype, has been hindered by the low recombination rate in the yellow fever and dengue mosquito, Aedes aegypti. We developed a Marker-Assisted Mapping (MAM) strategy to address this problem. We first demonstrate this method by mapping the known insertion of a transgene. MAM is then used to identify cardinal as a candidate causal gene for the spontaneous red-eye (re) mutation. The MAM identification of cardinal was then verified by knocking out the gene; this represents the first successful gene mapping in Aedes aegypti using forward genetics. The MAM strategy has broad implications, as it could enable the discovery of genes involved in important traits such as insecticide resistance. To improve sex separation methods, we took advantage of several sex-linked transgenic lines to develop three novel strategies. First, we demonstrate that screening for a genetic marker that is tightly linked to the male-determining locus (M locus) is an effective approach to reduce female contamination. Second, we demonstrate that instead of linking a marker to the M locus, we can link the male-determining factor, Nix, to a genetic marker. When a Nix transgene is located adjacent to the red-eye locus with extremely tight linkage, the red-eye phenotype becomes a faithful marker for separating males from females. Finally, we developed a two-marker genetic sexing strain that produces non-transgenic males that could be used for release, and transgenic marked males and females for efficient line maintenance.
23

Computational Analysis of Viruses in Metagenomic Data

Tithi, Saima Sultana 24 October 2019
Viruses have a huge impact on controlling diseases and regulating many key ecosystem processes. Because metagenomic data can contain many microbiomes, including many viruses, analyzing metagenomic data lets us analyze many viruses at the same time. The first step in analyzing metagenomic data is to identify and quantify the viruses present in the data. To address this step, we developed a computational pipeline, FastViromeExplorer. FastViromeExplorer leverages a pseudoalignment-based approach, which is faster than the traditional alignment-based approach, to quickly align millions or billions of reads. Application of FastViromeExplorer to both human gut samples and environmental samples shows that our tool can identify viruses and quantify their abundances quickly and accurately, even for a large data set. Although viruses have received increased attention in recent times, most viruses are still unknown or uncategorized. To discover novel viruses from metagenomic data, we developed a computational pipeline named FVE-novel. FVE-novel leverages a hybrid of reference-based and de novo assembly approaches to recover novel viruses from metagenomic data. By applying FVE-novel to an ocean metagenome sample, we successfully recovered two novel viruses and two different strains of known phages. Analysis of viral assemblies from metagenomic data reveals that viral assemblies often contain assembly errors such as chimeric sequences, in which more than one viral genome is incorrectly assembled together. To identify and fix these types of assembly errors, we developed a computational tool called VirChecker. Our tool can identify and fix assembly errors due to chimeric assembly. VirChecker also extends the assembly as much as possible to complete it and then annotates the extended and improved assembly.
Application of VirChecker to viral scaffolds collected from an ocean metagenome sample shows that our tool successfully fixes the assembly errors and extends two novel virus genomes and two strains of known phage genomes. / Doctor of Philosophy / Viruses, the most abundant micro-organisms on Earth, have a profound impact on human health and the environment. Analyzing metagenomic data for viruses has the benefit of analyzing many viruses at a time without the need to cultivate them in the lab. In this dissertation, we addressed three research problems in analyzing viruses from metagenomic data. To analyze viruses in metagenomic data, the first question that needs to be answered is which viruses are present and in what quantity. To answer this question, we developed a computational pipeline, FastViromeExplorer. Our tool can identify viruses from metagenomic data and quantify the abundances of the viruses present in the data quickly and accurately, even for a large data set. To recover novel virus genomes from metagenomic data, we developed a computational pipeline named FVE-novel. By applying FVE-novel to an ocean metagenome sample, we successfully recovered two novel viruses and two strains of known phages. Examination of viral assemblies from metagenomic data reveals that, due to the complex nature of metagenome data, viral assemblies often contain assembly errors and are incomplete. To solve this problem, we developed a computational pipeline named VirChecker to polish, extend, and annotate viral assemblies. Application of VirChecker to virus genomes recovered from an ocean metagenome sample shows that our tool successfully extended and completed those virus genomes.
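The pseudoalignment idea mentioned in the abstract above can be illustrated with a toy sketch. This is not FastViromeExplorer's implementation, which the abstract does not describe in detail. It only assumes the general principle that a read is assigned to the references sharing its k-mers, without computing a base-by-base alignment. The reference names, sequences, reads, and the value of `k` are all hypothetical.

```python
from collections import Counter, defaultdict

def build_kmer_index(references, k):
    """Map each k-mer to the set of reference names that contain it."""
    index = defaultdict(set)
    for name, seq in references.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index

def pseudoalign(read, index, k):
    """Return the references compatible with every indexed k-mer of the read."""
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if not hits:
            continue  # k-mer absent from every reference: ignore it
        compatible = set(hits) if compatible is None else compatible & hits
    return compatible or set()

def count_reads(reads, references, k=5):
    """Count reads that are compatible with exactly one reference."""
    index = build_kmer_index(references, k)
    counts = Counter()
    for read in reads:
        compatible = pseudoalign(read, index, k)
        if len(compatible) == 1:
            counts[next(iter(compatible))] += 1
    return counts

# Hypothetical toy data, for illustration only.
references = {
    "virus_A": "ACGTACGTTAGCCGATAGGCTA",
    "virus_B": "TTGACCATGGCAATCGGATTCA",
}
reads = ["ACGTTAGCCG", "CATGGCAATC", "GGGGGGGGGG"]
counts = count_reads(reads, references)
print(dict(counts))
```

In this sketch the per-reference read counts stand in for abundance estimates. A real tool would also have to handle reads compatible with several references, sequencing errors, and genome length normalization.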
24

Um algoritmo para a construção de vetores de sufixo generalizados em memória externa / External memory generalized suffix array construction algorithm

Louza, Felipe Alves da 17 December 2013
The suffix array is an important data structure used in several string processing problems. In the literature, several approaches have been proposed for external memory suffix array construction. However, these approaches are not specifically aimed at indexing sets of strings; that is, they do not consider generalized suffix arrays. This limitation motivates this master's thesis, which presents eGSA, the first external memory algorithm developed to construct generalized suffix arrays enhanced with the longest common prefix (LCP) array and the Burrows-Wheeler transform (BWT). We focus on the context of bioinformatics, as recent technological advances have increased the volume of available biological data, which are stored as strings. The eGSA algorithm was validated through performance tests with real data from DNA and protein sequences. For the performance tests with sets of large DNA strings, we compared our algorithm with the most efficient related suffix array construction algorithm in the literature, which was adapted to construct generalized arrays. The results demonstrated that our algorithm reduced the time spent by a factor of 3.2 to 8.3 and consumed 50% less memory. For sets of small protein strings, tests were performed only with eGSA since, to the best of our knowledge, there is no related work that can be adapted. Compared to the average time spent indexing sets of large strings, eGSA obtained competitive times for indexing sets of small strings. Therefore, the performance tests demonstrated that the proposed algorithm can be applied efficiently to index both sets of large strings and sets of small strings.
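For readers unfamiliar with the three structures named in the abstract, the following is a minimal in-memory sketch of a generalized suffix array with its LCP array and BWT. It is deliberately naive (it sorts whole suffixes) and is not the eGSA algorithm, which works in external memory. It assumes that the input strings do not contain `$` and that every character sorts after `$`, as ASCII letters do. The function names are illustrative.

```python
def generalized_suffix_array(strings):
    """Sort all suffixes of all strings; each entry is (string_id, offset)."""
    suffixes = [(i, j) for i, s in enumerate(strings) for j in range(len(s) + 1)]
    # Each string gets a '$' terminator; equal suffixes are ordered by string_id.
    return sorted(suffixes, key=lambda p: (strings[p[0]][p[1]:] + "$", p[0]))

def lcp_array(strings, gsa):
    """Length of the common prefix of each suffix and its predecessor in the array."""
    lcp = [0]
    for (i1, j1), (i2, j2) in zip(gsa, gsa[1:]):
        a, b = strings[i1][j1:], strings[i2][j2:]
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        lcp.append(n)  # terminators are never counted as matching
    return lcp

def bwt(strings, gsa):
    """Character preceding each sorted suffix, or '$' for a whole string."""
    return "".join(strings[i][j - 1] if j > 0 else "$" for i, j in gsa)

strings = ["GAT", "AT"]
gsa = generalized_suffix_array(strings)
print(gsa)                      # [(0, 3), (1, 2), (0, 1), (1, 0), (0, 0), (0, 2), (1, 1)]
print(lcp_array(strings, gsa))  # [0, 0, 0, 2, 0, 0, 1]
print(bwt(strings, gsa))        # TTG$$AA
```

The naive sort needs all suffixes in memory and compares them character by character, which is the cost that external memory construction algorithms are designed to avoid on large collections.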
26

Metody pro vylepšení genomového sestavení založené na signálovém zpracování / Signal processing based methods for genome assembly refinement

Jugas, Robin January 2016
The diploma thesis deals with sequencing methods and genome assembly methods, including the use of numerical representations. The theoretical part of the thesis describes the history of DNA research, generations of sequencing methods, the assembly methods themselves, and the definition of numerical representations. Numerical representations convert the character form of DNA to a numerical form and so allow digital signal processing methods to be used. The thesis proposes an algorithm for genome assembly using numerical representations, which is then tested on sequence data.
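As an illustration of what a numerical representation makes possible, the sketch below converts reads to the Voss representation (one binary indicator sequence per nucleotide) and uses cross-correlation to find the offset at which the end of one read best matches the start of another. The choice of the Voss representation and the overlap scoring are assumptions made for this example; the abstract does not say which representation or algorithm the thesis uses.

```python
def voss(seq):
    """Voss representation: one 0/1 indicator sequence per nucleotide."""
    return {base: [1 if c == base else 0 for c in seq] for base in "ACGT"}

def cross_correlation(x, y, shift):
    """Sum of x[n + shift] * y[n] over the positions where both are defined."""
    return sum(x[n + shift] * y[n] for n in range(len(y)) if 0 <= n + shift < len(x))

def best_overlap(read_a, read_b, min_overlap=3):
    """Find the shift where the suffix of read_a best matches the prefix of read_b."""
    va, vb = voss(read_a), voss(read_b)
    best_shift, best_score = None, -1.0
    for shift in range(len(read_a) - min_overlap + 1):
        overlap = min(len(read_a) - shift, len(read_b))
        # Summing the four channel correlations counts the matching bases.
        matches = sum(cross_correlation(va[b], vb[b], shift) for b in "ACGT")
        score = matches / overlap
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score

# Hypothetical reads: the last four bases of the first equal the first four of the second.
print(best_overlap("ACGTTGCA", "TGCAGGTC"))  # (4, 1.0)
```

Because the score is a fraction of matching positions rather than an exact string comparison, the same calculation tolerates a few mismatches in the overlap, which is one reason signal-style comparisons are attractive for noisy reads.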
27

Draft Genome Assembly, Organelle Genome Sequencing and Diversity Analysis of Marama Bean (Tylosema esculentum), the Green Gold of Africa

Li, Jin 26 May 2023
No description available.
28

The Genome of Cañahua: An Emerging Andean Super Grain

Mangelson, Hayley Jennifer 01 May 2019
Chenopodium pallidicaule, known commonly as cañahua, is a semi-domesticated crop grown in high-altitude regions of the Andes. It is an A-genome diploid (2n = 2x = 18) relative of the allotetraploid (AABB) Chenopodium quinoa and shares many of its nutritional benefits. Both species provide complete protein, have a low glycemic index, and offer a wide variety of nutritionally important vitamins and minerals. Because of its minor crop status, few genomic resources for its improvement have been developed. Here we present a fully annotated, reference-quality assembly of cañahua. The reference assembly was developed using a combination of established techniques, including multiple rounds of Hi-C-based proximity-guided assembly. The final assembly consists of 4,633 scaffolds, with 96.6% of the assembly contained in nine scaffolds representing the nine haploid chromosomes of the species. Repetitive element analysis classified 52.3% of the assembly as repetitive, with the most common class (27.3% of the assembly) identified as LTR retrotransposons. MAKER annotation of the assembly yielded 22,832 putative genes with an average length of 4.6 kb. When compared with quinoa, strong patterns of synteny support the hypothesis that cañahua is a close A-genome diploid relative, and thus potentially a model diploid species for genetic analysis and improvement of quinoa. Resequencing and phylogenetic analysis of a diversity panel of 30 cañahua accessions collected from across the Altiplano suggest that coordinated efforts are needed to enhance genetic diversity conservation within ex situ germplasm collections.
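Contiguity figures such as "96.6% of the assembly contained in nine scaffolds" are computed from the list of scaffold lengths. The sketch below shows that calculation together with the commonly reported N50 statistic. The scaffold lengths are made-up toy values, not the cañahua assembly, and the function names are illustrative.

```python
def fraction_in_largest(lengths, n):
    """Fraction of the total assembly length held by the n longest scaffolds."""
    ordered = sorted(lengths, reverse=True)
    return sum(ordered[:n]) / sum(ordered)

def n50(lengths):
    """Length L such that scaffolds of length >= L cover at least half the assembly."""
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_lengths = [50, 40, 30, 5, 3, 2]
print(round(fraction_in_largest(toy_lengths, 3), 3))  # 0.923
print(n50(toy_lengths))                               # 40
```

For a chromosome-scale assembly, setting `n` to the haploid chromosome number gives the share of sequence anchored in chromosome-sized scaffolds, which is the kind of figure the abstract reports.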
29

Décodage de l'expression de gènes cryptiques / Decoding the expression of cryptic genes

Moreira, Sandrine 08 1900
Thanks to new high-throughput sequencing technologies and automatic annotation pipelines, proceeding from an Eppendorf tube to a GenBank file can be achieved in a single mouse click or so, for some species. Others, however, fiercely resist bioinformaticians with their confounding genomic complexity. Diplonemids are one of them. My thesis is centered on the discovery of new strategies for encrypting genetic information in these eukaryotes, and the identification of the molecular decoding processes. Diplonemids are a group of poorly studied marine protists. Unexpectedly, metagenomic studies have recently ranked this group as one of the most diverse in the oceans. Yet their most distinctive feature is their multipartite mitochondrial genome with genes in pieces, and encryption by nucleotide deletions and substitutions. Genes are decrypted at the RNA level through three processes: (i) trans-splicing, (ii) polyuridylation at the junction of gene pieces, and (iii) substitutions of A-to-I and C-to-T. Such a diverse arsenal of mitochondrial post-transcriptional processes is highly exceptional.
Using a bioinformatics approach, I have fully reconstructed the mitochondrial transcriptome from high-throughput RNA-seq data. We have identified six new genes, including one that presents alternative trans-splicing isoforms. In total, there are 216 positions edited by polyuridylation in 14 genes (with up to 29 uridines per position), and 114 positions edited by deamination (A-to-I or C-to-T) in seven genes (nad4, nad7, rns, y1, y2, y3, y5). To identify the machinery that processes mitochondrial RNAs, the nuclear genome was sequenced; I then assembled and annotated it. This machinery is probably unique and complex, because no cis signal or trans actor typical of known splicing machineries has been found. I have identified promising protein candidates that are worth testing experimentally, notably RNA ligases, numerous members of the PPR family, which is involved in RNA editing in plant organelles, and deaminases. During my thesis, we have identified new types of post-transcriptional RNA processing in diplonemid mitochondria and identified promising new candidates for the machinery. A system capable of precisely joining or editing RNAs could find biotechnological applications. From an evolutionary perspective, the discovery of new molecular systems gives insight into the process of gene recruitment, adaptation to new functions, and the establishment of complex molecular machineries.
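Substitution editing of the kind described above is typically found by comparing the genomic sequence of a gene with the sequence of its transcript. The sketch below is a toy version of that comparison and is not the pipeline used in the thesis. It assumes the two sequences are already aligned and of equal length, so it ignores uridine insertions, and it relies on the fact that inosine is read as G in sequencing data. The sequences and function names are hypothetical.

```python
def substitution_edits(genomic, transcript):
    """List (position, genomic_base, transcript_base) where two aligned sequences differ."""
    if len(genomic) != len(transcript):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [(i, g, t) for i, (g, t) in enumerate(zip(genomic, transcript)) if g != t]

def classify(edit):
    """Name the editing type implied by one substitution."""
    _, g, t = edit
    if (g, t) == ("A", "G"):
        return "A-to-I"  # inosine is read as G by sequencers
    if (g, t) == ("C", "T"):
        return "C-to-T"
    return "other"

# Hypothetical aligned sequences (transcript written as cDNA, so U appears as T).
genomic = "ACCATGCA"
transcript = "GCTATGCA"
edits = substitution_edits(genomic, transcript)
print([(pos, classify((pos, g, t))) for pos, g, t in edits])
# [(0, 'A-to-I'), (2, 'C-to-T')]
```

In practice, mismatches would have to be supported by many reads before being called edits, since sequencing errors and genomic variants produce the same signal in a single comparison.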
30

Understanding divergent evolution through comparative genomics

Kolora, Sree Rohit Raj 07 January 2019
No description available.
