Global ETD Search

1	Characterization of genome-wide deviations from Mendelian inheritance in bivalve species Peñaloza Navarro, Carolina Soledad January 2018 Marine bivalves are a group of species composed of clams, mussels and oysters. Bivalves are keystone species in coastal ecosystems and represent an increasingly important segment of the global aquaculture industry. Domestication of shellfish species is in the early stages, with few organized breeding programmes and a heavy reliance on wild seed. Consequently, the development and use of genomic markers may significantly assist shellfish aquaculture breeding and production. However, molecular genetic markers typically exhibit unusual patterns of segregation in bivalve species, which result in deviations from Mendelian expectations, and could potentially limit their use in parental assignment, mapping of quantitative trait loci and genomic prediction. Previous studies have suggested that segregation distortions originate at the larval stage, as a result of the linkage of markers to deleterious mutations. This high genetic load has been associated with the high fecundity of bivalve species. However, no direct evidence of a high incidence of de novo mutations has been provided. The aim of this thesis is to gain further insight into segregation distortions in bivalve species by studying the phenomenon at a genome-wide scale, using modern high-throughput sequencing technology. The studies presented in this thesis derive from experiments involving genotyping of parents and offspring from pair-crosses of three different bivalve species (the Pacific oyster Crassostrea gigas, the blue mussel Mytilus edulis, and the Greenshell™ mussel Perna canaliculus) using high-throughput sequencing and SNP arrays. The parent and offspring genotype data were used to characterize patterns of segregation distortion at a genome-wide level, followed by exploratory analyses to test hypotheses related to possible causes of this distortion.
Three main findings resulted from the genome-wide analysis of segregation patterns. First, by using Restriction site Associated DNA sequencing (RAD-Seq) we observe that technical artefacts are more widespread than previously considered, contributing to apparent distortions via unreliable genotype calls. By analysing read depth data from RAD-Seq, we suggest that apparent homozygous genotype calls may actually be hemizygous, suggesting a very high frequency of null alleles which contribute to distorted segregation patterns. Bioinformatic pipelines to improve RAD-Seq locus assembly and marker genotyping for bivalve species are presented. Second, by using a high-density SNP array and RAD-Seq in pair crosses of Pacific oyster and aligning to the reference genome assembly, we find that segregation distortions cover extensive regions of the genome, and that certain genomic regions are consistently distorted in different families. Finally, following previous suggestions that the reproductive strategies of bivalve species may favour a high mutation rate, we provide preliminary evidence of a high incidence of de novo mutations that appear spontaneously (i) during male and female gamete formation and (ii) post-zygotically, during larval development. This putative high de novo mutation rate is likely to also contribute to deviations from Mendelian inheritance patterns in these species. New genomic technologies have allowed us to gain substantial insight into the intriguing yet poorly understood phenomena related to inheritance in bivalve species. The results have both fundamental and practical implications for genetic analysis interpretation and selective breeding for aquaculture in this large and highly diverse group of species.
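As a rough illustration of what a "deviation from Mendelian expectations" means for a single marker, the sketch below applies a standard chi-square goodness-of-fit test to offspring genotype counts from a pair cross. The abstract does not state which test the thesis used, so the choice of test, the function names, and the genotype counts are all assumptions made for illustration only.

```python
import math


def chi_square_stat(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for observed genotype counts
    against an expected Mendelian ratio such as (1, 2, 1)."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    stat = 0.0
    for obs, ratio in zip(observed, expected_ratio):
        expected = total * ratio / ratio_sum
        stat += (obs - expected) ** 2 / expected
    return stat


def chi_square_p_value(stat, df):
    """Upper-tail probability of the chi-square distribution.
    Closed forms exist for 1 and 2 degrees of freedom, which cover
    the 1:1 and 1:2:1 ratios of a biallelic marker in a pair cross."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def segregation_test(observed, expected_ratio):
    """Return (statistic, p-value) for one marker."""
    stat = chi_square_stat(observed, expected_ratio)
    return stat, chi_square_p_value(stat, len(observed) - 1)


if __name__ == "__main__":
    # Hypothetical counts of AA, AB and BB offspring from an AB x AB cross.
    # The deficit of AA homozygotes gives a statistic of 18 on 2 df.
    stat, p = segregation_test((10, 50, 40), (1, 2, 1))
    print(f"chi-square = {stat:.2f}, p = {p:.2e}")
```

A genome-wide scan would repeat this test for every marker and correct for multiple testing. As the abstract notes, a significant result may reflect genotyping artefacts such as null alleles rather than a biological distortion.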
2	Hypothesis-free detection of genome-changing events in pedigree sequencing Garimella, Kiran January 2016 In high-diversity populations, a complete accounting of de novo mutations can be difficult to obtain. Most analyses involve identifying such mutations by sequencing pedigrees on second-generation sequencing platforms and aligning the short reads to a reference assembly, the genomic sequence of a canonical member (or members) of a species. Often, large regions of the genomes under study may be greatly diverged from the reference sequence, or not represented at all (e.g. the HLA, antigenic genes, or other regions under balancing selective pressure). If the haplotypic background upon which a mutation occurs is absent, events can easily be missed (as reads have nowhere to align) and false-positives may abound (as the software forces the reads to align elsewhere). This thesis presents a novel method for de novo mutation discovery and allele identification. Rather than relying on alignment, our method is based on the de novo assembly of short-read sequence data using a multi-color de Bruijn graph. In this data structure, each sample is assigned a unique index (or "color"), reads from each sample are decomposed into smaller subsequences of length k (or "kmers"), and color-specific adjacency information between kmers is recorded. Mutations can be discovered in the graph itself by searching for characteristic motifs (e.g. "bubble motifs", indicative of a SNP or indel, and "linear motifs", indicative of allelic and non-allelic recombination). De novo mutations differ from inherited mutations in that the kmers spanning the variant allele are absent in the parents; in a sense, they facilitate their own discovery by generating "novel" sequence. We exploit this fact to limit processing of the graph to only those regions containing these novel kmers. We verified our approach using simulations, validation, and visualization.
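The novel-kmer idea described above can be sketched in a few lines. This is a toy illustration, not the thesis software: the function names and the reads are hypothetical, and it records only which samples ("colors") contain each kmer. The adjacency information of a real multi-color de Bruijn graph, reverse-complement handling, and sequencing-error cleaning are all omitted.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_colored_kmers(samples, k):
    """Map every kmer to the set of colors (sample names) in which it occurs.
    `samples` maps a sample name to a list of read strings."""
    colored = {}
    for color, reads in samples.items():
        for read in reads:
            for kmer in kmers(read, k):
                colored.setdefault(kmer, set()).add(color)
    return colored


def novel_kmers(colored, child, parents):
    """Kmers present in the child but absent from every parent."""
    parent_colors = set(parents)
    return {
        kmer
        for kmer, colors in colored.items()
        if child in colors and not (colors & parent_colors)
    }


if __name__ == "__main__":
    # Hypothetical reads: the child carries a single A->T change
    # relative to both parents.
    samples = {
        "mother": ["ACGTACGGA"],
        "father": ["ACGTACGGA"],
        "child": ["ACGTTCGGA"],
    }
    colored = build_colored_kmers(samples, k=3)
    print(sorted(novel_kmers(colored, "child", ["mother", "father"])))
```

With k = 3, the single-base change yields three novel kmers (GTT, TTC, TCG), each spanning the variant position. Flagging the regions that contain such kmers is the step that limits where the graph needs to be searched for motifs.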
On the simulations, we developed genome and read generation software driven by empirical distributions computed from real data to emit genomes with realistic features: recombinations, de novo variants, read fragment sizes, sequencing errors, and coverage profiles. In 20 artificial samples, we determined our sensitivity and specificity for novel kmer recovery to be approximately 98% and 100% at worst, respectively. Not every novel stretch can be reconstituted as a variant, owing to errors and homology in the graph. In simulations, our false discovery rate was 10% for "bubble" events and 12% for "linear" events. On validation, we obtained a high-quality draft assembly for a single P. falciparum child using a third-generation sequencing platform. We discovered three de novo events in the draft assembly, all three of which are recapitulated in our calls on the second-generation sequencing data for the same sample; no false-positives are present. On visualization, we developed an interactive web application capable of rendering a multi-color subgraph that assists in visually distinguishing between true variation and sequencing artifacts. We applied our caller to real datasets: 115 progeny across four previously analyzed experimental crosses of Plasmodium falciparum. We demonstrate our ability to access subtelomeric compartments of the genome, regions harboring antigenic genes under tremendous selective pressure, thus highly divergent between geographically distinct isolates and routinely masked and ignored in reference-based analyses. We also show our caller's ability to recover an important form of structural de novo variation: non-allelic homologous recombination (NAHR) events, an important mechanism for the pathogen to diversify its own antigenic repertoire. We demonstrate our ability to recover the few events in these samples known to exist, and overturn some previous findings indicating exchanges between "core" (non-subtelomeric) genes.
We compute the SNP mutation rate to be approximately 2.91 per sample, insertion and deletion mutation rates to be 0.55 and 1.04 per sample, respectively, multi-nucleotide polymorphisms to be 0.72 per sample, and NAHR events to be 0.33 per sample. These findings are consistent across crosses. Finally, we investigated our method's scaling capabilities by processing a quintet of previously analyzed Pan troglodytes verus (western chimpanzee) samples. The genome of the chimpanzee is two orders of magnitude larger than the malaria parasite's (3,300 Mbp versus 23 Mbp), diploid rather than haploid, poorly assembled, and the read dataset is lower coverage (20x versus 120x). Comparing to Sequenom validation data as well as visual validation, our sensitivity is expectedly low. However, this can be attributed to overaggressiveness in data cleaning applied by the de novo assembler atop which our software is built. We discuss the precise changes that would likely need to be made in future work to adapt our method to low-coverage samples.
3	Anomalies du tube neural : mieux comprendre les causes génétiques de cette pathologie complexe [Neural tube defects: better understanding the genetic causes of this complex pathology] Lemay, Philippe 08 1900 No description available. Keywords: Neural tube defects; GRHL3; SHROOM3; Whole exome sequencing; Complex genetics; De novo mutations
