1

LD-based SNP and genotype calling from next-generation sequencing reads

Menelaou, Androniki January 2012
Next-generation sequencing is revolutionising genetics: base-by-base information for the whole genome is now available for large samples of individuals. This type of data is becoming commonly used and will continue to be in the near future. One of the first questions arising is the identification of novel variants and, subsequently, genotype calling for the individuals in the sample. However, given the cost of sequencing, most projects so far have sequenced individuals at low to medium coverage. In this thesis, we present two distinct methods for SNP and genotype calling from low-coverage sequencing data, TreeCall and MVNcall, that combine sequencing and Linkage Disequilibrium (LD) information. We begin by describing the pipeline for next-generation sequencing analysis and existing methods for SNP and genotype calling using low-coverage sequencing information. Subsequently, we present the two novel LD-based methods for SNP and genotype calling. Both methods assume a study design in which the individuals are both genotyped and sequenced at low coverage. The genotypes are used to construct a haplotype scaffold, from which the LD information is extracted either by constructing genealogical trees (TreeCall) or by approximating a window of contiguous SNPs of the scaffold with a multivariate normal distribution (MVNcall). Both methods have been applied to real datasets from the 1000 Genomes project and compared with other LD-based methods applied to the same datasets, mainly in terms of genotype calling and phasing. Whereas TreeCall gives lower genotype concordance rates than the other methods, MVNcall provides the highest genotype concordance rates for a dataset with a small sample size (the low-coverage pilot of the 1000 Genomes project).
Applied to a larger dataset (Phase 1 of the 1000 Genomes project), MVNcall achieves an overall genotype discordance rate of 0.58%, whereas SNPTools achieves 0.57%, Thunder 0.56%, and BEAGLE 0.61% (comparison based on the Axiom chip). The main advantage of MVNcall is its phasing accuracy: by using a haplotype scaffold, especially one phased using pedigree information, it provides accurate haplotypes. MVNcall is also extended to incorporate trio information for genotype calling. An experiment on a deeply sequenced trio yields an accurate set of haplotypes for the trio, with switch error rates as low as ~0.28 for the parents and ~0.12 for the offspring.
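The abstract above compares methods by genotype discordance rate and switch error rate without defining them. The following is a minimal sketch of how these two metrics are commonly computed, not the evaluation code used in the thesis. The function names, the 0/1/2 alternate-allele-count genotype encoding, and the use of `None` for missing calls are assumptions made for illustration.

```python
from typing import Optional, Sequence


def genotype_discordance(called: Sequence[Optional[int]],
                         truth: Sequence[Optional[int]]) -> float:
    """Fraction of comparable sites where the called genotype differs from truth.

    Genotypes are assumed to be alternate-allele counts (0, 1, 2);
    a site is skipped if either value is None (missing).
    """
    pairs = [(c, t) for c, t in zip(called, truth)
             if c is not None and t is not None]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(c != t for c, t in pairs) / len(pairs)


def switch_error_rate(inferred_hap: Sequence[int],
                      true_hap: Sequence[int],
                      true_other: Sequence[int]) -> float:
    """Switches between consecutive heterozygous sites, per opportunity.

    inferred_hap is one inferred haplotype; true_hap and true_other are
    the two true haplotypes of the same individual. Only sites that are
    heterozygous in the truth carry phase information.
    """
    het = [i for i in range(len(true_hap)) if true_hap[i] != true_other[i]]
    if len(het) < 2:
        raise ValueError("need at least two heterozygous sites")
    # True where the inferred haplotype follows true_hap, False where it
    # follows true_other; a change between neighbours is one switch.
    agree = [inferred_hap[i] == true_hap[i] for i in het]
    switches = sum(a != b for a, b in zip(agree, agree[1:]))
    return switches / (len(het) - 1)
```

Under these definitions, a discordance rate of 0.58% means that roughly 58 in every 10,000 compared genotypes disagree with the reference chip genotypes.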
2

Parallel Processing of Large Scale Genomic Data

Kutlu, Mucahid 09 October 2015
No description available.
3

Comparison of transcriptomes by next-generation sequencing in head tissues of two fruit fly species, Anastrepha fraterculus and Anastrepha obliqua

Rezende, Victor Borges 28 February 2014
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) / We studied patterns of gene expression in two closely related species of fruit flies of the genus Anastrepha (Diptera: Tephritidae), A. fraterculus and A. obliqua, with the goal of finding candidate genes related to the recent differentiation process between these species. To do this, we used next-generation sequencing (NGS) with the RNA-Seq methodology on head tissues of the two species at different stages of reproductive life for both sexes. After processing and removing low-quality reads, we retained over 140 million paired-end reads. These sequences were assembled into individual transcriptomes for each species and a pooled transcriptome, with 154,787 contigs, representing both species. Based on the results of the assemblies, annotation and mapping, we prepared two separate manuscripts: one describing the libraries for each species and a combined analysis, and a second that investigates the contigs involved in species differences and their patterns of expression. These data reveal 1991 genes with differential expression in at least one comparison among different reproductive stages. Notably, we observed twice as many differentially expressed genes when contrasting males as when contrasting females. Several of these genes were associated with olfaction, such as odorant-binding proteins and visgun, suggesting that behavioral changes in food sources, mating choice, or breeding and oviposition sites might be involved in species differences. We also identified two large sets of genes that are differentially expressed between the species, one underexpressed in A. obliqua and overexpressed in A. fraterculus and the other with the reverse pattern. We used a differentiation index of SNPs per contig (), pairwise tests of positive selection between the sequences, and analysis of the type of amino acid substitution caused by SNPs to identify seven genes that are candidates for involvement in the speciation process between A. fraterculus and A. obliqua. / We investigated gene expression patterns in head tissues of two closely related fruit fly species of the genus Anastrepha (Diptera: Tephritidae), A. fraterculus and A. obliqua, identifying SNPs with a high degree of differentiation between the species and carrying out evolutionary analyses in order to find candidate genes related to the recent separation of these species. To this end, we used high-throughput next-generation sequencing (NGS) technologies with the RNA-Seq methodology on head tissues of the two species at different stages of reproductive life in both sexes. After processing and removing low-quality sequences, we used more than 140 million paired-end reads to assemble a joint transcriptome of the species, with more than 154 thousand contigs, and two further transcriptomes, one per species.
These results are presented in two separate manuscripts, the first describing the libraries produced for the different species and the second investigating expression patterns and genes involved in the differentiation of the species. These data revealed 1991 genes with differential expression in at least one comparison of reproductive life stage between the species, and we found twice as many genes differing in expression between the species in males as in females. Several of these genes were associated with olfaction-related genes, such as the Obp (odorant-binding protein) gene family and the gene visgun, which may indicate a behavioral change in preference for food, mates, or mating and oviposition sites. We also found two sets of genes that are differentially expressed between the species, one overexpressed in A. obliqua and underexpressed in A. fraterculus and the other with the reverse pattern. Analyses of selection patterns in these genes point to seven that show evidence of positive selection and at least one SNP with high differentiation indices between the species.
