  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Identificação de polimorfismos em região do cromossomo 3 da galinha associado ao desempenho de deposição de gordura / Identification of polymorphisms in a region of chicken chromosome 3 associated with the performance of the fat deposition

Moreira, Gabriel Costa Monteiro 12 February 2014
Eighteen chickens from the parental generation of a reciprocal cross between broiler (TT) and layer (CC) lines were sequenced on the Illumina next-generation platform at an average coverage of 10X. Genetic variant discovery was performed in a quantitative trait locus (QTL) region previously associated with abdominal fat weight and percentage on chicken chromosome 3 (GGA3), between the microsatellite markers LEI0161 and ADL0371 (33,595,706-42,632,651 bp). SAMtools was used to detect 136,054 unique SNPs and 15,496 unique INDELs in the 18 chickens; after quality filtering, 92,518 unique SNPs and 9,298 unique INDELs were retained.
A list of 77 genes was analyzed to search for genes related to lipid metabolism. Variants located in coding regions (386 SNPs and 15 INDELs) were identified and associated with important metabolic pathways. Loss-of-function variants in the genes LOC771163, EGLN1, GNPAT, FAM120B, THBS2 and GGPS1 may be responsible for the association of this QTL with fat deposition in the chicken carcass.
82

Identificação de polimorfismos em região do cromossomo 2 da galinha associado a deposição de músculo / Identification of polymorphisms in the chicken chromosome 2 region associated with muscle deposition

Godoy, Thaís Fernanda 13 February 2014
Brazilian chicken meat production is of great economic importance worldwide, mainly because of advances in breeding. Next-generation sequencing has become a powerful tool because the identification of SNPs (single nucleotide polymorphisms) and INDELs (insertions/deletions) adds new information for genetic improvement. Muscle deposition, particularly of breast muscle, is one of the most noteworthy traits because of its nutritional and economic importance. The aim of this study was therefore to resequence the genomes of 18 chickens from two distinct experimental lines, identify SNPs and INDELs in a QTL region on chromosome 2 previously associated with breast muscle deposition, characterize potentially functional variants, and propose candidate mutations for future studies.
To achieve these objectives, eighteen chickens from two experimental lines (broiler and layer), both developed by Embrapa Swine and Poultry, were sequenced on the Illumina next-generation platform. SNPs and INDELs were identified with bioinformatic tools in a QTL region on chicken chromosome 2 (105,848,755-112,648,761 bp) previously associated with breast muscle deposition. Sequencing of the eighteen animals generated around 2.7 billion reads, of which 77% were retained after quality filtering. The reads were aligned against the chicken reference genome (Gallus_gallus-4.0, NCBI) with Bowtie2, giving an average coverage of 10.6X across the target region. Using SAMtools, 722,832 SNPs and 63,727 INDELs were identified across the 18 individuals; after stringent filtering, 77% of the SNPs (n=558,767) and 60% of the INDELs (n=38,402) were retained. Functional annotation was performed with ANNOVAR on the variants unique across the 18 animals (85,765 SNPs and 7,828 INDELs). Among the non-synonymous (n=153) and stopgain (n=3) SNPs, fifteen were predicted to be deleterious. One of the deleterious SNPs, already deposited in a public database, was identified in the RB1CC1 gene, whose function is related to breast muscle development. The DAVID tool was used to analyze 37 genes associated with non-synonymous SNPs, stopgain SNPs, and frameshift and non-frameshift INDELs. Of these genes, three (DTNA, RB1CC1 and C-MOS) were selected because their functions relate to muscle development, and their mutations were analyzed. Further association studies and validation in commercial populations can therefore be performed on these candidate genes and mutations, allowing a better explanation of the effect of the QTL studied.
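
The retention percentages and the size of the target interval quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts stated above; the variable and function names are illustrative, and treating the interval coordinates as inclusive is an assumption.

```python
# Counts quoted in the abstract (SAMtools calls before and after filtering).
snps_called, snps_kept = 722_832, 558_767
indels_called, indels_kept = 63_727, 38_402

# QTL interval on chicken chromosome 2, in base pairs.
region_start, region_end = 105_848_755, 112_648_761


def retained_fraction(kept: int, called: int) -> float:
    """Share of called variants that survive filtering."""
    return kept / called


snp_fraction = retained_fraction(snps_kept, snps_called)
indel_fraction = retained_fraction(indels_kept, indels_called)

# Assumption: both end coordinates are part of the interval (inclusive).
region_length = region_end - region_start + 1

print(f"SNPs retained:   {snp_fraction:.1%}")
print(f"INDELs retained: {indel_fraction:.1%}")
print(f"Region length:   {region_length:,} bp")
```

Rounded to whole percentages, the two fractions agree with the 77% and 60% reported in the abstract, and the interval spans roughly 6.8 Mb.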
83

Understanding inflammatory bowel disease using high-throughput sequencing

de Lange, Katrina Melanie January 2017
For over two decades, the study of genetics has been making significant progress towards understanding the causes of common disease. Across a wide range of complex disorders there have been hundreds of associated loci identified, largely driven by common genetic variation. Now, with the advent of next-generation sequencing technology, we are able to interrogate rare and low frequency variation in a high throughput manner for the first time. This provides an exciting opportunity to investigate the role of rarer variation in complex disease risk on a genome-wide scale, potentially offering novel insights into the biological mechanisms underlying disease pathogenesis. In this thesis I will assess the potential of this technology to further our understanding of the genetics of complex disease, using inflammatory bowel disease (IBD) as an example. After first reviewing the history of genetic studies into IBD, I will describe the analytical challenges that can occur when using sequencing to perform case-control association testing at scale, and the methods that can be used to overcome these. I then test for novel IBD associations in a low coverage whole genome sequencing dataset, and uncover a significant burden of rare, damaging missense variation in the gene NOD2, as well as a more general burden of such variation amongst known inflammatory bowel disease risk genes. Through imputation into both new and existing genotyped cohorts, I also describe the discovery of 26 novel IBD-associated loci, including a low frequency missense variant in ADCY7 that approximately doubles the risk of ulcerative colitis. I resolve biological associations underlying several of these novel associations, including a number of signals associated with monocyte-specific changes in integrin gene expression following immune stimulation.
These results reveal important insights into the genetic architecture of inflammatory bowel disease, and suggest that a combination of continued array-based genome-wide association studies, imputed using substantial new reference panels, and large scale deep sequencing projects will be required in order to fully understand the genetic basis of complex diseases like IBD.
84

Statistical Methods for Characterizing Genomic Heterogeneity in Mixed Samples

Zhang, Fan 12 December 2016
Recently, sequencing technologies have generated massive and heterogeneous data sets. However, interpretation of these data sets is a major barrier to understanding genomic heterogeneity in complex diseases. In this dissertation, we develop a Bayesian statistical method for single nucleotide level analysis and a global optimization method for gene expression level analysis to characterize genomic heterogeneity in mixed samples. The detection of rare single nucleotide variants (SNVs) is important for understanding genetic heterogeneity using next-generation sequencing (NGS) data. Various computational algorithms have been proposed to detect variants at the single nucleotide level in mixed samples. Yet, the noise inherent in the biological processes involved in NGS technology necessitates the development of statistically accurate methods to identify true rare variants. At the single nucleotide level, we propose a Bayesian probabilistic model and a variational expectation maximization (EM) algorithm to estimate non-reference allele frequency (NRAF) and identify SNVs in heterogeneous cell populations. We demonstrate that our variational EM algorithm has sensitivity and specificity comparable to those of a Markov Chain Monte Carlo (MCMC) sampling inference algorithm, and is more computationally efficient on tests of relatively low coverage (27x and 298x) data. Furthermore, we show that our model with a variational EM inference algorithm has higher specificity than many state-of-the-art algorithms. In an analysis of a directed evolution longitudinal yeast data set, we are able to identify a time-series trend in non-reference allele frequency and detect novel variants that have not yet been reported. Our model also detects the emergence of a beneficial variant earlier than was previously shown, and a pair of concomitant variants.
Characterization of heterogeneity in gene expression data is a critical challenge for personalized treatment and drug resistance due to intra-tumor heterogeneity. Mixed membership factorization has become popular for analyzing data sets that have within-sample heterogeneity. In recent years, several algorithms have been developed for mixed membership matrix factorization, but they only guarantee estimates from a local optimum. At the gene expression level, we derive a global optimization (GOP) algorithm that provides a guaranteed epsilon-global optimum for a sparse mixed membership matrix factorization problem for molecular subtype classification. We test the algorithm on simulated data and find the algorithm always bounds the global optimum across random initializations and explores multiple modes efficiently. The GOP algorithm is well-suited for parallel computations in the key optimization steps.
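
The abstract above describes estimating the non-reference allele frequency (NRAF) with a Bayesian model. As a rough illustration only, the sketch below uses a much simpler single-site Beta-Binomial model with a closed-form posterior; it is not the dissertation's model or its variational EM algorithm. The uniform Beta(1, 1) prior, the function name and the read counts are all assumptions made for the example; only the read depth of 298 is borrowed from the coverage quoted in the abstract.

```python
def nraf_posterior(non_ref_reads: int, depth: int,
                   alpha: float = 1.0, beta: float = 1.0):
    """Return (a, b, mean) of the Beta posterior for the non-reference
    allele frequency at one site, given a Beta(alpha, beta) prior and
    a binomial likelihood for the non-reference read count."""
    if not 0 <= non_ref_reads <= depth:
        raise ValueError("non_ref_reads must lie between 0 and depth")
    a = alpha + non_ref_reads            # prior pseudo-count + observed non-reference reads
    b = beta + depth - non_ref_reads     # prior pseudo-count + observed reference reads
    return a, b, a / (a + b)


# Hypothetical site: 3 non-reference reads out of 298.
a, b, mean = nraf_posterior(non_ref_reads=3, depth=298)
print(f"posterior Beta({a:g}, {b:g}), mean NRAF = {mean:.4f}")
```

A conjugate model like this treats each site independently and ignores sequencing error structure, which is the kind of noise the dissertation's hierarchical approach is intended to handle.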
85

Evaluation of Next-Generation Sequencing as a clinical and research modality in the diagnosis of hereditary breast cancer

Dougherty, Kristen Elizabeth 08 April 2016
Next-Generation Sequencing has opened the doors to nearly limitless amounts of genomic data, but the clinical utility of this data is not yet clear. By examining sequencing data of known familial cancer genes in hereditary cancer patients, the NCGENES study found a clear molecular diagnosis in about 5% of patients and an uncertain molecular result in about 15% of patients. The remaining 80% of hereditary cancer patients received a negative result for the screening of known cancer genes. These latter patients were followed up by whole exome sequencing analysis, and the data was used to perform a research sweep to potentially identify mutation(s) in gene(s) that have yet to be clearly associated with their phenotype. Hereditary breast cancer has a relatively well-established set of susceptibility genes, yet a large percentage of the molecular etiology is still unknown. There are many genes that are good candidates for breast cancer genes based on their protein's function, but they may not actually contribute to breast cancer susceptibility. The ClinGen consortium is aiming to establish the clinical validity of gene-disease associations so that clinicians and patients can better interpret and utilize sequencing results. Six breast cancer susceptibility genes were evaluated using the ClinGen clinical validity framework with the goal of both evaluating the genes already on hereditary breast cancer panels and evaluating genes not yet widely tested to determine if there is enough evidence to support their role in disease to warrant widespread testing. These genes have varying levels of evidence supporting their role in breast cancer susceptibility. The variants in each of the six genes were compared between a cancer patient cohort and a non-cancer patient cohort enrolled in the NCGENES whole exome sequencing study.
One likely pathogenic variant and several variants of unknown significance (VUSs) were identified in various genes, and the burden of variants in cancer cases versus controls was evaluated, although the controls were not matched to the cancer cohort in any way. Research sweeps were performed for patients with VUSs to ensure that there were no other mutations in genes that would better fit the phenotype. This thesis presents a method for evaluating gene-disease associations and for utilizing whole exome sequencing data to pinpoint a molecular diagnosis in hereditary breast cancer patients. Overall, it was found that the ClinGen method of evaluating clinical validity of gene-disease associations could be helpful when determining if variants are pathogenic or benign. A new gene, RINT1, was found to have enough evidence to be moderately associated with hereditary breast cancer and it was subsequently added to the diagnostic list so that all cancer patients will now be screened for RINT1 variants. In addition, it was found that two of the genes currently on the diagnostic list, RAD51C and RAD51D, have "disputed" evidence with respect to breast cancer susceptibility. Interestingly, they have much more evidence for an association with ovarian cancer, so if variants are found in these genes, the patient's phenotype should be considered when evaluating them. It was also shown that PALB2, an established breast cancer susceptibility gene, indeed is definitively associated with breast cancer, and the NCGENES cancer patients have more truncating variants than the controls, further validating the clinical validity assertion. Finally, an ovarian cancer patient with two interesting variants, one in SLX4 and one in GEN1, was evaluated. Studies showed that knocking out both of these genes' pathways was highly destructive to the cell.
A VUS was found in each of these genes, and it was hypothesized that perhaps these two variants together may be sufficient to contribute to this patient's cancer susceptibility.
86

New approaches for measuring fitness of Plasmodium falciparum mutations implicated in drug resistance

Carrasquilla, Manuela January 2019
The repeated emergence of drug resistance in Plasmodium falciparum underscores the importance of understanding the genetic architecture of current resistance pathways, as well as any associated fitness costs. Why resistance emerges in particular regions of the world has been linked to particular genetic backgrounds that better tolerate resistance-associated polymorphisms; this is likely to play a key role in driving the epidemiology of drug resistance, but it is infrequently studied at a large scale in a laboratory setting. The first results chapter establishes a barcoding approach for P. falciparum with the aim of tracking parasite growth in vitro. The strategy used was adapted for P. falciparum by using a pseudogene (PfRh3) as a safe harbour to insert unique molecular barcodes. These libraries of barcoded P. falciparum vectors were also used as a readout of transfection efficiency. The second chapter establishes a proof of principle for phenotyping by barcode sequencing, using a panel of barcoded parasites generated in different genetic backgrounds that comprise sufficient genetic diversity to pilot the method. These were grown in the presence and absence of antimalarial compounds, and growth phenotypes were measured in parallel using BarSeq. The third results chapter studies the contribution of mutations in Pfkelch13, a molecular marker of artemisinin resistance, to parasite fitness. Combining CRISPR/Cas9-based genome editing and high throughput sequencing, the impact of Pfkelch13 alleles on fitness in the context of particular strain backgrounds is revealed. In particular, it examines the impact of genetic background on the emergence and spread of drug-resistant lineages (referred to as KEL1) in Southeast Asia carrying a Y580 Pfkelch13 allele.
Overall, given the current pace of genome sequencing of pathogenic organisms such as P. falciparum, it will be important to increase the scale of experimental genetics, in order to tackle in real-time natural variation that might be under constant selection from drugs, thus anticipating the emergence of drug resistance in changing parasite populations. Through this work, tools were developed to facilitate parallel phenotyping by measuring in vitro growth using high-throughput sequencing. The work also develops novel approaches to address the importance of genetic background and a potential role for positive epistasis in a lineage responsible for the recent outbreak of drug-resistant malaria in Southeast Asia.
88

Estudo da diversidade dos genes MC1R e SLC24A5 em populações globais: avaliação de aspectos evolutivos e ambientais / MC1R and SLC24A5 gene diversity among global populations: assessment of environmental and evolutionary aspects

Marano, Leonardo Arduino 11 December 2015
Among the various existing genetic markers, some SNPs (single nucleotide polymorphisms) may be associated with a series of phenotypic characteristics (skin, eye and hair color, height, face shape, hair thickness), and their prediction could be of great value in forensic investigations. Among the major genes known to control human pigmentation through melanin production, SLC24A5 (solute carrier family 24, member 5) and MC1R (melanocortin 1 receptor) play a fundamental role in melanogenesis. The advent of next-generation sequencing has enabled several genomic regions and individuals to be processed simultaneously, increasing the availability and accuracy of whole-genome data, as achieved by studies such as the 1000 Genomes Project. Our preliminary analyses showed that Phase 3 data from the 1000 Genomes Project are far more reliable than previous versions. Using these data, obtained for 26 global populations, population analyses (haplotype diversity, linkage disequilibrium, haplotype networks and molecular variance) were performed for MC1R and SLC24A5 to better understand their diversity patterns and to correlate them with probable natural selection events that may have shaped their evolutionary history, such as UV radiation intensity in different geographical regions. Some patterns were found among the African, European and Asian groups, in accordance with the evolutionary history already described for these groups. Nevertheless, a strong pattern of selective sweep was observed for SLC24A5 in our results, with high linkage disequilibrium and low haplotype diversity.
Analyses of SLC24A5 diversity in relation to UV incidence zones, however, did not show a clear correlation between UV radiation intensity and haplotype diversity.
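
Haplotype diversity, one of the statistics named in this abstract, is commonly computed with Nei's estimator, H = n/(n-1) * (1 - sum of squared haplotype frequencies). The abstract does not say which estimator was used, so the sketch below is an assumption, and the haplotype labels are made-up placeholders rather than data from the thesis.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a sample of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq_freq)


# Hypothetical samples: one dominated by a single haplotype, one evenly mixed.
swept = ["A"] * 9 + ["B"]
mixed = ["A", "B", "C", "D", "E"] * 2

h_swept = haplotype_diversity(swept)
h_mixed = haplotype_diversity(mixed)
print(f"dominated sample: H = {h_swept:.3f}")
print(f"mixed sample:     H = {h_mixed:.3f}")
```

A sample dominated by one haplotype gives a low H, which is the signature the abstract pairs with high linkage disequilibrium when describing a selective sweep.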
89

Identification, Validation and Characterization of the Mutation on Chromosome 18p which is Responsible for Causing Myoclonus-Dystonia

Vanstone, Megan 02 November 2012
Myoclonus-Dystonia (MD) is an inherited, rare, autosomal dominant movement disorder characterized by quick, involuntary muscle jerking or twitching (myoclonus) and involuntary muscle contractions that cause twisting and pulling movements, resulting in abnormal postures (dystonia). The first MD locus was mapped to 7q21-q31 and called DYT11; this locus corresponds to the SGCE gene. Our group previously identified a second MD locus (DYT15) which maps to a 3.18 Mb region on 18p11. Two patients were chosen to undergo next-generation sequencing, which identified 2,292 shared novel variants within the critical region. Analysis of these variants revealed a 3 bp duplication in a transcript referred to as CD108131, which is believed to be a long non-coding RNA. Characterization of this transcript determined that it is 863 bp in size, it is ubiquitously expressed, with high expression in the cerebellum, and it accounts for ~3% of MD cases.
90

Inferring Genomic Sequences

Astrovskaya, Irina A 07 May 2011
Recent advances in next generation sequencing have provided unprecedented opportunities for high-throughput genomic research, inexpensively producing millions of genomic sequences in a single run. Analysis of massive volumes of data results in a more accurate picture of the genome complexity and requires adequate bioinformatics support. We explore computational challenges of applying next generation sequencing to particular applications, focusing on the problem of reconstructing viral quasispecies spectrum from pyrosequencing shotgun reads and problem of inferring informative single nucleotide polymorphisms (SNPs), statistically covering genetic variation of a genome region in genome-wide association studies. The genomic diversity of viral quasispecies is a subject of a great interest, particularly for chronic infections, since it can lead to resistance to existing therapies. High-throughput sequencing is a promising approach to characterizing viral diversity, but unfortunately standard assembly software cannot be used to simultaneously assemble and estimate the abundance of multiple closely related (but non-identical) quasispecies sequences. Here, we introduce a new Viral Spectrum Assembler (ViSpA) for inferring quasispecies spectrum and compare it with the state-of-the-art ShoRAH tool on both synthetic and real 454 pyrosequencing shotgun reads from HCV and HIV quasispecies. While ShoRAH has an advanced error correction algorithm, ViSpA is better at quasispecies assembling, producing more accurate reconstruction of a viral population. We also foresee ViSpA application to the analysis of high-throughput sequencing data from bacterial metagenomic samples and ecological samples of eukaryote populations. Due to the large data volume in genome-wide association studies, it is desirable to find a small subset of SNPs (tags) that covers the genetic variation of the entire set. 
We explore the trade-off between the number of tags used per non-tagged SNP and possible overfitting and propose an efficient 2LR-Tagging heuristic.
