221 |
Detection and characterization of gene-fusions in breast and ovarian cancer using high-throughput sequencing
Mittal, Vinay K. 21 September 2015
Gene-fusions are a prevalent class of genetic variants that are often employed as cancer biomarkers and therapeutic targets. In recent years, high-throughput sequencing of the cellular genome and transcriptome has emerged as a promising approach for the investigation of gene-fusions at the DNA and RNA level. However, the large volumes of sequencing data and the complexity of gene-fusion structures present unique computational challenges. This dissertation describes research that first addresses the bioinformatics challenges associated with the analysis of massive volumes of sequencing data by developing a bioinformatics pipeline and more applied, integrated computational workflows. Application of high-throughput sequencing and the proposed bioinformatics approaches to the breast and ovarian cancer study reveals unexpectedly complex structures of gene-fusions and their functional significance in the onset and progression of cancer. Integrative analysis of gene-fusions at the DNA and RNA level shows the key importance of the regulation of gene-fusions at the transcription level in cancer.
|
222 |
Expression and Splicing of Alzheimer’s Disease Risk Gene Phosphatidylinositol-Binding Clathrin Assembly Protein
Parikh, Ishita 01 January 2014
Recent genome-wide association studies (GWAS) have identified a series of single nucleotide polymorphisms (SNPs) that are associated with Alzheimer’s disease (AD). One of the SNPs, rs3851179 (G/A), is near the gene phosphatidylinositol-binding clathrin assembly protein (PICALM). To evaluate whether this SNP is associated with PICALM expression, we quantified PICALM mRNA in 56 brain cDNA samples. Using linear regression analysis, we analyzed PICALM expression relative to rs3851179, AD status, and cell type specific markers. An association was detected between rs3851179 and PICALM, microvessel mRNA, glial fibrillary acidic protein (GFAP) mRNA, and synaptophysin (SYN) mRNA. To gain clarity into other possible SNP mechanisms, we searched brain cDNA for PICALM splice variants. We identified several PICALM splice variants involving exons 13-19. To identify splice variants and estimate their relative abundance, we PCR-amplified across exons 13-20 in cDNA from six individuals, three rs3851179 GG individuals and three rs3851179 AA individuals. By sequencing the cloned isoforms, we found that PICALM lacking exon 13 (delta 13) is the most abundant isoform. Other isoforms detected included a deletion of exons 18-19. We targeted the latter part of the gene, exons 17-20, to investigate unequal allelic expression using next generation sequencing. Individuals heterozygous for rs76719109 (n = 35), located in exon 17, were used to study the abundance of the G and T alleles in cDNA and genomic DNA. When we analyzed the T:G allelic ratio, the variant lacking exons 18 and 19 showed unequal allelic expression (p-value < 0.001) in a subset of individuals. One individual was an outlier, showing overall unequal allelic expression, and may harbor a rare mutation capable of modifying PICALM expression. The PICALM intronic SNP rs588076 was associated with delta 18-19 isoform splicing (p-value < 0.001).
In conclusion, this study gained a greater insight into the role of AD genetics in PICALM expression and splicing.
|
223 |
BACTERIA IN BIOETHANOL FERMENTATIONS
Li, Qing 01 January 2014
To gain a better understanding of contaminating bacteria in the bioethanol industry, we profiled the bacterial community structure in corn-based bioethanol fermentations and evaluated its correlation to environmental variables. Twenty-three batches of corn-mash samples were collected from six bioethanol facilities. The V4 region of the collective bacterial 16S rRNA genes was analyzed by Illumina MiSeq sequencing to investigate the bacterial community structure. Non-metric multidimensional scaling (NMDS) ordination plots were constructed to visualize bacterial community structure groupings among different samples, as well as the effects of multiple environmental variables on community structure variation. Our results suggest that bacterial community structure is facility-specific, although there are two core bacterial phyla, Firmicutes and Proteobacteria. Feedstock, facility, and fermentation technology may explain the difference in community structure between different facilities. Lactic acid, the most important environmental variable influencing bacterial community structure grouping, could be utilized as an indicator of bacterial contamination. We also identified genes responsible for the multiple antibiotic-resistance phenotype of an Enterobacter cloacae strain isolated from a bioethanol fermentation facility. We performed PCR assays and revealed the presence of canonical genes encoding resistance to penicillin and erythromycin. However, a gene encoding resistance to virginiamycin was not detected.
|
224 |
Rule-based Models of Transcriptional Regulation and Complex Diseases: Applications and Development
Bornelöv, Susanne January 2014
As we gain increased understanding of genetic disorders and gene regulation, more focus has turned towards complex interactions. Combinations of genes or gene and environmental factors have been suggested to explain the missing heritability behind complex diseases. Furthermore, gene activation and splicing seem to be governed by a complex machinery of histone modification (HM), transcription factor (TF), and DNA sequence signals. This thesis aimed to apply and develop multivariate machine learning methods for use on such biological problems. Monte Carlo feature selection was combined with rule-based classification to identify interactions between HMs and to study the interplay of factors with importance for asthma and allergy. Firstly, publicly available ChIP-seq data (Paper I) for 38 HMs was studied. We trained a classifier for predicting exon inclusion levels based on the HM signals. We identified HMs important for splicing and illustrated that splicing could be predicted from the HM patterns. Next, we applied a similar methodology to data from two large birth cohorts describing asthma and allergy in children (Paper II). We identified genetic and environmental factors with importance for allergic diseases, which confirmed earlier results, and found candidate gene-gene and gene-environment interactions. In order to interpret and present the classifiers, we developed Ciruvis, a web-based tool for network visualization of classification rules (Paper III). We applied Ciruvis to classifiers trained on both simulated and real data and compared our tool to another methodology for interaction detection using classification. Finally, we continued the earlier study on epigenetics by analyzing HM and TF signals in genes with or without evidence of bidirectional transcription (Paper IV). We identified several HMs and TFs with different signals between unidirectional and bidirectional genes.
Among these, the CTCF TF was shown to have a well-positioned peak 60-80 bp upstream of the transcription start site in unidirectional genes.
|
225 |
The Microbial Associates and Putative Venoms of Seed Chalcid Wasps (Hymenoptera: Torymidae: Megastigmus)
Paulson, Amber Rose 20 December 2013
Conifer seed-infesting chalcids of the genus Megastigmus (Hymenoptera: Torymidae) are important forest pests. At least one species, M. spermotrophus Wachtl, has been shown to be able to manipulate the seed development of its host, Douglas-fir (Pseudotsuga menziesii), in remarkable ways, such as redirecting unfertilized ovules that would normally abort. The mechanism of host manipulation is currently unknown. Microbial associates and venoms are two potential mechanisms of host manipulation. Microbial associates are emerging as important players in insect-plant interactions. There is also evidence that venoms may be important in gall induction by phytophagous wasps. PCR and 16S rRNA pyrosequencing were used to characterize the microbial associates of Megastigmus, and transcriptomic sequencing was used to identify putative venoms that were highly expressed in female M. spermotrophus. The common inherited bacterial symbionts Wolbachia and Rickettsia were found to be prevalent among several populations of Megastigmus spp. screened using a targeted PCR approach. A member of the Betaproteobacteria, Ralstonia, was identified as the dominant microbial associate of M. spermotrophus using 16S rRNA pyrosequencing. The transcriptome of M. spermotrophus was assembled de novo and three putative venoms were identified as highly expressed in females. One of these putative venoms, aspartylglucosaminidase (AGA), appears to have originated through gene duplication within the Hymenoptera and has been identified as a major venom component of two divergent parasitoid wasps. AGA was identified as a promising candidate for further investigation as a potential mechanism of early host manipulation by M. spermotrophus.
|
226 |
Understanding and improving high-throughput sequencing data production and analysis
Kircher, Martin 27 July 2011
Advances in DNA sequencing revolutionized the field of genomics over the last 5 years. New sequencing instruments make it possible to rapidly generate large amounts of sequence data at substantially lower cost. These high-throughput sequencing technologies (e.g. Roche 454 FLX, Life Technology SOLiD, Dover Polonator, Helicos HeliScope and Illumina Genome Analyzer) make whole genome sequencing and resequencing, transcript sequencing as well as quantification of gene expression, DNA-protein interactions and DNA methylation feasible at an unanticipated scale.
In the field of evolutionary genomics, high-throughput sequencing permitted studies of whole genomes from ancient specimens of different hominin groups. Further, it allowed large-scale population genetics studies of present-day humans as well as different types of sequence-based comparative genomics studies in primates. Such comparisons of humans with closely related apes and hominins are important not only to better understand human origins and the biological background of what sets humans apart from other organisms, but also for understanding the molecular basis for diseases and disorders, particularly those that affect uniquely human traits, such as speech disorders, autism or schizophrenia. However, while the cost and time required to create comparative data sets have been greatly reduced, the error profiles and limitations of the new platforms differ significantly from those of previous approaches. This requires a specific experimental design in order to circumvent these issues, or to handle them during data analysis.
During the course of my PhD, I analyzed and improved current protocols and algorithms for next generation sequencing data, taking into account the specific characteristics of these new sequencing technologies. The presented approaches and algorithms were applied in different projects and are widely used within the department of Evolutionary Genetics at the Max Planck Institute of Evolutionary Anthropology. In this thesis, I will present selected analyses from the whole genome shotgun sequencing of two ancient hominins and the quantification of gene expression from short-sequence tags in five tissues from three primates.
|
227 |
Genomic Insights into Sexual Selection and the Evolution of Reproductive Genes in Teleost Fishes
Small, Clayton August 2012
Sexual selection has long been a working explanation for the elaboration of appreciable traits in plants and animals, but the idea that it is an equally potent agent of change at the level of individual molecules is relatively recent. Indications that genes associated with reproductive biology evolve especially rapidly planted this notion, but many details about the genomics of sex remain elusive. Numerous studies have characterized rapid sequence and expression divergence of sex-related molecules, but few if any have demonstrated convincingly that these patterns exist as a result of sexual selection. This dissertation describes several genome-scale studies related to reproduction and the sexes in teleost fishes, a group of animals underexploited in regard to this topic.
Using commercial microarrays I measured the extent of sexually dimorphic gene expression in the zebrafish, Danio rerio. Sex-biased patterns of gene expression in this species are similar to those described in other animals. A number of genes expressed at high levels in ovaries and testes relative to the body were identified as a product of the study, and these data may be useful for future studies of reproductive genes in Danio fishes.
In a second study, the recent advent of high throughput cDNA pyrosequencing was leveraged to characterize the relationships between tissue-, sex-, and species-specific expression patterns of genes and rates of sequence evolution in swordtail fishes (Xiphophorus). I discovered ample evidence for expression biases of all three types, and a generally positive but idiosyncratic relationship between the magnitude of expression bias and rates of protein-coding sequence evolution.
Pyrosequencing of cDNA was also used to explore the possibility that postcopulatory sexual selection drives the rapid evolution of male pregnancy genes, a novel class of reproductive molecules unique to syngnathid fishes (seahorses and pipefishes). Genes differentially expressed in the male brooding tissues as a function of pregnancy status evolve more rapidly at the amino acid level than genes exhibiting static expression. Brooding tissue genes expressed during male pregnancy have evolved especially rapidly in polyandrous lineages, a finding that supports the hypothesized relationship between postcopulatory sexual selection and the adaptive evolution of reproductive molecules.
|
228 |
Birds as a Model for Comparative Genomic Studies
Künstner, Axel January 2011
Comparative genomics provides a tool to investigate large biological datasets, i.e. genomic datasets. In my thesis I focused on inferring patterns of selection in coding and non-coding regions of avian genomes. Until recently, large comparative studies on selection were mainly restricted to model species with sequenced genomes. This limitation has been overcome with advances in sequencing technologies, and it is now possible to gather large genomic data sets for non-model species. Next-generation sequencing data was used to study patterns of nucleotide substitutions, and from this we inferred how selection has acted in the genomes of 10 non-model bird species. In general, we found evidence for a negative correlation between neutral substitution rate and chromosome size in birds. In a follow-up study, we investigated two closely related bird species to study expression levels in different tissues and patterns of selection. We found that between 2% and 18% of all genes were differentially expressed between the two species. We showed that non-coding regions adjacent to genes are under evolutionary constraint in birds, which suggests that non-coding DNA plays an important functional role in the genome. Regions downstream of genes (3’) showed a particularly high level of constraint. The level of constraint in these regions was not correlated with the length of untranslated regions, which suggests that other factors also play a role in sequence conservation. We compared the rate of nonsynonymous substitutions to the rate of synonymous substitutions in order to infer levels of selection in protein-coding sequences. Synonymous substitutions are often assumed to evolve neutrally. We studied synonymous substitutions by estimating constraint on 4-fold degenerate sites of avian genes and found significant evolutionary constraint on this category of sites (between 24% and 43%).
These results call for a reappraisal of synonymous substitution rates being used as neutral standards in molecular evolutionary analysis (e.g. the dN/dS ratio to infer positive selection). Finally, the problem of sequencing errors in next-generation sequencing data was investigated. We developed a program that removes erroneous bases from the reads. We showed that low coverage sequencing projects and large genome sequencing projects will especially gain from trimming erroneous reads.
|
229 |
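Phylogeny of Seirinae (Collembola, Entomobryoidea, Entomobryidae) in the Neotropical region based on complete mitochondrial genomes (original title: Filogenia de Seirinae (Collembola, Entomobryoidea, Entomobryidae) na região Neotropical baseada em genomas mitocondriais completos)
Godeiro, Nerivânia Nunes 27 September 2017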
Seirinae is one of the most diverse subfamilies of Collembola, and a considerable part of this diversity is accounted for by Seira Lubbock, which currently comprises approximately 220 recognized species. So far, no internal phylogeny has been proposed for Seirinae, which hinders the organization of comparative knowledge, the description of new taxa, and the understanding of evolutionary patterns within this taxon. The dorsal chaetotaxy is the main morphological component used to distinguish species and, although demonstrably diagnostic, it can vary intraspecifically. The main aim of this work is to clarify the phylogenetic relationships within the Neotropical Seirinae based on both molecular and morphological data, which may result in a better internal organization of the subfamily. To this end, 27 samples of different species of Entomobryidae and one of Paronellidae were sequenced. For the molecular analyses, total DNA from one individual per sample was extracted and quantified, and libraries were built and sequenced by next-generation sequencing on a HiSeq 2000. The complete mitochondrial genome (mtDNA) of each species was reconstructed through bioinformatic analyses using two methods: SOAPdenovo_Trans and MIRA/MITOBim. Two phylogenies were proposed: one containing only the genomes reconstructed in this study, and a complementary one that also included 11 Collembola mtDNA sequences available in online databases. The phylogenies were generated through Bayesian analyses using the thirteen protein-coding genes, which account for almost the entire mtDNA. The results corroborate the current proposal that Poduromorpha is the most basal order of Collembola; the order Symphypleona appears as the sister group of Entomobryomorpha, which shows a clear division into two superfamilies, Isotomoidea and Entomobryoidea; the placement of the genera Lepidocyrtoides Schött and Lepidosira Schött within Entomobryinae corroborates the most recently published phylogeny; the monophyly of Seirinae and its major internal groups was confirmed for the first time by molecular data with high nodal support; the recently described genus Tyrannoseira Bellini & Zeppelini was validated phylogenetically; Lepidocyrtinus Börner was elevated to genus status; and three species synonymies were proposed. Finally, some morphological characteristics of Seirinae were identified as diagnostic and as carrying phylogenetic signal, for instance the number of macrochaetae on the first abdominal segment.
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
|
230 |
Understanding early transcriptional events in Staphylococcus aureus infection
Lindemann, Claudia January 2017
Staphylococcus aureus remains an important pathogen that, owing to its capacity to develop antimicrobial resistance, poses an increasing threat to human health. Developing preventive means to decrease disease burden is a major aim. However, the development of an S. aureus vaccine, which would be one strategy to achieve such goals, has been complicated by limited understanding of the bacterium's pathogenic mechanisms. This work uses four approaches to address these limitations: Firstly, a reproducible RNA sequencing based method for the determination of gene transcription by S. aureus in vivo during mammalian infection. Secondly, examination of the impact of the bacterial transcription regulator 'Rsp' on the bacterium, which shows that mutations in this gene have profound functional and transcriptional impacts. Thirdly, by examining the in vivo transcription of multiple S. aureus strains during infection, proposing a 'core in vivo transcriptome' of induced genes under the conditions tested. Some of these genes are known to be involved in pathogenesis; others are not completely characterised and may represent suitable vaccine antigens. Finally, this work addresses limited understanding of S. aureus pathogenesis by defining transcriptional changes in vivo that are induced by an altered immune response in immunised hosts. Together, this body of work contributes to the understanding of S. aureus pathogenesis and provides candidate antigens for future vaccine development.
|