  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
101

Systematic Analysis of Suppressor Mutations in S. cerevisiae Strains with Deleted Genome Integrity Genes

Yamaguchi, Takafumi 11 December 2013
The effects of a mutation in one gene can occasionally be suppressed by mutation in another gene. Genetic suppression indicates functional relationships and provides clues about the mechanism and order of action in genetic pathways. Here I explored the existing yeast deletion collection to identify suppressor relationships. The collection was released in 2000 and it is known that some strains in the collection have acquired mutations. Whole genome sequencing of 48 yeast deletion strains corresponding to 26 genome integrity genes was performed. High-throughput sequencing revealed a broad mutational spectrum including point mutations, indels, and copy number variations. I identified and experimentally validated two new suppressor mutations (sgs1 mutations in both top3Δ and rmi1Δ strains) corresponding to gene pairs with previously known suppressor relationships. Thus, high-throughput sequencing and analysis of yeast deletion strains can identify suppressor mutations. The resulting genome sequences also provide a baseline for future laboratory evolution experiments.
102

Identification and Characterization of Pathogenic Mutations in Neurodevelopmental Disorders Discovered by Next-Generation Sequencing

Ruzzo, Elizabeth Kathryn January 2014
Neurodevelopmental disorders develop over time and are characterized by a wide variety of mental, behavioral, and physical phenotypes. The categorization of neurodevelopmental disorders encompasses a broad range of conditions including intellectual disability, autism spectrum disorder, attention deficit hyperactivity disorder, cerebral palsy, schizophrenia, bipolar disorder, and epilepsy, among others. Diagnostic classifications of neurodevelopmental disorders are complicated by comorbidities among these disorders, unidentified causal genes, and growing evidence of shared genetic risk factors.
We sought to identify the genetic underpinnings of a variety of neurodevelopmental disorders, with a particular emphasis on the epilepsies, by employing next-generation sequencing to thoroughly interrogate genetic variation in the human genome/exome. First, we investigated four families presenting with a seemingly identical and previously undescribed neurodevelopmental disorder characterized by congenital microcephaly, intellectual disability, progressive cerebral atrophy, and intractable seizures. These families all exhibited an apparent autosomal recessive pattern of inheritance. Second, we investigated a heterogeneous cohort of ~60 undiagnosed patients, the majority of whom suffered from severe neurodevelopmental disorders with a suspected genetic etiology. Third, we investigated 264 patients with epileptic encephalopathies (severe childhood epilepsy disorders), looking specifically at infantile spasms and Lennox-Gastaut syndrome. Finally, we investigated ~40 large multiplex epilepsy families with complex phenotypic constellations and unclear modes of inheritance. The studied neurodevelopmental disorders exhibited a range of genetic complexity, from clear Mendelian disorders to common complex disorders, resulting in varying degrees of success in the identification of clearly causal genetic variants.
In the first project, we successfully identified the disease-causing gene. We show that recessive mutations in ASNS (encoding asparagine synthetase) are responsible for this previously undescribed neurodevelopmental disorder. We also characterized the causal mutations in vitro and studied Asns-deficient mice that mimicked aspects of the patient phenotype. This work describes ASNS deficiency as a novel neurodevelopmental disorder, identifies three distinct causal mutations in the ASNS gene, and indicates that asparagine synthesis is essential for the proper development and function of the brain.
In the second project, we exome-sequenced 62 undiagnosed patients and their unaffected biological parents (trios). By analyzing all identified variants that were annotated as putatively functional and observed as a novel genotype in the probands (not observed in the unaffected parents or controls), we obtained a genetic diagnosis for 32% (20/62) of these patients. Additionally, we identify strong candidate variants in 31% (13/42) of the undiagnosed cases. We also present additional analysis methods for moving beyond traditional screens, e.g., considering only securely implicated genes, or subjecting qualifying variants from any gene to two unique analysis approaches. This work adds to the growing evidence for the utility of diagnostic exome sequencing, increases patient numbers for rare neurodevelopmental disorders (enabling more detailed analyses of the phenotypic spectrum), and proposes novel analysis approaches that will likely become beneficial as the number of sequenced undiagnosed patients grows.
In the third project, we again employ a trio-based exome sequencing design to investigate the role of de novo mutations in two classical forms of epileptic encephalopathy. We find a significant excess of de novo mutations in the ~4,000 genes that are the most intolerant to functional genetic variation in the human population (P = 2.9 × 10⁻³, likelihood analysis). We provide clear statistical evidence for two novel genes associated with epileptic encephalopathy: GABRB3 and ALG13. Together with the 15 well-established epileptic encephalopathy genes, we statistically confirm the association of an additional ten putative epileptic encephalopathy genes. We show that only ~12% of epileptic encephalopathy patients in our cohort are explained by de novo mutations in one of these 24 genes, highlighting the extreme locus heterogeneity of the epileptic encephalopathies.
Finally, we investigated multiplex epilepsy families to uncover novel epilepsy susceptibility factors. Candidate variants emerging from sequencing within discovery families were further assessed by cosegregation testing, variant association testing in a case-control cohort, and gene-based resequencing in a cohort of additional multiplex epilepsy families. Despite employing multiple approaches, we did not identify any clear genetic associations with epilepsy. This work has, however, identified a set of candidates that may include real risk factors for epilepsy; the most promising of these is the MYCBP2 gene. This work emphasizes the extremely high locus and allelic heterogeneity of the epilepsies and demonstrates that very large sample sizes are needed to uncover novel genetic risk factors.
Collectively, this body of work has securely implicated three novel neurodevelopmental disease genes that inform the underlying pathology of these disorders. Furthermore, in the final three studies, this work has highlighted additional candidate variants and genes that may ultimately be validated as disease-causing as sample sizes increase. / Dissertation
103

Towards understanding mastrevirus dynamics and the use of viral metagenomic approaches to identify novel gemini-like circular DNA viruses

Kraberger, Simona January 2015
Mastreviruses (family Geminiviridae) are plant-infecting viruses with circular single-stranded (ss) DNA genomes (~2.7 kb). The genus Mastrevirus comprises thirty-two species, which are transmitted by leafhoppers belonging to the genus Cicadulina. Mastreviruses are widely distributed and have been found in the Middle East, Europe, Asia, Australia, Africa and surrounding islands. Only one species, dragonfly-associated mastrevirus, has so far been identified in the Americas, isolated from a dragonfly in Puerto Rico. Species can be grouped by the host(s) they infect: those which infect monocotyledonous (monocot) plants and those which infect dicotyledonous (dicot) plants. In recent years many new mastrevirus species have been discovered. Several of these new discoveries can largely be attributed to the development of new molecular tools. The current state of sequencing platforms has made it easier and more affordable to characterise mastreviruses at a genome level, thus allowing scientists to delve deeper into understanding the dynamics of mastreviruses. A few mastrevirus species have been identified as important agricultural pathogens and as a result have been the focus of much of the mastrevirus research. Maize streak virus, strain A (MSV-A) has been the most extensively studied due to the devastating impact it has on maize production in Africa. Studies have shown that MSV-A likely emerged as a pathogen of maize less than 250 years following the introduction of maize in Africa by early European settlers. There is compelling evidence to suggest that MSV-A is likely the result of recombination events between wild grass adapted MSV strains. It is therefore equally important to monitor viruses infecting non-cultivated plants in order to gain a greater understanding of the epidemiological dynamics of mastreviruses, which in turn is essential for implementing disease management strategies.
The objective of the research undertaken as part of this PhD thesis was to investigate global mastrevirus dynamics focusing on diversity, host and geographic ranges, mechanisms of evolution, phylogeography and possible origins of these viruses. In addition, a viral metagenomic approach was used to identify novel mastreviruses or mastrevirus-like viruses present in New Zealand. The dynamics of the monocot-infecting mastreviruses are investigated in Chapters Two and Three. The work described in these two chapters focuses mainly on mastreviruses which infect non-cultivated grasses in Africa and Australia; a total of 161 full mastrevirus genomes were recovered across the two studies. Chapter Two reveals a high level of mastrevirus diversity present in Australia with the discovery of four new species and several new strains of previously characterised species. An extensive sampling effort in Africa undertaken in Chapter Three reveals a broader host range and geographic distribution of the African monocot-infecting mastreviruses than previously documented. Mosaic patterns of recombination are evident among both the Australian and African monocot-infecting mastreviruses. In Chapters Four, Five and Six a comprehensive investigation was undertaken focusing on the dicot-infecting mastreviruses. The study undertaken in Chapter Four entailed the recovery of 49 full mastrevirus genomes from Australia, the Middle East, Africa, Turkey and the Indian Subcontinent to investigate the diversity of dicot-infecting mastreviruses in a global context. Analyses revealed a high degree of CpCDV strain diversity and extended the known geographic range of CpCDV. For the first time phylogeographic analysis was able to investigate the origins of the dicot-infecting mastreviruses.
Results revealed that the most recent common ancestor (MRCA) of these viruses likely originated closer to Australia than anywhere else that dicot-infecting mastreviruses have been sampled, and illuminated a supported series of historical movements following the emergence of the MRCA. In Chapter Five two novel Australian-like mastreviruses were isolated from chickpea material from Pakistan. A comprehensive analysis of CpCDV isolates in the major pulse growing regions of Sudan in Chapter Six reveals that this region harbours a high degree of strain diversity. Complex patterns of intra-species recombination indicate these strains are evidently circulating in these regions and infecting the same hosts, driving the emergence of new CpCDV strains. Collectively the results discussed in Chapters Two through Six extended the current knowledge of mastrevirus diversity. The natural host range of many mastreviruses has proven to be more extensive than previously documented, with many species having overlapping host ranges; these hosts could therefore be acting as ‘mixing vessels’ enabling inter-species recombination. Patterns of recombination and selection were observed in both the monocot-infecting and the dicot-infecting mastreviruses, further elucidating the mechanisms these viruses employ to evolve rapidly. Extensive sampling in a wide range of geographic regions provides insights into the true geographic range of species such as MSV and CpCDV. Given that mastreviruses have been able to move globally and Australia has been identified as a major mastrevirus diversity hotspot, it is conceivable that mastreviruses are also present in New Zealand. In Chapters Seven and Eight this is explored by using a viral metagenomic approach to investigate the ssDNA viral populations associated with wild grasses and sewage material in New Zealand.
Although no mastreviruses were recovered, this endeavour resulted in the discovery of more than 50 novel circular Rep-encoding ssDNA (CRESS DNA) viruses associated with non-cultivated grasses and treated sewage material, many of which are similar to mastreviruses and other geminiviruses. These discoveries expand current knowledge on the diversity of ssDNA viruses present in New Zealand and further highlight this viral metagenomic approach as an effective method for ssDNA virus discovery. Overall the results discussed in this thesis provide insights into mastrevirus diversity and dynamics as well as revealing a wealth of novel CRESS DNA viruses, some of which share similarities to geminiviruses.
104

Genomic variations in the EGFR pathway in relation to skin toxicity of EGFR inhibitors analyzed by deep sequencing

Hasheminasab, Sayedmohammad 22 April 2015
No description available.
105

Bioinformatics challenges of high-throughput SNP discovery and utilization in non-model organisms

2014 October 1900
A current trend in biological science is the increased use of computational tools for both the production and analysis of experimental data. This is especially true in the field of genomics, where advancements in DNA sequencing technology have dramatically decreased the time and cost associated with DNA sequencing resulting in increased pressure on the time required to prepare and analyze data generated during these experiments. As a result, the role of computational science in such biological research is increasing. This thesis seeks to address several major questions with respect to the development and application of single nucleotide polymorphism (SNP) resources in non-model organisms. Traditional SNP discovery using polymerase chain reaction (PCR) amplification and low-throughput DNA sequencing is a time consuming and laborious process, which is often limited by the time required to design intron-spanning PCR primers. While next-generation DNA sequencing (NGS) has largely supplanted low-throughput sequencing for SNP discovery applications, the PCR based SNP discovery method remains in use for cost effective, targeted SNP discovery. This thesis seeks to develop an automated method for intron-spanning PCR design which would remove a significant bottleneck in this process. This work develops algorithms for combining SNP data from multiple individuals, independent of the DNA sequencing platforms, for the purpose of developing SNP genotyping arrays. Additionally, tools for the filtering and selection of SNPs will be developed, providing start to finish support for the development of SNP genotyping arrays in complex polyploids using NGS. The result of this work includes two automated pipelines for the design of intron-spanning PCR primers, one which designs a single primer pair per target and another that designs multiple primer pairs per target. 
These automated pipelines are shown to reduce the time required to design primers from one hour per primer pair using the semi-automated method to 10 minutes per 100 primer pairs while maintaining a very high efficacy. Efficacy is tested by comparing the number of successful PCR amplifications of the semi-automated method with that of the automated pipelines. Using the chi-squared test, the semi-automated and automated approaches are determined not to differ in efficacy. Three algorithms for combining SNP output from NGS data from multiple individuals are developed and evaluated for their time and space complexities. These algorithms were found to be computationally efficient, requiring time and space linear in the size of the input. These algorithms are then implemented in the Perl language and their time and memory performance profiled using experimental data. Profiling results are evaluated by applying linear models, which allow for predictions of resource requirements for various input sizes. Additional tools for the filtering of SNPs and selection of SNPs for a SNP array are developed and applied to the creation of two SNP arrays in the polyploid crop Brassica napus. These arrays, when compared to arrays in similar species, show higher numbers of polymorphic markers and better 3-cluster genotype separation, a viable method for determining the efficacy of design in complex genomes.
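The abstract does not describe its three merging algorithms, so the following is only a minimal sketch of one way SNP calls from several individuals could be combined in time and space linear in the input. It is written in Python rather than the Perl used in the thesis, and the function names, the `(contig, position, allele)` tuple layout, and the sample labels are all illustrative assumptions.

```python
from collections import defaultdict


def merge_snp_calls(calls_by_individual):
    """Combine per-individual SNP calls into one table keyed by site.

    calls_by_individual maps a (hypothetical) sample name to a list of
    (contig, position, allele) tuples. Each call is visited once and stored
    once, so time and space grow linearly with the total number of calls.
    """
    merged = defaultdict(dict)
    for individual, calls in calls_by_individual.items():
        for contig, position, allele in calls:
            merged[(contig, position)][individual] = allele
    return dict(merged)


def polymorphic_sites(merged, min_individuals=2):
    """Return sites called in enough individuals that carry more than one allele.

    The final sort is only for stable output and costs O(n log n); drop it
    if strict linearity matters.
    """
    return sorted(
        site
        for site, alleles in merged.items()
        if len(alleles) >= min_individuals and len(set(alleles.values())) > 1
    )
```

A site that appears in only one individual, or where every individual carries the same allele, is filtered out by `polymorphic_sites`, which mirrors the kind of filtering step an array design would need before marker selection.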
106

Sviluppo ed applicazione di pipelines bioinformatiche per l'analisi di dati NGS / Development and Application of Bioinformatics Pipelines for Next Generation Sequencing Data Analysis

Lamontanara, Antonella 28 January 2015
Advances in sequencing technologies have led to sequencing platforms able to produce gigabases of sequencing data in a single run. These technologies, commonly referred to as Next Generation Sequencing or NGS, produce millions of short sequences called “reads”, generating large and complex datasets that pose several challenges for bioinformatics. The analysis of large omics datasets requires the development of bioinformatics pipelines, that is, the organization of bioinformatics tools into computational chains in which the output of one analysis is the input of the subsequent analysis. Scripting work is needed to chain together a group of existing software tools. This thesis deals with the methodological aspects of the analysis of NGS data produced with Illumina technology. In this thesis three bioinformatics pipelines were developed and applied to the following case studies: 1) a global transcriptome profiling by RNA-seq of Olea europaea during cold acclimation, aimed at unravelling the molecular mechanisms of cold acclimation in this species; 2) an RNA-seq SNP profiling in the transcriptome of two cattle breeds, aimed at producing an extensive catalogue of SNPs; 3) the sequencing, assembly and annotation of the genome of a Lactobacillus plantarum strain showing potential probiotic properties.
107

MR-CUDASW - GPU accelerated Smith-Waterman algorithm for medium-length (meta)genomic data

2014 November 1900
The idea of using a graphics processing unit (GPU) for more than graphics output has been around for quite some time in scientific communities. However, it is only recently that its benefits for a range of compute-intensive bioinformatics and life sciences tasks have been recognized. This thesis investigates the possibility of improving the performance of the overlap determination stage of an Overlap Layout Consensus (OLC)-based assembler by using a GPU-based implementation of the Smith-Waterman algorithm. In this thesis an existing GPU-accelerated sequence alignment algorithm is adapted and expanded to reduce its completion time. A number of improvements and changes are made to the original software. Workload distribution, query profile construction, and thread scheduling techniques implemented by the original program are replaced by custom methods specifically designed to handle medium-length reads. Accordingly, this algorithm is the first highly parallel solution that has been specifically optimized to process medium-length nucleotide reads (DNA/RNA) from modern sequencing machines (e.g. Ion Torrent). Results show that the software reaches up to 82 GCUPS (Giga Cell Updates Per Second) on a single-GPU graphics card running on commodity desktop hardware. As a result it is the fastest GPU-based implementation of the Smith-Waterman algorithm tailored for processing medium-length nucleotide reads. Despite being designed for performing the Smith-Waterman algorithm on medium-length nucleotide sequences, this program also presents great potential for improving heterogeneous computing with CUDA-enabled GPUs in general and is expected to make contributions to other research problems that require sensitive pairwise alignment to be applied to a large number of reads.
Our results show that it is possible to improve the performance of bioinformatics algorithms by taking full advantage of the compute resources of the underlying commodity hardware and further, these results are especially encouraging since GPU performance grows faster than multi-core CPUs.
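For readers unfamiliar with the algorithm and the GCUPS metric mentioned in this abstract, here is a minimal CPU-only sketch in Python. It is not the thesis's CUDA implementation: the scoring values (match 2, mismatch -1) and the linear gap penalty (-2) are illustrative assumptions, and the function names are hypothetical.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between two sequences (linear gap penalty).

    Only the previous row of the dynamic-programming matrix is kept, so
    memory is O(len(target)) while time is O(len(query) * len(target)).
    """
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def gcups(query_len, target_len, seconds):
    """Giga cell updates per second: matrix cells filled per second / 1e9."""
    return query_len * target_len / seconds / 1e9
```

Every cell of the matrix depends only on its left, upper, and upper-left neighbours, which is why cells along an anti-diagonal, and independent read pairs, can be computed in parallel on a GPU. The GCUPS figure simply counts how many such cells are filled per second.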
108

Proximity Ligation Assays for Disease Biomarkers Analysis

Nong, Rachel Yuan January 2011
One of the pressing needs in the field of disease biomarker discovery is new technologies that could allow high performance protein analysis in different types of clinical material, such as blood and solid tissues. This thesis includes four approaches that address important limitations of current technologies, thus enabling highly sensitive, specific and parallel protein measurements. Paper I describes a method for sensitive singleplex protein detection in complex biological samples, namely solid phase proximity ligation assay (SP-PLA). SP-PLA exhibited improved sensitivity compared to conventional sandwich immunoassays. We applied SP-PLA to validate the potential of GDF-15 as a biomarker for cardiovascular disease.   Paper II describes ProteinSeq, a multiplexed immunoassay based on the principle of SP-PLA, for parallel detection of 36 proteins using next-generation sequencing as readout. ProteinSeq exhibited improved sensitivity compared to multiplexed sandwich immunoassays, and the potential to achieve even higher levels of multiplexing while preserving a high sensitivity and specificity. We applied ProteinSeq to analyze 36 proteins, including one internal control, in 5 μl of plasma samples in a cohort of patients with cardiovascular disease and healthy controls. Paper III describes PLA-DTM, a strategy for recording all possible interactions between sets of proteins in clinical samples. Individual proteins and their interactions are first encoded to dual barcoded DNA by PLA, and the barcodes are interrogated by a method named dual tag microarray (DTM). We applied the method for studying interactions among protein members of the NFκB signaling pathway. Paper IV describes a novel probing strategy for analyzing individual biomolecules in solution or in situ. The technique employs a new class of probes for unfolding proximity ligation assays - uPLA probes. 
The probes are designed so that each probe set is sufficient in forming and replicating circular DNA reporter, without interactions among themselves when incubated with the sample. The uPLA probing strategy provides ease in the design of multiple probe sets in parallelized assays while enhancing the specificity of detection. We used the uPLA probes to detect various targets, including synthetic DNA and cancer-related transcripts in situ.
109

The detection of mycoviral sequences in grapevine using next-generation sequencing

Espach, Yolandi 03 1900
Thesis (MSc)--Stellenbosch University, 2013. / ENGLISH ABSTRACT: Metagenomic studies that make use of next-generation sequencing (NGS) generate large amounts of sequence data, representing the genomes of multiple organisms of which no prior knowledge is necessarily available. In this study, a metagenomic NGS approach was used to detect multiple novel mycoviral sequences in grapevine phloem tissue. Individual sequencing libraries of double-stranded RNA (dsRNA) from two grapevine leafroll diseased (GLD) and three shiraz diseased (SD) vines were sequenced using an Illumina HiScanSQ instrument. Over 3.2 million reads were generated from each of the samples and these reads were trimmed and filtered for quality before being de novo assembled into longer contigs. The assembled contigs were subjected to BLAST (Basic Local Alignment Search Tool) analyses against the NCBI (National Center for Biotechnology Information) database and classified according to database sequences with which they had the highest identity. Twenty-six putative mycovirus species were identified, belonging to the families Chrysoviridae, Endornaviridae, Narnaviridae, Partitiviridae and Totiviridae. Two of the identified mycoviruses, namely grapevine-associated chrysovirus (GaCV) and grapevine-associated mycovirus 1 (GaMV-1), have previously been identified in grapevine, while the rest appeared to be novel mycoviruses not present in the NCBI database. Primers were designed from the de novo assembled mycoviral sequences and used to screen the grapevine dsRNA used for sequencing as well as endophytic fungi isolated from the five sample vines. Only two mycoviruses, related to sclerotinia sclerotiorum partitivirus S and chalara elegans endornavirus 1 (CeEV-1), could be detected in grapevine dsRNA and in fungus isolates.
In order to validate the presence of mycoviruses in grapevine phloem tissue, two additional sequencing runs, using an Illumina HiScanSQ and an Applied Biosystems (ABI) SOLiD 5500xl instrument respectively, were performed. These runs generated more and higher quality sequence data than the first sequencing run. Twenty-two of the putative mycoviral sequences initially detected were detected in the subsequent sequence datasets, as well as an additional 29 species not identified in the first HiScanSQ sequence datasets. The samples harboured diverse mycovirus populations, with as many as 19 putative species identified in a single vine. This indicates that the complete virome of diseased grapevines will include a high number of mycoviruses. Additionally, the complete genome of a novel endornavirus, for which we propose the name grapevine endophyte endornavirus (GEEV), was assembled from one of the second HiScanSQ sequence datasets. This is the first complete genome of a mycovirus detected in grapevine. Grapevine endophyte endornavirus has the highest sequence similarity to CeEV-1 and is the same virus that was previously detected in fungus isolates using the mycovirus primers. The virus was detected in two fungus isolates, namely Stemphylium sp. and Aureobasidium pullulans, which is of interest since mycoviruses are not known to be naturally associated with two distinctly different fungus genera. Mycoviral sequence data generated in this study can be used to further investigate the diversity and the effect of mycoviruses in grapevine. / National Research Foundation (NRF) / Stellenbosch University
110

Computational methods for RNA integrative biology

Selega, Alina January 2018
Ribonucleic acid (RNA) is an essential molecule, which carries out a wide variety of functions within the cell, from its crucial involvement in protein synthesis to catalysing biochemical reactions and regulating gene expression. This diverse functional repertoire is owed to the complex structures that RNA can adopt and to its flexibility as an interacting molecule. It has become possible to experimentally measure these two crucial aspects of RNA's regulatory role with technological advancements such as next-generation sequencing (NGS). NGS methods can rapidly obtain the nucleotide sequence of many molecules in parallel. Designing experiments in which only the desired parts of the molecule (or specific parts of the transcriptome) are sequenced makes it possible to study various aspects of RNA biology. Analysis of NGS data is infeasible without computational methods. One such experimental method is RNA structure probing, which aims to infer RNA structure from sequencing chemically altered transcripts. RNA structure probing data is inherently noisy, affected both by technological biases and the stochasticity of the underlying process. Most existing methods do not adequately address the issue of noise, resorting to heuristics and limiting the informativeness of their output. In this thesis, a statistical pipeline was developed for modelling RNA structure probing data, which explicitly captures biological variability, provides automated bias-correcting strategies, and generates a probabilistic output based on experimental measurements. The output of our method agrees with known RNA structures, can be used to constrain structure prediction algorithms, and remains robust to reduced sequence coverage, thereby increasing sensitivity of the technology. Another recent experimental innovation maps RNA-protein interactions at very high temporal resolution, making it possible to study rapid binding events happening on a minute time scale.
In this thesis, a non-parametric algorithm was developed for identifying significant changes in RNA-protein binding time-series between different conditions. The method was applied to novel yeast RNA-protein binding time-course data to study the role of RNA degradation in stress response. It revealed pervasive changes in the binding to the transcriptome of the yeast transcription termination factor Nab3 and the cytoplasmic exoribonuclease Xrn1 under nutrient stress. This challenged the common assumption of viewing transcriptional changes as the major driver of changes in RNA expression during stress and highlighted the importance of degradation. These findings inspired a dynamical model for RNA expression, where transcription and degradation rates are modelled using RNA-protein binding time-series data.
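The abstract does not specify the non-parametric algorithm it developed, so the sketch below shows only a generic permutation test in Python, as one illustration of how a change between two sets of replicate time series can be assessed without distributional assumptions. The test statistic (summed absolute difference of per-time-point means), the function name, and the default parameters are all assumptions made for this example.

```python
import random


def permutation_pvalue(control, treated, n_permutations=2000, seed=0):
    """Permutation p-value for a difference between two groups of time series.

    control and treated are lists of replicate series (equal-length lists of
    numbers). Replicate labels are shuffled to build the null distribution.
    """
    def statistic(group_a, group_b):
        total = 0.0
        for t in range(len(group_a[0])):
            mean_a = sum(series[t] for series in group_a) / len(group_a)
            mean_b = sum(series[t] for series in group_b) / len(group_b)
            total += abs(mean_a - mean_b)
        return total

    rng = random.Random(seed)  # fixed seed so results are reproducible
    observed = statistic(control, treated)
    pooled = list(control) + list(treated)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled_a = pooled[:len(control)]
        shuffled_b = pooled[len(control):]
        if statistic(shuffled_a, shuffled_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)
```

With only a few replicates per condition the smallest attainable p-value is bounded by the number of distinct label assignments, so a test like this is informative mainly when applied across many transcripts with an appropriate multiple-testing correction.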
