  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
391

Assembling improved gene annotations in Clostridium acetobutylicum with RNA sequencing

Ralston, Matthew T. 25 March 2015
<p> The <i>C. acetobutylicum</i> genome annotation has been markedly improved by integrating bioinformatic predictions with RNA sequencing (RNA-seq) data. Samples were acquired under butanol, butyrate, and unstressed treatments across various growth stages to sample the transcriptome from a range of physiologically relevant conditions. Analysis of an initial assembly revealed errors due to technical and biological background signals, challenges with few solutions. Hurdles for RNA-seq transcriptome mapping research include optimizing library complexity and sequencing depth, yet most studies in bacteria report low depth and ignore the effect of ribosomal RNA abundance and other sources on the effective sequencing depth. </p><p> In this work, workflows were established to address type I and II errors associated with these challenges. An integrative analysis method was developed to combine motif predictions, single-nucleotide resolution sequencing depth, and library complexity to resolve these errors during assembly curation. This contextualization minimized false positive error and determined gene boundaries, in some cases, to the exact basepair of prior studies. Curation of the pSOL1 megaplasmid reconciled transcriptome assembly statistics with findings from <i>E. coli</i>. </p><p> The resulting annotation can be readily explored and downloaded through a customized genome browser, enabling future genomic and transcriptomic research in this organism. This work demonstrates the first strand-specific transcriptome assembly in a <i>Clostridium</i> organism. This method can improve the precision of transcript boundary estimates in bacterial transcriptome mapping studies.</p>
392

Copy Number Analysis of Cancer

Mayrhofer, Markus January 2015
By accurately describing cancer genomes, we may link genomic mutations to phenotypic effects and eventually treat cancer patients based on the molecular cause of their disease, rather than generalizing treatment based on cell morphology or tissue of origin. Alteration of DNA copy number is a driving mutational process in the formation and progression of cancer. Deletions and amplifications of specific chromosomal regions are important for cancer diagnosis and prognosis, and copy number analysis has become standard practice for many clinicians and researchers. In this thesis we describe the development of two computational methods, TAPS and Patchwork, for analysis of genome-wide absolute allele-specific copy number per cell in tumour samples. TAPS is used with SNP microarray data and Patchwork with whole genome sequencing data. Both are suitable for unknown average ploidy of the tumour cells, are robust to admixture of genetically normal cells, and may be used to detect genetic heterogeneity in the tumour cell population. We also present two studies where TAPS was used to find copy number alterations associated with risk of recurrence after surgery, in ovarian cancer and colon cancer. We discuss the potential of such prognostic markers and the use of allele-specific copy number analysis in research and diagnostics.
393

Monovalent cation effects on histidine kinase CheA activity

Balboa, Marie Sophie 01 July 2015
<p> CheA is a multi-domain histidine kinase believed to be a central component of all bacterial and archaeal chemosensory arrays. The best studied CheA proteins are those of <i>Escherichia coli</i>, <i>Salmonella typhimurium</i>, and <i>Thermotoga maritima</i>. The P4 catalytic domain of CheA is of particular interest due to its role in ATP binding and auto-phosphorylation of the substrate His residue on the P1 substrate domain, a process which involves Mg<sup>2+</sup> in the P4 domain ATP-binding pocket. The CheA auto-phosphorylation reaction has also long been known to be sensitive to the monovalent cation composition of the reaction buffer. Recent studies in GHKL superfamily members MutL (from <i>E. coli</i>) and BCK (from <i>Rattus norvegicus</i>) demonstrated improved activity in the presence of monovalent K<sup>+</sup>. Here experimental and computational analyses assess the effect of monovalent cations on the activity of <i>Salmonella typhimurium</i> CheA, and generate a model for a putative monovalent cation binding pocket. The findings show that CheA prefers the physiological K<sup>+</sup> ion over the other monovalent cations tested, indicating the site is specific and tuned to possess the appropriate K<sup>+</sup> affinity for the cytoplasmic environment.</p>
394

Purification and characterisation of Plasmodium falciparum hypoxanthine phosphoribosyltransferase

Murungi, Edwin Kimathi. January 2007
Malaria remains the most important parasitic disease worldwide. It is estimated that over 500 million infections and more than 2.7 million deaths arising from malaria occur each year. Most (90%) of the infections occur in Africa, with the most affected groups being children of less than five years of age and women. This dire situation is exacerbated by the emergence of drug-resistant strains of Plasmodium falciparum. The work reported in this thesis focuses on improving the purification of PfHPRT by investigating the characteristics of anion exchange DE-52 chromatography (the first stage of purification), developing an HPLC gel filtration method for examining the quaternary structure of the protein and possible end stage purification, and initial crystallization trials. A homology model of the open, unliganded PfHPRT is constructed using the atomic structures of human, T. cruzi and S. typhimurium HPRT as templates. / Magister Scientiae - MSc
395

A web semantic for SBML merge

Thavappiragasam, Mathialakan 05 November 2014
<p> The manipulation of XML based relational representations of biological systems (BioML for Bioscience Markup Language) is a big challenge in systems biology. The needs of biologists, like translational study of biological systems, cause their challenges to become greater due to the material received in next generation sequencing. Among these BioMLs, SBML is the de facto standard file format for the storage and exchange of quantitative computational models in systems biology, supported by more than 257 software packages to date. The SBML standard is used by several biological systems modeling tools and several databases for representation and knowledge sharing. Several sub systems are integrated in order to construct a complex bio system. The issue of combining biological sub-systems by merging SBML files has been addressed in several algorithms and tools. But it remains impossible to build an automatic merge system that implements reusability, flexibility, scalability and sharability. The technique existing algorithms use is name based component comparisons. This does not allow integration into a Workflow Management System (WMS) to build pipelines and also does not include the mapping of quantitative data needed for a good analysis of the biological system. In this work, we present a deterministic merging algorithm that is consumable in a given WMS engine, and designed using a novel biological model similarity algorithm. This model merging system is designed with integration of four sub modules: SBMLChecker, SBMLAnot, SBMLCompare, and SBMLMerge, for model quality checking, annotation, comparison, and merging respectively. The tools are integrated into the BioExtract server leveraging iPlant collaborative resources to support users by allowing them to process large models and design workflows. These tools are also embedded into a user-friendly online version SW4SBMLm.</p>
396

Evaluating Transcriptional Regulation Through Multiple Lenses

Grubisic, Ivan 19 November 2014
<p> Scientific research, especially within the space of translational research, is becoming increasingly multidisciplinary. With the development of each new method there is not only a need for a broad fundamental understanding of all the sciences and mathematics, but also an acute awareness of how errors propagate across methods, the limitations of the methods and what contextual frameworks need to be used for the interpretation. The ability to understand transcriptional mechanisms and the effect that subtle changes in equilibrium may have on cell fate decisions has been greatly advanced by next generation sequencing and subsequent tools that have been developed. Bioinformatic techniques can serve multiple roles. They fundamentally provide a global picture of what is happening within an experimental condition which can then be used to either confirm individual experimental findings as globally relevant, or to discover new insights to inform the next iteration of experiments. Many of the experiments are done in <i>in vitro</i> conditions and therefore I have also focused energy on trying to understand how the mechanical inputs, largely not representative of what is occurring <i>in vivo,</i> from these methods affect transcriptional regulation. Much of this research requires the switching of frameworks to understand how results from disparate data sources can be correlated. I then applied a similar thought process to the development of <i>Lens.</i> Without an effective means of communicating research findings in an elegant and streamlined manner, we are slowing down the ability for researchers to learn new frameworks to efficiently approach the next research questions. In addition to better methods of communicating, we also need more modular and simplified tools that can be applied to various experimental systems to increase the speed and efficiency of translational research.</p>
397

Taxonomic assignment of gene sequences using hidden Markov models

Huang, Huanhua 16 October 2014
<p> Our ability to study communities of microorganisms has been vastly improved by the development of high-throughput DNA sequencing technologies. These technologies, however, can only sequence short fragments of organisms' genomes at a time, which introduces many challenges in translating sequencing results into biological insight. The field of bioinformatics has arisen in part to address these problems. </p><p> One bioinformatics problem is assigning a genetic sequence to a source organism. It is now common to use high-throughput, short-read sequencing technologies, such as the Illumina MiSeq, to sequence the 16S rRNA gene from a community of microorganisms. Researchers use this information to generate a profile of the different microbial organisms (i.e., the taxonomic composition) present in an environmental sample. There are a number of approaches for assigning taxonomy to genetic sequences, but all suffer from problems with accuracy. The methods that have been most widely used are pairwise alignment methods, like BLAST, UCLUST, and RTAX, and probability-based methods, such as RDP and MOTHUR. These methods can classify microbial sequences with high accuracy when sequences are long (e.g., a thousand bases); however, accuracy decreases as sequences get shorter. Current high-throughput sequencing technologies generate sequences between about 150 and 500 bases in length. </p><p> In my thesis I have developed new software for assigning taxonomy to short DNA sequences using profile Hidden Markov Models (HMMs). HMMs have been applied in related areas, such as assigning biological functions to protein sequences, and I hypothesize that they might be useful for achieving high accuracy taxonomic assignments from 16S rRNA gene sequences. My method builds models of 16S rRNA sequences for different taxonomic groups (kingdom, phylum, class, order, family, genus and species) using the Greengenes 16S rRNA database. 
Given a sequence with unknown taxonomic origin, my method searches each kingdom model to determine the most likely kingdom. It then searches all of the phyla within the highest scoring kingdom to determine the most likely phylum. This iterative process continues until the sequence cannot be assigned at a taxonomic level with a user-defined confidence level, or until a species-level assignment is made that meets the user-defined confidence level. </p><p> I next evaluated this method on both artificial and real microbial community data, with both qualitative and quantitative metrics of method performance. The evaluation results showed that in the qualitative analyses (specificity and sensitivity) my method is not as good as the previously existing methods. However, the accuracy in the quantitative analysis was better than some other pre-existing methods. This suggests that my current implementation is sensitive to false positives, but is better at classifying more sequences than the other methods. </p><p> I present my method, my evaluations, and suggestions for next steps that might improve the performance of my HMM-based taxonomic classifier.</p>
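The top-down search described in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the thesis software: `score` is a hypothetical stand-in for a profile-HMM score of a sequence against one taxon's model, `children_of` is a hypothetical lookup of child taxa, and "confidence" is taken to be the best score divided by the sum of sibling scores, which is only one of several possible definitions.

```python
def assign_taxonomy(seq, top_level, children_of, score, min_conf=0.8):
    """Descend the taxonomy one rank at a time, stopping when confidence drops.

    seq         -- the query sequence (passed through to `score`)
    top_level   -- candidate taxa at the highest rank (e.g. kingdoms)
    children_of -- dict mapping a taxon to its child taxa (hypothetical)
    score       -- callable(taxon, seq) -> non-negative score; a stand-in
                   for a profile-HMM score, NOT the thesis implementation
    min_conf    -- user-defined confidence threshold
    """
    lineage = []
    candidates = list(top_level)
    while candidates:
        scores = {taxon: score(taxon, seq) for taxon in candidates}
        total = sum(scores.values())
        best = max(scores, key=scores.get)
        # Assumed confidence measure: share of the total sibling score.
        confidence = scores[best] / total if total > 0 else 0.0
        if confidence < min_conf:
            break  # cannot assign at this rank with enough confidence
        lineage.append((best, confidence))
        # Only search the children of the highest scoring taxon.
        candidates = children_of.get(best, [])
    return lineage
```

The returned list holds one `(taxon, confidence)` pair per rank that met the threshold, so a sequence that is confidently bacterial but ambiguous at the phylum level yields a single-entry lineage.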
398

Ambiguous fragment assignment for high-throughput sequencing experiments

Roberts, Adam 28 May 2014
<p> As the cost of short-read, high-throughput DNA sequencing continues to fall rapidly, new uses for the technology have been developed aside from its original purpose in determining the genome of various species. Many of these new experiments use the sequencer as a digital counter for measuring biological activities such as gene expression (RNA-Seq) or protein binding (ChIP-Seq). </p><p> A common problem faced in the analysis of these data is that of sequenced fragments that are "ambiguous", meaning they resemble multiple loci in a reference genome or other sequence. In early analyses, such ambiguous fragments were ignored or were assigned to loci using simple heuristics. However, statistical approaches using maximum likelihood estimation have been shown to greatly improve the accuracy of downstream analyses and have become widely adopted. Optimization based on the expectation-maximization (EM) algorithm is often employed by these methods to find the optimal sets of alignments, with frequent enhancements to the model. Nevertheless, these improvements increase complexity, which, along with an exponential growth in the size of sequencing datasets, has led to new computational challenges. </p><p> Herein, we present our model for ambiguous fragment assignment for RNA-Seq, which includes the most comprehensive set of parameters of any model introduced to date, as well as various methods we have explored for scaling our optimization procedure. These methods include the use of an online EM algorithm and a distributed EM solution implemented on the Spark cluster computing system. 
Our advances have resulted in the first efficient solution to the problem of fragment assignment in sequencing.</p><p> Furthermore, we are the first to create a fully generalized model for ambiguous fragment assignment and present details on how our method can provide solutions for additional high-throughput sequencing assays including ChIP-Seq, Allele-Specific Expression (ASE), and the detection of RNA-DNA Differences (RDDs) in RNA-Seq.</p>
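As a rough illustration of the EM idea mentioned in this abstract, the following Python sketch estimates relative target abundances from fragments that may align to several targets. It is a deliberately simplified batch EM under stated assumptions: every compatible alignment is treated as equally likely apart from abundance, and target length, bias, and error parameters are ignored. It is not the model, the online EM, or the Spark implementation from the thesis, and the names `em_abundances` and `compat` are hypothetical.

```python
def em_abundances(compat, n_iter=100):
    """Estimate the fraction of fragments originating from each target.

    compat -- list where compat[i] is the list of target indices that
              fragment i aligns to (ambiguous fragments list several)
    n_iter -- number of EM iterations (fixed count, for simplicity)
    """
    n_targets = 1 + max(t for targets in compat for t in targets)
    theta = [1.0 / n_targets] * n_targets  # start from uniform abundances
    for _ in range(n_iter):
        expected = [0.0] * n_targets
        # E-step: split each fragment across its compatible targets in
        # proportion to the current abundance estimates.
        for targets in compat:
            total = sum(theta[t] for t in targets)
            for t in targets:
                expected[t] += theta[t] / total
        # M-step: new abundances are the normalized expected counts.
        theta = [count / len(compat) for count in expected]
    return theta
```

For example, `em_abundances([[0], [0], [0, 1], [1]])` shares the one ambiguous fragment between targets 0 and 1 according to the evidence from the uniquely assigned fragments, instead of discarding it or splitting it evenly.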
399

Annotation Concept Synthesis and Enrichment Analysis: a Logic-Based Approach to the Interpretation of High-Throughput Biological Experiments

Jiline, Mikhail 26 January 2011
Annotation Enrichment Analysis is a widely used analytical methodology to process data generated by high-throughput genomic and proteomic experiments such as gene expression microarrays. The analysis uncovers and summarizes discriminating background information for sets of genes identified by the previous processing stages (e.g., a set of differentially expressed genes, a cluster). Enrichment analysis algorithms attach annotations to the genes and then discover statistical fluctuations of individual annotation terms in a given gene subset. The annotation terms represent different aspects of biological knowledge and come from databases such as GO, BIND, and KEGG. Typical statistical models used to detect enrichments or depletions of annotation terms are hypergeometric, binomial and χ². At the end, the discovered information is utilized by human experts to find biological interpretations of the experiments. The main drawback of AEA is that it isolates and tests for overrepresentation of isolated individual annotation terms or groups of similar terms. As a result, AEA is limited in its ability to uncover complex phenomena involving relationships between multiple annotation terms from various knowledge bases. Also, AEA assumes that annotations describe the whole object of interest, which makes it difficult to apply it to sets of compound objects (e.g., sets of protein-protein interactions) and to sets of objects having an internal structure (e.g., protein complexes). To overcome this shortcoming, we propose a novel logic-based Annotation Concept Synthesis and Enrichment Analysis (ACSEA) approach. In this approach, the source annotation information, experimental data and uncovered enriched annotations are represented as First-Order Logic (FOL) statements. ACSEA uses the fusion of inductive logic reasoning with statistical inference to uncover more complex phenomena captured by the experiments. 
The proposed paradigm allows a synthesis of enriched annotation concepts that better describe the observed biological processes. The methodological advantage of Annotation Concept Synthesis and Enrichment Analysis is six-fold. Firstly, it is easier to represent complex, structural annotation information. Information already captured and formalized in OWL and RDF knowledge bases can be directly utilized. Secondly, it is possible to synthesize and analyze complex annotation concepts. Thirdly, it is possible to perform the enrichment analysis for sets of aggregate objects (such as sets of genetic interactions, physical protein-protein interactions or sets of protein complexes). Fourthly, annotation concepts are straightforward to interpret by a human expert. Fifthly, the logic data model and logic induction are a common platform that can integrate specialized analytical tools (e.g. tools for numerical, structural and sequential analysis). Sixthly, used statistical inference methods are robust on noisy and incomplete data, scalable and trusted by human experts in the field. In this thesis we developed and implemented the ACSEA approach. We evaluate it on large-scale datasets from several microarray experiments and on a clustered genome-wide genetic interaction network using different biological knowledge bases. Also, we define a statistical model of experimental and annotation data and evaluate ACSEA on synthetic datasets. The discovered interpretations are more enriched in terms of P- and Q-values than the interpretations found by AEA, are highly integrative in nature, and include analysis of quantitative and structured information present in the knowledge bases. The results suggest that ACSEA can significantly boost the effectiveness of the processing of high-throughput experiment data.
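The conventional single-term enrichment test that this abstract contrasts ACSEA with can be sketched in a few lines of Python. The example below computes the upper-tail hypergeometric p-value for one annotation term; the function and parameter names are hypothetical, and the sketch covers only the baseline AEA statistic, not the logic-based ACSEA method.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, subset, hits):
    """Probability of seeing at least `hits` annotated genes by chance.

    population -- total number of genes considered
    annotated  -- genes in the population carrying the annotation term
    subset     -- size of the gene set of interest (e.g. a cluster)
    hits       -- genes in the subset carrying the annotation term
    """
    max_hits = min(annotated, subset)
    # Sum the hypergeometric probabilities for hits, hits + 1, ..., max_hits.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, subset - i)
        for i in range(hits, max_hits + 1)
    )
    return favourable / comb(population, subset)
```

A small p-value indicates that the term is over-represented in the subset relative to the background; in practice the values for many terms would also need a multiple-testing correction, which this sketch omits.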
400

RiboFSM: Frequent Subgraph Mining for the Discovery of RNA Structures and Interactions

Gawronski, Alexander 05 November 2013
Frequent subgraph mining is a useful method for extracting biologically relevant patterns from a set of graphs or a single large graph. Here, the graph represents all possible RNA structures and interactions. Patterns that are significantly more frequent in this graph than in a random graph are extracted. We hypothesize that these patterns are most likely to represent biological mechanisms. The graph representation used is a directed dual graph, extended to handle intermolecular interactions. The graph is sampled for subgraphs, which are labeled using a canonical labeling method and counted. The resulting patterns are compared to those created from a randomized dataset and scored. The algorithm was applied to the mitochondrial genome of the kinetoplastid species Trypanosoma brucei. This species has a unique RNA editing mechanism that has been well studied, making it a good model organism to test RiboFSM. The most significant patterns contain two stem-loops, indicative of gRNA, and represent interactions of these structures with target mRNA.
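The label-count-compare steps in this abstract can be sketched in Python as follows. This is an illustrative toy under stated assumptions, not RiboFSM itself: the canonical label is found by brute force over all node orderings (practical only for very small subgraphs), the pattern score is a simple pseudocount-smoothed frequency ratio, and all function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def canonical_label(nodes, edges):
    """Return a label that is identical for isomorphic directed subgraphs.

    Tries every renumbering of the nodes and keeps the smallest sorted
    edge list; brute force, so only suitable for tiny subgraphs.
    """
    nodes = list(nodes)
    best = None
    for order in permutations(range(len(nodes))):
        mapping = dict(zip(nodes, order))
        relabelled = tuple(sorted((mapping[u], mapping[v]) for u, v in edges))
        if best is None or relabelled < best:
            best = relabelled
    return (len(nodes), best)


def count_patterns(subgraphs):
    """Count sampled subgraphs, given as (nodes, edges) pairs, by label."""
    return Counter(canonical_label(nodes, edges) for nodes, edges in subgraphs)


def score_patterns(observed, randomized, pseudocount=1.0):
    """Assumed score: observed count over randomized count, smoothed."""
    return {
        label: (count + pseudocount) / (randomized.get(label, 0) + pseudocount)
        for label, count in observed.items()
    }
```

Patterns whose score is well above 1 occur more often in the real graph than in the randomized one and would be the candidates for closer inspection.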
