111 |
Dynamics of Microbial Genome Evolution
Hooper, Sean. January 2003
The success of microbial life on Earth can be attributed not only to environmental factors, but also to the surprising hardiness, adaptability and flexibility of the microbes themselves. They are able to adapt quickly to new niches or circumstances through gene evolution and also by sheer strength of numbers, where statistics favor otherwise rare events. An integral part of adaptation is the plasticity of the genome: genes are lost and acquired depending on whether they are needed. Genomes can also be the birthplace of new gene functions, by duplicating and modifying existing genes. Genes can also be acquired from outside, transcending species boundaries. In this work, the focus is set primarily on duplication, deletion and import (lateral transfer) of genes, three factors contributing to the versatility and success of microbial life throughout the biosphere. We have developed a compositional method of identifying genes that have been imported into a genome, and the rate of import/deletion turnover has been estimated in a number of organisms. Furthermore, we propose a model of genome evolution by duplication, where through the principle of gene amplification, novel gene functions are discovered within genes with weak or secondary protein functions. Subsequently, the novel function is maintained by selection and eventually optimized. Finally, we discuss a possible synergistic link between lateral transfer and duplicative processes in gene innovation.
|
112 |
Predicting Function of Genes and Proteins from Sequence, Structure and Expression Data
Hvidsten, Torgeir R. January 2004
Functional genomics refers to the task of determining gene and protein function for whole genomes, and requires computational analysis of large amounts of biological data including DNA and protein sequences, protein structures and gene expression. Machine learning methods provide a powerful tool to this end by first inducing general models from such data and already characterized genes or proteins and then by providing hypotheses on the functions of the remaining, uncharacterized cases. This study contains four parts giving novel contributions to functional genomics through the analysis of different biological data and different aspects of biological function. Gene Ontology played an important part in this research, providing a controlled vocabulary for describing the cellular roles of genes and proteins in terms of specific molecular functions and broad biological processes. The first part used gene expression time profiles to learn models capable of predicting the participation of genes in biological processes. The models consist of IF-THEN rules associating biological processes with minimal sets of discrete changes in expression level over limited periods of time. The models were used to hypothesize new biological processes for both characterized and uncharacterized genes. The second part investigated the combinatorial nature of gene regulation by inducing IF-THEN rules associating minimal combinations of sequence motifs common to genes with similar expression profiles. Such combinations were shown to be significantly correlated to function, and provided hypotheses on the mechanisms behind the regulation of gene expression in several biological responses. The third part used a novel concept of local descriptors of protein structure to investigate sequence patterns governing protein structure at a local level and to predict the topological class (fold) of protein domains from sequence.
Finally, the fourth part used local descriptors to represent protein structure and induced IF-THEN rule models predicting molecular function from structure.
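As an illustration of the rule form described above, the following minimal sketch discretizes an expression time profile into step-wise changes and checks one IF-THEN rule against it. The threshold, the labels `up`, `down` and `flat`, the function names and the example process are all hypothetical; the abstract does not specify how the thesis discretized profiles or induced its rules.

```python
def discretize(profile, threshold=0.5):
    """Label each consecutive time step as 'up', 'down' or 'flat'.

    `threshold` is an assumed cut-off on the change in expression level.
    """
    labels = []
    for prev, curr in zip(profile, profile[1:]):
        delta = curr - prev
        if delta > threshold:
            labels.append("up")
        elif delta < -threshold:
            labels.append("down")
        else:
            labels.append("flat")
    return labels


def rule_matches(labels, conditions):
    """Return True if every (interval index -> label) condition holds."""
    return all(
        0 <= index < len(labels) and labels[index] == required
        for index, required in conditions.items()
    )


# Hypothetical rule: IF expression rises in interval 0 AND falls in
# interval 2 THEN the gene is predicted to take part in "stress response".
rule = {"if": {0: "up", 2: "down"}, "then": "stress response"}

profile = [0.0, 1.0, 1.2, 0.1]        # toy expression time profile
labels = discretize(profile)          # ['up', 'flat', 'down']
if rule_matches(labels, rule["if"]):
    print("predicted process:", rule["then"])
```

A rule that names only a few intervals, as here, corresponds loosely to the "minimal set of discrete changes" idea: intervals the rule does not mention are free to take any value.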
|
113 |
Method for recognizing local descriptors of protein structures using Hidden Markov Models
Björkholm, Patrik. January 2008
Being able to predict the sequence-structure relationship in proteins will extend the scope of many bioinformatics tools relying on structure information. Here we use Hidden Markov models (HMMs) to recognize and pinpoint the location in target sequences of local structural motifs (local descriptors of protein structure, LDPS). These substructures are composed of three or more segments of amino acid backbone structure that are in proximity to each other in space but not necessarily along the amino acid sequence. We were able to align descriptors to their proper locations in 41.1% of the cases when using models built solely from amino acid information. Using models that also incorporated secondary structure information, we were able to assign 57.8% of the local descriptors to their proper locations. A further improvement in performance was obtained when threading a profile through the Hidden Markov models together with the secondary structure; with this material we were able to assign 58.5% of the descriptors to their proper locations. Hidden Markov models were thus shown to be able to locate LDPS in target sequences, and accuracy increased when secondary structure and the profile for the target sequence were used in the models.
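To make the idea of locating a motif with an HMM concrete, here is a minimal sketch of Viterbi decoding, the standard way to recover the most probable state path from a hidden Markov model. It is an assumption that decoding of this kind resembles what the thesis did; the abstract does not describe the model topology or decoding procedure. The two states, the two-letter alphabet and all probabilities below are invented toy values.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path (all probabilities must be > 0)."""
    # Scores are kept in log space to avoid underflow on long sequences.
    scores = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            best_prev = max(
                states, key=lambda p: scores[p] + math.log(trans_p[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(trans_p[best_prev][s])
                + math.log(emit_p[s][obs])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy model: a "descriptor" state that prefers symbol G, embedded in background.
states = ("background", "descriptor")
start_p = {"background": 0.9, "descriptor": 0.1}
trans_p = {
    "background": {"background": 0.9, "descriptor": 0.1},
    "descriptor": {"background": 0.2, "descriptor": 0.8},
}
emit_p = {
    "background": {"A": 0.8, "G": 0.2},
    "descriptor": {"A": 0.1, "G": 0.9},
}

path = viterbi("AAGGGAA", states, start_p, trans_p, emit_p)
print(path)  # positions labelled "descriptor" mark the predicted location
```

In this toy case the decoded path places the descriptor on the three G positions; a realistic model would emit amino acids (and, per the abstract, secondary structure and profile information) rather than two symbols.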
|
114 |
Enhanced bioinformatics data modeling concepts and their use in querying and integration
Ji, Feng. January 2008
Thesis (Ph.D.) -- University of Texas at Arlington, 2008.
|
115 |
Protein fold recognition using neural networks
Lin, Guang. January 2003
No description available.
|
116 |
A widespread impact on small RNAs and gene networks in rice MSP1/OsTDL1a mutants, partners with key roles in anther development
Fei, Qili. 09 May 2017
Dissection of the genetic pathways and mechanisms by which anther development occurs in grasses is crucial both for a basic understanding of plant development and for traits of agronomic importance like male sterility. In rice, MULTIPLE SPOROCYTES1 (MSP1), a leucine-rich-repeat receptor kinase, plays an important role in anther development by limiting the number of sporocytes. OsTDL1a (a TPD1-like gene in rice) encodes a small protein which acts as a cofactor of MSP1 in the same regulatory pathway. In this study, we analyzed small RNA and mRNA changes in different stages of spikelets from wildtype rice, and from msp1 and ostdl1a mutants. Analysis of the small RNA data across different stages of rice spikelets identified miRNAs with differential abundances. miR2275 was depleted in the two rice mutants; this miRNA is specifically enriched in anthers and functions to trigger the production of 24-nt phased secondary siRNAs (phasiRNAs) from PHAS loci. We observed that the 24-nt phasiRNAs as well as their precursor PHAS mRNAs were also depleted in the two mutants. Based on comparisons of transcript levels across the spikelet stages and mutants, we identified 22 transcription factors as candidates to have roles specific to anther development, potentially acting downstream of the OsTDL1a-MSP1 pathway. An analysis of co-expression identified three Argonaute-encoding genes (OsAGO1d, OsAGO2b, and OsAGO18) that accumulate transcripts coordinately with phasiRNAs, suggesting a functional relationship. By mRNA in situ analysis, we demonstrated a strong correlation between the spatiotemporal pattern of accumulation of these OsAGO transcripts and previously published phasiRNA accumulation patterns from maize.
|
117 |
Exploring the Plasmodium falciparum Transcriptome Using Hypergeometric Analysis of Time Series (HATS)
Scanfeld, Daniel. January 2013
Malaria poses a significant public health and economic threat in many regions of the world, disproportionately affecting children in sub-Saharan Africa under the age of five. Though success has been celebrated in lowering infection rates, it remains a serious challenge, causing at least 200 million infections and 655,000 deaths per year, with deleterious effects on economic growth and development. Investigation of the malaria parasite Plasmodium falciparum has entered the post-genomics age, with several strains sequenced and many microarray gene expression studies performed. Gene expression studies allow a full sampling of the genomic repertoire of a parasite, and their detailed analysis will prove invaluable in deciphering novel parasite biology as well as the modes of action of antimalarial drug resistance. We have developed a computational pipeline that converts a series of fluorescence readings from a DNA microarray into a meaningful set of biological hypotheses based on the comparison of two lines, generally one that is drug sensitive and one that is drug resistant. Each step of the computational pipeline is described in detail in this thesis, beginning with data normalization and alignment, followed by visualization through dimensionality reduction, and finally a direct analysis of the differences and similarities between the two lines. Comparisons and analyses were performed at both the individual gene and gene set level. An important component of the analytical methods we have developed is a suite of visualization tools that help to easily identify outliers and experimental flaws, measure the significance of predictions, show how lines relate and how well they can be aligned, and demonstrate the results of an analysis. These visualization tools should be used as a starting point for further biological study to test the resulting hypotheses. 
We also developed a software tool, Gene Attribute and Set Enrichment Ranking (GASER), which combines a wealth of genomic data from the TDR Targets web site along with expression data from a variety of sources, and allows researchers to create sophisticated weighted queries to uncover potential drug targets. Queries in our system can be updated in real time, along with their accompanying gene and gene set lists. We analyzed all possible pair-wise combinations of 11 parasite lines to create baseline distributions for gene and gene set enrichment. Using the baseline as a comparison, we identified and discarded spurious results and recognized stochastic genes and gene sets. We analyzed three major sets of parasite lines: those involving manipulation of the multidrug resistance-1 (PfMDR1) transporter, a key resistance determinant; those involving manipulation of the P. falciparum chloroquine resistance transporter (PfCRT), another important resistance determinant; and finally a set of parasites that had varying sensitivity to artemisinins. This analysis resulted in a rich library of high scoring genes that may merit further exploration as potential modes of action of resistance. More specifically, we found that manipulation of pfcrt expression resulted in an up-regulation of tRNA synthetases, which might serve to increase protein production in response to reduced amino acid availability from degraded hemoglobin. We observed that a copy number increase in pfmdr1 resulted in increases in glycerophospholipid metabolism and up-regulation of a number of ABC transporters. Finally, when comparing artemisinin sensitive to artemisinin tolerant lines, we found an increased abundance of redox metabolites and the transcripts involved in redox regulation, and significant reduction in transcription and altered expression of transcripts encoding core histone proteins.
These alterations could help confer an increased tolerance to drug induced redox perturbation by lowering endogenous redox stress. We also offer a robust computational tool, Hypergeometric Analysis of Time Series (HATS), to handle challenging biological questions related to comparison of time series experiments. Our pipeline provides a rigorous method for aligning expression experiments and then determining which genes and gene sets differ most between them. The changes in gene expression level between drug-sensitive and drug-resistant lines offer important clues in our quest for understanding mechanisms of resistance and identifying new drug targets. Our pipeline allows for comparison of future lines with our base set and holds potential for other organisms, especially those similar to Plasmodium with a strong time-dependent component. The full Excel files of all the analyses performed in this thesis can be found at: (http://www.fidock.org/dan).
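For readers unfamiliar with the statistic in the tool's name, the sketch below shows a generic hypergeometric enrichment test for a gene set. It is not the HATS implementation, whose details the abstract does not give; the function name and the toy counts are assumptions. Given N genes in total, K of which belong to a gene set, it computes the probability of seeing at least k set members among n selected genes by chance.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the gene set of interest."""
    upper = min(K, n)
    total = comb(N, n)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1)
    )
    return favourable / total


# Toy numbers: 20 genes measured, 5 in the gene set, 5 genes called
# differentially expressed, 3 of which fall in the set.
p_value = hypergeom_sf(k=3, N=20, K=5, n=5)
print(p_value)  # 1126 / 15504, roughly 0.073
```

A small probability suggests the gene set is over-represented among the selected genes. Comparing such values against a baseline built from many pairwise comparisons, as the abstract describes, is one way to separate real signals from sets that score well by chance.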
|
118 |
Dissecting genetic determinants of transcription factor activity
Lee, Eunjee. January 2013
Understanding how phenotype relates to genotype, in terms of the myriad molecular processes that govern the behavior of cells and organisms, has long been one of the central goals of biology. Transcription factors (TFs) play a mediating role connecting genotype with gene expression, which provides high-dimensional information about end phenotype. In particular, gene expression levels depend on the cis-regulatory sequences bound by TFs and on the condition-specific regulatory activity of those TFs, which is determined by modulators through interaction with cofactors or signaling molecules. This thesis consists of two parts that relate to the overall goal of dissecting upstream modulators of transcription factor activity. The first study dissects genetic determinants of transcription factor activity in a segregating population. We exploit prior knowledge about the in vitro DNA-binding specificity of a TF in order to map the loci ('aQTLs') whose inheritance modulates its protein-level regulatory activity. The second study identifies regulatory mechanisms underlying tumorigenesis in mice by using genotyping and gene expression data across a set of 97 splenic tumors induced by retroviral insertional mutagenesis. We identify several instances of sequence-specific TFs whose activities are specifically affected by insertion mutations. Our results underscore the value of explicitly modeling TF activity and provide a strategy for finding upstream modulators of TF activity.
|
119 |
Structural Determinants of DNA-binding Specificity for Hox Proteins
Liu, Peng. January 2012
Hox proteins are a group of homeodomain-containing transcription factors that define the body plan in both vertebrates and invertebrates. Mutations in Hox proteins lead to limb malformations or cancer in humans. Despite having homeodomains with similar sequences and structures, the eight Hox proteins in Drosophila exhibit a variety of DNA-binding specificities when they are in complex with their cofactor Extradenticle (Exd), raising the question of how such diverse specificity is generated.
We have identified DNA minor groove shape as a structural determinant for Hox specificity. Using Monte Carlo simulations, we predicted the minor groove widths for Hox-binding sites obtained from a high-throughput experiment, Systematic Evolution of Ligands by Exponential Enrichment with massively parallel sequencing (SELEX-seq). We found that DNA sites selected by anterior Hox proteins have two narrow regions in the minor groove where Hox-Exd binds. In contrast, DNA sites favored by posterior Hox proteins have only one narrow region. Moreover, clustering of Hox proteins based on their preference of DNA minor groove shape reproduced the ordering of Hox genes along the chromosome, suggesting a striking relationship between body axis morphogenesis and nuances in DNA shape.
Intrigued by the question of how DNA shape is recognized, we studied the interactions between an anterior Hox protein, Sex combs reduced (Scr), and its preferential DNA sites identified from SELEX-seq. Through structure-based homology modeling, we found that two Arg residues on the N-terminal arm of Scr specifically recognize the two narrow regions in the minor groove of Scr-favored sites, regardless of their nucleotide identities. Our work leads to a new understanding of the structural basis of specific DNA-binding for Drosophila Hox proteins, linking preference of DNA-binding sites to DNA minor groove shape.
Our studies on Hox-cofactor-DNA structures revealed highly conserved features of protein-DNA recognition, e.g. Hox's Asn51 forms hydrogen bonds to an adenine, which are essential for Hox-DNA binding. In order to automatically identify important interactions of this type, we developed a computational module based on the functional annotation server MarkUs. This module displays a variety of protein-DNA interactions inside a query structure and illustrates their degrees of conservation by comparing the query structure with its structural homologs. This functional annotation module provides an effective way to analyze protein-DNA recognition and to identify essential interactions.
In this dissertation, Chapter 1 introduces the field of protein-DNA specific recognition from the perspectives of three-dimensional structures, high-throughput experiments, and computational modeling approaches. Chapter 2 introduces the biological background of Hox proteins, focusing on their biological functions, three-dimensional structures, and previous studies on their DNA-binding specificity. Chapter 3 presents the investigation of DNA-binding specificity for Hox-Exd complexes. The role of DNA minor groove width as a structural determinant is demonstrated through Monte Carlo simulations. Chapter 4 describes the homology modeling method for studying DNA minor groove recognition for Scr. The recognition mode of Scr-favored SELEX-seq sequences is inferred through protein-DNA docking and interface optimization. Chapter 5 elucidates the functional annotation module for protein-DNA structures. The functions and features of this module are demonstrated through a case study on a Scr-Exd-DNA structure. Chapter 6 summarizes my research projects described in this dissertation and proposes future directions for studying specific protein-DNA recognition.
|
120 |
viSNE and Wanderlust, two algorithms for the visualization and analysis of high-dimensional single-cell data
Amir, El-ad David. January 2014
The immune system presents a unique opportunity for studying development in mammals. White blood cells undergo differentiation and proliferation, a never-ending process throughout the life of the organism. Hematopoiesis, the development of cells in the immune system, depends upon the interaction between many different cell types (some of which comprise less than a tenth of a percent of the population), transient regulatory decisions, genomic rearrangement events, cell proliferation, and death. To capture these events we employ mass cytometry, a novel technology that measures fifty proteins simultaneously in single cells. Mass cytometry results in large quantities of high-dimensional data that challenge existing computational techniques. To address these challenges, we developed two dimensionality reduction algorithms for analyzing mass cytometry and other single-cell data. The first, viSNE, transforms high-dimensional data into an intuitive two-dimensional map, making it accessible to visual exploration. The second algorithm, Wanderlust, receives as input a static snapshot (where cells occupy different stages of their development) and constructs their developmental ordering: the developmental trajectory. viSNE maps healthy bone marrow into a canonical shape that separates cell subtypes. In leukemia, however, the shape is malformed: the maps of cancer samples are distinct from the healthy map and from each other. The algorithm highlights structure in the heterogeneity of surface phenotype expression in cancer, traverses the progression from diagnosis to relapse, and identifies a rare leukemia population in minimal residual disease settings. Wanderlust was applied to healthy B lineage cells, where the trajectory follows known marker expression trends and genetic recombination events. Using the Wanderlust trajectory we identified CD24 as an early marker of B cell development.
The trajectory captures the coordination between several regulatory mechanisms (surface marker expression, signaling, proliferation and apoptosis) during crucial development checkpoints. As new technologies raise the number of simultaneously measured parameters in each cell to the hundreds, viSNE and Wanderlust will become a mainstay in analyzing and interpreting such experiments.
|