  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

Accurate annotation of non-coding RNAs in practical time /

Weinberg, Zasha. January 2005
Thesis (Ph. D.)--University of Washington, 2005. / Vita. Includes bibliographical references (p. 257-268).
92

In silico evolution of protein-protein interactions : from altered specificities to de novo complexes /

Joachimiak, Lukasz A. January 2007
Thesis (Ph. D.)--University of Washington, 2007. / Vita. Includes bibliographical references (leaves 139-147).
93

A graph-theoretical treatment of protein domain evolution

Shakhnovich, Boris E. January 2004
Thesis (Ph.D.)--Boston University. PLEASE NOTE: Boston University Libraries did not receive an Authorization To Manage form for this thesis or dissertation. It is therefore not openly accessible, though it may be available by request. If you are the author or principal advisor of this work and would like to request open access for it, please contact us at open-help@bu.edu. Thank you. / Understanding the mechanisms and driving forces behind molecular evolution is the defining challenge of computational biology. However, a comprehensive, quantitative theory of molecular evolution remains elusive. We evaluate a new graph-theoretic treatment of this problem. We start by defining a multi-dimensional protein domain universe graph (PDUG). The nodes in this graph are the atomic units of evolution - structures of recurring domains and sequences that fold into those structures. Each of the three dimensions in PDUG (structure, function and phylogeny) represents a potential constraint from evolutionary pressure. We go on to characterize graph-theoretic properties such as phase transitions, power-law degree distributions, and correlations between the three dimensions. We compare the observed properties with those expected from random graphs. The comparison enables us to identify the likely contours of sets of co-evolved proteins. We further our understanding by assessing several computationally tractable models of evolution that recapitulate some fundamental characteristics of PDUG. We go on to define fitness characteristics derived from simple physical properties of structure and function that serve to clarify the uneven relationship between fold and sequence space topology. However, we also find that evolutionary history plays a crucial role since structural fitness is only the potential for sequence entropy, while variable time of evolutionary search determines the fulfillment of that potential.
Armed with our new understanding of protein fitness we describe its progression over time. We establish that eukaryotic domains enjoy a faster exploration of sequence and function space than prokaryotic ones. We further note that biological phenomena such as thermophilic adaptation and duplication success may be explained in light of our newly found understanding of protein fitness. Finally, we employ the newly developed PDUG paradigm to quantify the structure-function relationship. We show through modeling of divergent evolution that functions coalesce non-randomly as structural clusters grow. We find that the widely held hierarchical description of structure space has theoretical underpinnings in the natural clustering of the PDUG. We finish by calculating the theoretical lower limit of uncertainty inherent in structure-function correlation of protein domains. / 2031-01-02
94

A computational biology approach to studying algae-bacterial interactions

Kudahl, Ulrich Johan January 2018
Microalgae have a profound effect on the world due to their large contribution to net carbon fixation. Although they are phototrophic, more than 50% of microalgae are thought to depend on an external supply of metabolites such as B-vitamins. In oceans, algae are therefore often found together with a community of bacteria and form intricate networks where metabolites are exchanged. Currently, only a fraction of the related mechanisms and metabolite exchanges between algae and bacteria have been uncovered and many more are likely to exist. The work presented in this thesis is based on a model system for algae-bacterial interactions made up of the green alga Lobomonas rostrata and the alpha-proteobacterium Mesorhizobium loti. In the model system, it is known that the bacterium provides vitamin B12 to the alga and itself, whilst the alga provides fixed carbon. I have applied methods from the field of computational biology to study the interactions between these organisms and other similar partnerships, with the aim of uncovering new insights. The thesis is made up of three research chapters, each focused on using a specific method to study algae-bacterial interactions. I developed a genome-scale metabolic model of M. loti that enabled simulation of growth. The model simulates 1908 enzymatic reactions and takes 1804 metabolites into account. Using the model, I simulated growth of the bacterium on 1018 different substrates with the aim of identifying substrates supplied by L. rostrata when the two organisms are co-cultured. In addition, I carried out a set of simulations studying the bacterium’s ability to produce B12 from 1368 different substrates. The modelling efforts in this project were successful in enabling simulations, but it was not possible to validate the simulations with experimental data. A transcriptomics experiment was undertaken with the aim of identifying genes related to the interaction between L. rostrata and M. loti.
In the experiment, the partners from the model system were grown in axenic and co-culture conditions and RNA samples were taken from each state. Using RNA-seq, the RNA samples were sequenced and from this a candidate transcriptome was created. The expression of each putative gene was then quantified and differentially expressed genes were identified. Based on sequence similarity, candidate functions were assigned where possible. In the analysis of differentially expressed genes, it was found that there appears to be an increased expression of a transporter responsible for uptake of the plant hormone auxin. Currently, only a small fraction of all bacteria have been shown to produce B12 and it is not clear in which phylogenetic groups this is a common trait. I therefore applied methods from comparative genomics to study the synthesis of this metabolite in more than 8000 bacterial species. This involved developing a computational framework that allowed me to search rapidly for the presence of more than 50 genes in more than 8000 genomes. I found that 37.2% of bacteria can synthesise B12 and that this capability is very common in some phylogenetic groups such as Cyanobacteria, but extremely rare in others such as Lactobacillus. I was also able to confirm that cyanobacteria are not able to make cobalamin, a variant of B12 used by eukaryotic algae, and thus they are unlikely to support algal growth in the photic zone. In the final section of the thesis, I discuss the application of computational biology methods in this field and summarise my experience from applying genome-scale modelling, comparative genomics and transcriptomics to study algae-bacterial interactions.
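The comparative-genomics step in this abstract (searching many genomes for a set of B12 synthesis genes and reporting the share of likely producers per phylogenetic group) can be illustrated with a minimal sketch. It is not the framework from the thesis: the four-gene list, the `min_fraction` threshold, and the function names are all assumptions chosen for illustration, and the input is assumed to be a precomputed set of detected gene names per genome.

```python
from collections import defaultdict

# Hypothetical, shortened gene list for illustration only; the thesis
# searched for more than 50 genes and does not list them in the abstract.
REQUIRED_GENES = frozenset({"cobA", "cobI", "cobJ", "cobM"})


def is_candidate_producer(found_genes, required=REQUIRED_GENES, min_fraction=1.0):
    """Return True if at least min_fraction of the required genes were detected.

    min_fraction is an assumed cut-off, not a value taken from the thesis.
    """
    present = len(required & set(found_genes))
    return present / len(required) >= min_fraction


def producer_fraction_by_group(genomes):
    """Compute the fraction of candidate producers in each group.

    genomes: iterable of (group_name, detected_gene_names) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, found_genes in genomes:
        totals[group] += 1
        hits[group] += int(is_candidate_producer(found_genes))
    return {group: hits[group] / totals[group] for group in totals}


if __name__ == "__main__":
    toy_genomes = [
        ("Cyanobacteria", {"cobA", "cobI", "cobJ", "cobM"}),
        ("Cyanobacteria", {"cobA"}),
        ("Lactobacillus", set()),
    ]
    print(producer_fraction_by_group(toy_genomes))
```

Representing each genome as a set keeps the presence check a single set intersection, which is one plausible way to keep a scan over thousands of genomes fast once gene detection itself has been done.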
95

Computational methods for the measurement of protein-DNA interactions

James, Daniel Peter January 2018
It is of interest to know where in the genome DNA binding proteins act in order to effect their gene regulatory function. For many sequence-specific DNA binding proteins we plan to predict the location of their action by having a model of their affinity to short DNA sequences. Existing and new models of protein sequence specificity are investigated and their ability to predict genomic locations is evaluated. Public data from a micro-fluidic experiment is used to fit a matrix model of binding specificity for a single transcription factor. Physical association and disassociation constants from the experiment enable a biophysical interpretation of the data to be made in this case. The matrix model is shown to provide a better fit to the experimental data than a model initially published with the data. Public data from 172 protein binding micro-array experiments is used to fit a new type of model to 82 unique proteins. Each experiment provides measurements of the binding specificity of an individual protein to approximately 40000 DNA probes. Statistical 'DNA word' models are assessed for their ability to predict held-back data and perform very well in many cases. Where available, ChIP-seq data from the ENCODE project is used to assess the ability of a selection of the DNA word models to predict ChIP-seq peaks and how they compare to matrix models in doing so. This in vitro data is the closest proxy to the true sites of the proteins' regulatory action that we have.
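As a rough illustration of how a matrix model of binding specificity can be used to predict binding locations, the sketch below scores every window of a DNA sequence against a position-specific matrix and reports the best-scoring position. The matrix values, its width, and the function names are hypothetical; the abstract does not give the fitted matrices, and this additive scoring scheme is only one common form such a model might take.

```python
# Hypothetical 4-position scoring matrix (one dict per position).
# The numbers are illustrative and are not fitted to any experiment.
MATRIX = [
    {"A": 1.2, "C": -0.8, "G": -0.5, "T": -1.0},
    {"A": -0.9, "C": 1.0, "G": -0.7, "T": -0.4},
    {"A": -1.1, "C": -0.6, "G": 1.3, "T": -0.9},
    {"A": -0.3, "C": -1.0, "G": -0.8, "T": 1.1},
]


def score_window(window, matrix=MATRIX):
    """Sum the per-position contributions for one window of the matrix width."""
    return sum(column[base] for column, base in zip(matrix, window))


def best_site(sequence, matrix=MATRIX):
    """Return (start_index, score) of the highest-scoring window in sequence."""
    width = len(matrix)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the matrix")
    starts = range(len(sequence) - width + 1)
    best = max(starts, key=lambda i: score_window(sequence[i:i + width], matrix))
    return best, score_window(sequence[best:best + width], matrix)


if __name__ == "__main__":
    position, score = best_site("TTACGTAA")
    print(position, round(score, 2))
```

The additive form assumes that each position contributes independently to affinity; a 'DNA word' model of the kind the abstract mentions would instead score whole short subsequences, so it is not captured by this sketch.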
96

Method for recognizing local descriptors of protein structures using Hidden Markov Models

Björkholm, Patrik January 2008
Being able to predict the sequence-structure relationship in proteins will extend the scope of many bioinformatics tools relying on structure information. Here we use Hidden Markov models (HMM) to recognize and pinpoint the location in target sequences of local structural motifs (local descriptors of protein structure, LDPS). These substructures are composed of three or more segments of amino acid backbone structures that are in proximity with each other in space but not necessarily along the amino acid sequence. We were able to align descriptors to their proper locations in 41.1% of the cases when using models solely built from amino acid information. Using models that also incorporated secondary structure information, we were able to assign 57.8% of the local descriptors to their proper location. A further enhancement in performance was obtained when threading a profile through the Hidden Markov models together with the secondary structure; with this material we were able to assign 58.5% of the descriptors to their proper locations. Hidden Markov models were shown to be able to locate LDPS in target sequences, and the accuracy increases when secondary structure and the profile for the target sequence are used in the models.
97

GENOMEWIDE GENE EXPRESSION ANALYSIS TO INVESTIGATE SYMPTOM ONSET AND EXACERBATION IN TOURETTE SYNDROME

Hanrui Wu (9523997) 16 December 2020
Tourette Syndrome (TS) is a neurodevelopmental disorder that starts in childhood. It is marked by multiple motor and vocal tics and a fluctuating course with recurrent time periods of symptom exacerbation followed by symptom remission. TS is considered a disorder of complex etiology that involves the interaction of multiple genetic and environmental factors. Previous studies have implicated regulation of the immune system in TS patients, although this remains a matter of debate. Here we present a genome-wide gene expression study comparing data from TS patients at time points of symptom exacerbation and remission as well as upon symptom onset for newly diagnosed patients, using differential gene expression analysis, co-expression analysis and pathway analysis. Although the fold change of each regulated gene was very mild, it is the first time that changes in the regulation of the immune system have been implicated at different time points in TS patients.
98

The role of RFX-target genes in neurodevelopmental and psychiatric disorders

Ganesan, Abhishekapriya January 2021
Neurodevelopmental disorders such as autism spectrum disorder (ASD) and psychiatric disorders such as schizophrenia (SCZ) represent a large spectrum of disorders that manifest through cognitive and behavioural problems. ASD and SCZ are both highly heritable, and some phenotypic similarities between ASD and SCZ have sparked an interest in understanding their genetic commonalities. The genetics of both disorders exhibit significant heterogeneity. Developments in genomics and systems biology continually increase our understanding of these disorders. Recently, pathogenic genetic variants in the regulatory factor X (RFX) family of transcription factors have been identified in a number of ASD cases. In this thesis, common genetic variants and expression patterns of genes identified to have a conserved promoter X-Box motif region, a binding site of RFX factors, are studied. Significant common variants identified through expression quantitative trait loci (eQTLs) and genome-wide association studies (GWAS) are mapped to the regulatory regions of these genes and analysed for putative enrichment. In addition, single-cell RNA sequencing data is utilised to examine enrichment of cell types having high X-Box gene expression in the developing human cortex. Through the study, genes that have eQTLs or SNPs in the genomic regulatory regions of the X-Box genes have been identified. While there were no eQTLs or GWAS SNPs in the X-Box motifs, some common variants were found in the X-Box promoter regions. By hypergeometric distribution testing and the subsequent p-values obtained, all of these distributions are statistically under-enriched. Further, major cell types in the cortical region with increased expression of the X-Box genes and the most expressed genes among these enriched cell types have been identified. Among the 11 cell types, seven were found to be enriched for X-Box genes, and many of the most expressed genes in these cell types were similar.
A further study into the cell types and genes identified, along with additional systems biological data analysis, could reveal a larger list of X-Box genes involved in ASD and SCZ and the specific roles of these genes.
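The hypergeometric testing mentioned in this abstract can be sketched as follows. This is a generic illustration of an under-enrichment test, not the analysis from the thesis: the function names are made up for this example, and the counts in the usage section are invented placeholders rather than reported results.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """P(X = k) when drawing `draws` items without replacement from a
    population of size `population` that contains `successes` marked items."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


def under_enrichment_p(k, population, successes, draws):
    """Lower-tail p-value P(X <= k); small values suggest under-enrichment."""
    lowest = max(0, draws - (population - successes))
    return sum(
        hypergeom_pmf(i, population, successes, draws)
        for i in range(lowest, k + 1)
    )


def over_enrichment_p(k, population, successes, draws):
    """Upper-tail p-value P(X >= k); small values suggest over-enrichment."""
    highest = min(successes, draws)
    return sum(
        hypergeom_pmf(i, population, successes, draws)
        for i in range(k, highest + 1)
    )


if __name__ == "__main__":
    # Placeholder counts: 20000 genes in total, 500 of them X-Box genes,
    # 1000 genes carrying a variant, 10 of which are X-Box genes.
    print(under_enrichment_p(10, 20000, 500, 1000))
```

With these placeholder counts the expected overlap under random sampling is 25 genes, so an observed overlap of 10 falls in the lower tail, which is the situation the abstract describes as under-enriched.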
99

A Model for Bioaugmented Anaerobic Granule

Mahajan, Amitesh 01 May 2018
In this study, we created a simulation model of the digestion of cellulose, a major component of microalgae, in a bioreactor. The model simulates the process of granulation in anaerobic sludge and aims to investigate scenarios of possible granular bioaugmentation. Once a mature granule is formed, protein is supplied to it as an alternative substrate. Protein, being a main component of cyanobacteria, will promote growth and incorporation of a cell type that can degrade protein (selective pressure). The model, developed in a cDynoMiCs simulation environment, successfully demonstrated the process of granule formation and bioaugmentation in an anaerobic granule. Bioaugmentation is a common strategy in the field of wastewater treatment, used to introduce a new metabolic capability to either aerobic or anaerobic granules. The end product of our work is a model that can visually demonstrate varying stratifications of different trophic microbial groups, which will help engineers and researchers who operate laboratory and industrial-scale anaerobic digesters and wish to enhance reactor performance. The working model that we have developed has been validated using the existing literature and lab experiments. The model successfully demonstrates granulation in a cellobiose-fed system, with formation of a 0.63 mm mature granule in 59 days and production of a good amount of methane that could be used commercially as a green fuel. This model is extended to perform bioaugmentation by chaining different simulations.
100

Discovery and Interpretation of Subspace Structures in Omics Data by Low-Rank Representation

Lu, Xiaoyu 10 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Biological functions in cells are highly complicated and heterogeneous, and can be reflected by omics data, such as gene expression levels. Detecting subspace structures in omics data and understanding the diversity of the biological processes is essential to the full comprehension of biological mechanisms and complicated biological systems. In this thesis, we develop novel statistical learning approaches to reveal the subspace structures in omics data. Specifically, we focus on three types of subspace structures: low-rank subspace, sparse subspace and covariates explainable subspace. For low-rank subspace, we developed a semi-supervised model, SSMD, to detect cell type specific low-rank structures and predict their relative proportions across different tissue samples. SSMD is the first computational tool that utilizes semi-supervised identification of cell types and their marker genes specific to each mouse tissue transcriptomics data set, for better understanding of the disease microenvironment and downstream disease mechanism. For sparsity-driven sparse subspace, we proposed a novel positive and unlabeled learning model, namely PLUS, that could identify cancer metastasis related genes, predict cancer metastasis status and specifically address the under-diagnosis issue in studying metastasis potential. We found that the metastasis potential predicted by PLUS at diagnosis has a significantly strong association with patients’ progression-free survival in their follow-up data. Lastly, to discover the covariates explainable subspace, we proposed an analytical pipeline based on covariance regression, namely, scCovReg. We utilized scCovReg to detect the pathway level second-order variations using scRNA-Seq data in a statistically powerful manner, and to associate the second-order variations with important subject-level characteristics, such as disease status.
In conclusion, we presented a set of state-of-the-art computational solutions for identifying sparse subspaces in omics data, which promise to provide insights into the mechanism in complex diseases.
