1 |
Experimental and computational studies of DNA structure using the hydroxyl radical as a chemical probe
Greenbaum, Jason Adam January 2006
Thesis (Ph.D.)--Boston University / PLEASE NOTE: Boston University Libraries did not receive an Authorization To Manage form for this thesis or dissertation. It is therefore not openly accessible, though it may be available by request. If you are the author or principal advisor of this work and would like to request open access for it, please contact us at open-help@bu.edu. Thank you. / We have constructed a database of hydroxyl radical (•OH) cleavage patterns of DNA in order to investigate the relationship between the sequence of a DNA molecule and its three-dimensional structure. The hydroxyl radical cuts DNA at every nucleotide, with the amount of cutting proportional to the solvent-accessible surface area (SASA) of the deoxyribose hydrogen atoms. Cleavage fragments are quantified by a fluorescence sequencer, and the resulting data are normalized and deposited into the database.
Our database currently contains 151 DNA sequences with lengths ranging from 35 to 41 nucleotides. These data have enabled us to develop some general rules regarding the sequence-dependence of DNA structure as well as to predict the cleavage pattern of any given DNA sequence with remarkable precision. Using this prediction algorithm, it is possible to construct structural maps of entire genomes.
As there are many examples of DNA-binding proteins with highly degenerate binding sites, the use of structural information to locate these sites may be helpful. There also exist other signals, including the signal for nucleosome positioning, that have no apparent consensus sequence, making it likely that the structure of DNA is of critical importance.
We have developed algorithms to identify regions of conserved structure using •OH cleavage intensity as a proxy. Within a set of DNase I hypersensitive sites (DHS) obtained from the ENCODE Consortium, we were able to identify a stretch of 12 nucleotides for which the structural conservation is much greater than the sequence conservation. These sites have been dubbed Conserved •OH Radical Cleavage Signatures, or CORCS. Upon further analysis, these CORCS were found to be 17-fold enriched for DHS as compared to shuffled elements.
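The comparison between structural and sequence conservation described above can be illustrated with a minimal sketch. It is not the thesis's algorithm, which the abstract does not specify. It assumes that each aligned site comes with a per-nucleotide cleavage-intensity profile, and it uses mean pairwise Pearson correlation as a stand-in for "structural conservation" and mean pairwise identity for "sequence conservation". All function names and the scoring choices are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # flat profile: correlation is undefined, treated as 0 here
    return cov / sqrt(vx * vy)


def identity(s, t):
    """Fraction of aligned positions at which two sequences agree."""
    return sum(a == b for a, b in zip(s, t)) / len(s)


def mean_pairwise(items, score):
    """Average of score(a, b) over all unordered pairs of items."""
    pairs = list(combinations(items, 2))
    return sum(score(a, b) for a, b in pairs) / len(pairs)


def window_conservation(seqs, profiles, start, width=12):
    """Return (structural, sequence) conservation for one window.

    seqs     -- aligned DNA strings of equal length
    profiles -- one list of cleavage intensities per sequence (assumed input)
    width    -- window size; 12 mirrors the 12-nucleotide stretch in the text
    """
    sub_seqs = [s[start:start + width] for s in seqs]
    sub_profiles = [p[start:start + width] for p in profiles]
    return (mean_pairwise(sub_profiles, pearson),
            mean_pairwise(sub_seqs, identity))
```

A window whose first value is high while its second is low would be a candidate in the sense used above: the cleavage profiles agree even though the underlying sequences do not.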
Through the continued analysis of hydroxyl radical cleavage data and development of algorithms to employ the data in biologically meaningful ways, we hope to further our understanding of the relationship between DNA sequence and structure, and how the local structural heterogeneity of genomic DNA contributes to biological function. / 2031-01-02
|
2 |
Decoding function through comparative genomics: from animal evolution to human disease
Maxwell, Evan Kyle 12 March 2016
Deciphering the functionality encoded in the genome constitutes an essential first step to understanding the context through which mutations can cause human disease. In this dissertation, I present multiple studies based on the use or development of comparative genomics techniques to elucidate function (or lack of function) from the genomes of humans and other animal species. Collectively, these studies focus on two biological entities encoded in the human genome: genes related to human disease susceptibility and those that encode microRNAs - small RNAs that have important gene-regulatory roles in normal biological function and in human disease. Extending this work, I investigated the evolution of these biological entities within animals to shed light on how their underlying functions arose and how they can be modeled in non-human species. Additionally, I present a new tool that uses large-scale clinical genomic data to identify human mutations that may affect microRNA regulatory functions, thereby providing a method by which state-of-the-art genomic technologies can be fully utilized in the search for new disease mechanisms and potential drug targets.
The scientific contributions made in this dissertation utilize current data sets generated using high-throughput sequencing technologies. For example, recent whole-genome sequencing studies of the most distant animal lineages have effectively restructured the animal tree of life as we understand it. The first two chapters utilize data from this new high-confidence animal phylogeny - in addition to data generated in the course of my work - to demonstrate that (1) certain classes of human disease have uncommonly large proportions of genes that evolved with the earliest animals and/or vertebrates, and (2) canonical microRNA functionality - absent in at least two of the early branching animal lineages - likely evolved after the first animals. In the third chapter, I expand upon recent research in predicting microRNA target sites, describing a novel tool for predicting clinically significant microRNA target site variants and demonstrating its applicability to the analysis of clinical genomic data. Thus, the studies detailed in this dissertation represent significant advances in our understanding of the functions of disease genes and microRNAs from both an evolutionary and a clinical perspective.
|
3 |
Computation and Application of Persistent Homology on Streaming Data
Moitra, Anindya January 2020
No description available.
|
4 |
Two Problems in Computational Genomics
Belal, Nahla Ahmed 22 March 2011
This work addresses two novel problems in the field of computational genomics. The first is whole genome alignment and the second is inferring horizontal gene transfer using posets. We define these two problems and present algorithmic approaches for solving them. For whole genome alignment, we define alignment graphs for representing different evolutionary events, and define a scoring function for those graphs. The problem defined is proven to be NP-complete. Two heuristics are presented to solve the problem. One is a dynamic programming approach that is optimal for a class of sequences that we define in this work as breakable arrangements. The other is a greedy approach that is not necessarily optimal but, unlike the dynamic programming approach, allows for reversals. For inferring horizontal gene transfer, we define partially ordered sets among species, with respect to different genes, and infer genes involved in horizontal gene transfer by comparing posets for different genes. The posets are used to construct a tree for each gene. Those trees are then compared and tested for contradiction, where contradictory trees correspond to genes that are candidates for horizontal gene transfer. / Ph. D.
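The idea of testing per-gene orderings for contradiction can be sketched as follows. This is a simplification under stated assumptions, not the method of the thesis: the abstract builds a tree from each poset and compares trees, whereas this sketch compares the posets directly. It assumes that each gene's poset is given as a set of (lower, higher) species pairs, and it flags a gene when its ordering reverses some pair implied by a chosen reference gene. The names `transitive_closure`, `contradictions`, and `hgt_candidates` are hypothetical.

```python
def transitive_closure(pairs):
    """Return all order relations implied by the given (lower, higher) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a < b and b < d together imply a < d
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def contradictions(poset_a, poset_b):
    """Pairs ordered one way in poset_a and the opposite way in poset_b."""
    closure_a = transitive_closure(poset_a)
    closure_b = transitive_closure(poset_b)
    return {(x, y) for (x, y) in closure_a if (y, x) in closure_b}


def hgt_candidates(posets, reference):
    """Genes whose species ordering contradicts the reference gene's ordering.

    posets    -- dict mapping gene name to a set of (lower, higher) pairs
    reference -- gene name treated as the baseline ordering (an assumption)
    """
    base = posets[reference]
    return sorted(
        gene for gene, poset in posets.items()
        if gene != reference and contradictions(base, poset)
    )
```

For example, if the reference gene orders three species as s1 < s2 < s3 and another gene contains the relation s3 < s1, that gene is reported as a candidate because the reference ordering implies s1 < s3.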
|
5 |
Learning with Sparsity: Structures, Optimization and Applications
Chen, Xi 01 July 2013
The development of modern information technology has enabled the collection of data of unprecedented size and complexity. Examples include web text data, microarray and proteomics data, and data from scientific domains (e.g., meteorology). When learning from these high-dimensional and complex data, traditional machine learning techniques often suffer from the curse of dimensionality and unaffordable computational cost. However, learning from large-scale high-dimensional data promises big payoffs in text mining, gene analysis, and numerous other consequential tasks.
Recently developed sparse learning techniques provide us with a suite of tools for understanding and exploring high-dimensional data from many areas in science and engineering. By exploring sparsity, we can always learn a parsimonious and compact model which is more interpretable and computationally tractable at application time. When it is known that the underlying model is indeed sparse, sparse learning methods can provide us with a more consistent model and much improved prediction performance. However, the existing methods are still insufficient for modeling complex or dynamic structures of the data, such as those evidenced in pathways of genomic data, gene regulatory networks, and synonyms in text data.
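To make concrete how a sparsity penalty yields a compact model, here is a minimal sketch of the plain lasso solved by proximal gradient descent (ISTA). It is a standard textbook baseline, not one of the structured methods developed in the thesis. The objective assumed here is (1/(2n))·||Xw − y||² + lam·||w||₁, and the function names, the fixed step size, and the iteration count are illustrative choices.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_ista(X, y, lam, step, iters=500):
    """Minimize (1/(2n)) * ||Xw - y||^2 + lam * ||w||_1 by ISTA.

    X    -- list of n rows, each a list of d feature values
    y    -- list of n targets
    lam  -- L1 penalty weight; larger values zero out more coefficients
    step -- fixed step size; assumed small enough for convergence
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # Residuals of the current linear model
        resid = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
        # Gradient of the smooth least-squares term
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(d)]
        # Gradient step followed by soft-thresholding (the proximal step)
        w = [soft_threshold(w[j] - step * grad[j], step * lam) for j in range(d)]
    return w
```

The soft-thresholding step is what produces exact zeros: any coefficient whose gradient-updated value falls within `step * lam` of zero is set to zero, so weakly supported features drop out of the model entirely.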
This thesis develops structured sparse learning methods along with scalable optimization algorithms to explore and predict high dimensional data with complex structures. In particular, we address three aspects of structured sparse learning:
1. Efficient and scalable optimization methods with fast convergence guarantees for a wide spectrum of high-dimensional learning tasks, including single or multi-task structured regression, canonical correlation analysis as well as online sparse learning.
2. Learning dynamic structures of different types of undirected graphical models, e.g., conditional Gaussian or conditional forest graphical models.
3. Demonstrating the usefulness of the proposed methods in various applications, e.g., computational genomics and spatial-temporal climatological data. In addition, we also design specialized sparse learning methods for text mining applications, including ranking and latent semantic analysis.
In the last part of the thesis, we also present future directions for high-dimensional structured sparse learning from both computational and statistical perspectives.
|