Global ETD Search

11	Topics in Signal Processing: applications in genomics and genetics Elmas, Abdulkadir January 2016 The information in genomic and genetic data is shaped by various complex processes, and appropriate mathematical modeling is required to study both the underlying processes and the data. This dissertation focuses on formulating mathematical models for selected problems in genomics and genetics and on developing algorithms that solve them efficiently. A Bayesian approach to transcription factor (TF) motif discovery is examined, and extensions are proposed to deal with the many interdependent parameters of TF-DNA binding. The problem is described in statistical terms, and a sequential Monte Carlo sampling method is employed to estimate the unknown parameters. In particular, a class-based resampling approach is applied to accurately estimate a set of intrinsic properties of the DNA binding sites. Through statistical analysis of gene expression, a motif-based computational approach is developed for inferring novel regulatory networks in a given bacterial genome. To deal with high false-discovery rates in genome-wide TF binding predictions, discriminative learning approaches are examined in the context of sequence classification, and a novel mathematical model is introduced to the family of kernel-based Support Vector Machine classifiers. Furthermore, the problem of haplotype phasing is examined using genetic data obtained from cost-effective genotyping technologies. Based on the identification and augmentation of a small and relatively more informative genotype set, a sparse dictionary selection algorithm is developed to infer the haplotype pairs for the sampled population. In a related context, the problem of representative (tag) SNP selection is introduced to detect redundant information in single nucleotide polymorphism (SNP) sites.
An information-theoretic heuristic is designed for the accurate selection of tag SNPs that capture the genetic diversity in a large sample set from multiple populations. The method is based on a multi-locus mutual information measure that reflects linkage disequilibrium, a biological principle in population genetics.
Signal processing; Signal processing--Statistical methods; Genetics--Data processing; Genomics--Data processing; Bioinformatics; Transcription factors; Electrical engineering
12	The effect of evolutionary rate estimation methods on correlations observed between substitution rates in models of evolution Botha, Stephen Gordon 03 1900 Thesis (MSc)--Stellenbosch University, 2012. / ENGLISH ABSTRACT: see full text for abstract / AFRIKAANS SUMMARY: see full text for summary
Evolutionary model; Substitution rates; Biological parameters; Bioinformatics; Dissertations -- Computer science; Theses -- Computer science; Dissertations -- Mathematical sciences; Theses -- Mathematical sciences; Evolutionary genetics -- Data processing; Computer Science
13	Genome-wide analyses of single cell phenotypes using cell microarrays Narayanaswamy, Rammohan, 1978- 29 August 2008 The past few decades have witnessed a revolution in recombinant DNA and nucleic acid sequencing technologies. More recently, technologies capable of massively high-throughput, genome-wide data collection, combined with computational and statistical tools for data mining, integration and modeling, have enabled the construction of predictive networks that capture cellular regulatory states, paving the way for ‘Systems biology’. Consequently, protein interactions can be captured in the context of a cellular interaction network, and emergent ‘system’ properties can be identified that may not have been accessible through conventional biology. The ability to generate data from multiple, non-redundant experimental sources is one of the important facets of systems biology. To this end, we have established a novel platform called ‘spotted cell microarrays’ for conducting image-based genetic screens. We have subsequently used spotted cell microarrays to study multidimensional phenotypes in yeast under different regulatory states. In particular, we studied the response to mating pheromone using a cell microarray comprising the yeast non-essential deletion library and analyzed morphology changes to identify novel genes involved in mating. An important aspect of the mating response pathway is the large-scale spatiotemporal change in the proteome, an aspect of proteomics that is still largely obscure. In our next study, we used an imaging screen and a computational approach to predict and validate the complement of proteins that polarize and change localization towards the mating projection tip. By adopting such hybrid approaches, we have been able not only to study proteins involved in specific pathways but also to examine their behavior in a systemic context, leading to a broader understanding of cell function.
Lastly, we have performed a novel metabolic starvation-based screen using the GFP-tagged collection to study proteome dynamics in response to nutrient limitation, and we are currently rationalizing our observations through follow-up experiments. We believe this study has implications for evolutionarily conserved cellular mechanisms such as protein turnover, quiescence and aging. Our technique has therefore been applied to several interesting aspects of yeast cellular physiology and behavior and is now being extended to mammalian cells.
Genomics--Databases; Genomics--Data processing; DNA microarrays--Databases; Phenotype--Data processing; Pheromones--Data processing; Proteomics--Data processing; Proteins--Data processing
14	Combinatorial aspects of genome rearrangements and haplotype networks / Aspects combinatoires des réarrangements génomiques et des réseaux d'haplotypes Labarre, Anthony 12 September 2008 The dissertation covers two problems motivated by computational biology: genome rearrangements and haplotype networks.
Genome rearrangement problems are a particular case of edit distance problems, where one seeks to transform one given object into another using as few operations as possible, with the additional constraint that the set of allowed operations is fixed beforehand; we are also interested in computing the corresponding distances between those objects, i.e. merely computing the minimum number of operations rather than an optimal sequence. Genome rearrangement problems can often be formulated as sorting problems on permutations (viewed as linear orderings of {1, 2, ..., n}) using as few (allowed) operations as possible. In this thesis, we focus among other operations on "transpositions", which displace intervals of a permutation. Many questions related to sorting by transpositions are open, in particular its computational complexity. We use the disjoint cycle decomposition of permutations, rather than the "standard tools" used in genome rearrangements, to prove new upper bounds on the transposition distance, as well as formulae for computing the exact distance in polynomial time in many cases. This decomposition also allows us to solve a counting problem related to the "cycle graph" of Bafna and Pevzner, and to construct a general framework for obtaining lower bounds on any edit distance between permutations by recasting their computation as factorisation problems on related even permutations.
Haplotype networks are graphs in which a subset of vertices is labelled, used in comparative genomics as an alternative to trees.
We formalise a new method due to Cassens, Mardulyn and Milinkovitch, which consists in building a graph containing a given set of partially labelled trees and with as few edges as possible. We give exact algorithms for solving the problem on two graphs, with an exponential running time in the general case but with a polynomial running time if at least one of the graphs belongs to a particular class.
/ The thesis covers two problems motivated by biology: the study of genome rearrangements and that of haplotype networks.
Genome rearrangement problems are a particular case of edit distance problems, in which one seeks to transform one object into another using the smallest possible number of operations, the allowed operations being fixed beforehand; we are also interested in the distance between the two objects, that is, in computing the number of operations in an optimal sequence rather than in finding such a sequence. Genome rearrangement problems can often be expressed as problems of sorting permutations (viewed as linear arrangements of {1, 2, ..., n}) using the smallest possible number of (allowed) operations. We examine in particular "transpositions", which displace an interval of the permutation. Many problems related to sorting by transpositions are open, in particular its computational complexity. We depart from the "standard tools" used in the field of genome rearrangements and use the disjoint cycle decomposition of permutations to prove new upper bounds on the transposition distance, as well as formulae for computing this distance in polynomial time in many cases.
This decomposition also serves to solve an enumeration problem concerning the "cycle graph" of Bafna and Pevzner, and to construct a general technique for obtaining new lower bounds by reformulating all edit distance problems on permutations in terms of factorisations of associated even permutations.
Haplotype networks are graphs in which some of the vertices carry labels, used in comparative genomics when trees are too restrictive or when one cannot choose a "best" topology among a given set of trees. We formalise a new method due to Cassens, Mardulyn and Milinkovitch, which consists in building a graph containing all the given partially labelled trees and having as few edges as possible, and we give algorithms solving the problem optimally on two graphs, whose running time is exponential in general but polynomial in a few cases that we characterise. / Doctorate in Sciences / info:eu-repo/semantics/nonPublished
Exact and natural sciences; General computer science; Bioinformatics; Genetics -- Data processing; permutation; cycle graph; supergraph; edit distance; sorting
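The notions named in the abstract of entry 14 (a transposition that displaces an interval of a permutation, the disjoint cycle decomposition, and the transposition distance) can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for illustration, and the distance routine is a plain brute-force breadth-first search that is practical only for very small permutations; it is not the method developed in the thesis.

```python
from collections import deque


def apply_transposition(perm, i, j, k):
    """Exchange the adjacent blocks perm[i:j] and perm[j:k] (0-based, i < j < k)."""
    if not 0 <= i < j < k <= len(perm):
        raise ValueError("need 0 <= i < j < k <= len(perm)")
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def disjoint_cycles(perm):
    """Disjoint cycle decomposition of a permutation of 1..n in one-line notation."""
    n = len(perm)
    seen = [False] * (n + 1)
    cycles = []
    for start in range(1, n + 1):
        if seen[start]:
            continue
        cycle = []
        x = start
        while not seen[x]:
            seen[x] = True
            cycle.append(x)
            x = perm[x - 1]  # follow x to its image under the permutation
        cycles.append(tuple(cycle))
    return cycles


def transposition_distance(perm):
    """Minimum number of transpositions that sort perm, found by exhaustive BFS.

    Illustration only: the search space grows factorially with len(perm).
    """
    start = tuple(perm)
    target = tuple(sorted(perm))
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        current, dist = queue.popleft()
        n = len(current)
        for i in range(n):
            for j in range(i + 1, n):
                for k in range(j + 1, n + 1):
                    nxt = apply_transposition(current, i, j, k)
                    if nxt == target:
                        return dist + 1
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, dist + 1))
    raise ValueError("input is not sortable (unreachable for a valid permutation)")
```

For example, `apply_transposition((1, 2, 3, 4, 5), 1, 3, 5)` moves the block `(2, 3)` past `(4, 5)` and yields `(1, 4, 5, 2, 3)`, while `disjoint_cycles((2, 3, 1, 5, 4))` yields `[(1, 2, 3), (4, 5)]`.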
