1 |
An Approximation Algorithm for Character Compatibility and Fast Quartet-based Phylogenetic Tree ComparisonTsang, John January 2000 (has links)
Phylogenetic analysis, or the inference of evolutionary history is done routinely by biologists and is one of the most important problems in systematic biology. In this thesis, we study two computational problems in the area. First, we study the evolutionary tree reconstruction problem under the character compatibility (CC) paradigm and give a polynomial time approximation scheme (PTAS) for a variation of the formulation called fractional character compatibility (FCC), which has been proven to be NP-hard. We also present a very simple algorithm called the Ordinal Split Method (OSM) to generate bipartitions given sequence data, which can be served as a front-end to the PTAS. The performance of the OSM and the validity of the FCC formulation are studied through simulation experiments. The second part of this thesis presents an efficient algorithm to compare evolutionary trees using the quartet metric. Different evolutionary hypothesis arises when different data sets are used or when different tree inference methods are applied to the same data set. Tree comparisons are routinely done by biologists to evaluate the quality of their tree inference experiments. The quartet metric has many desirable properties but its use has been hindered by its relatively heavy computational requirements. We address this problem by giving the first O(n^2) time algorithm to compute the quartet distance between two evolutionary trees.
|
2 |
Minimização de funções submodulares / Submodular Function MinimizationSimão, Juliana Barby 09 June 2009 (has links)
Funções submodulares aparecem naturalmente em diversas áreas, tais como probabilidade, geometria e otimização combinatória. Pode-se dizer que o papel desempenhado por essas funções em otimização discreta é similar ao desempenhado por convexidade em otimização contínua. Com efeito, muitos problemas em otimização combinatória podem ser formulados como um problema de minimizar uma função submodular sobre um conjunto apropriado. Além disso, submodularidade está presente em vários teoremas ou problemas combinatórios e freqüentemente desempenha um papel essencial em uma demonstração ou na eficiência de um algoritmo. Nesta dissertação, estudamos aspectos estruturais e algorítmicos de funções submodulares, com ênfase nos recentes avanços em algoritmos combinatórios para minimização dessas funções. Descrevemos com detalhes os primeiros algoritmos combinatórios e fortemente polinomiais para esse propósito, devidos a Schrijver e Iwata, Fleischer e Fujishige, além de algumas outras extensões. Aplicações de submodularidade em otimização combinatória também estão presentes neste trabalho. / Submodular functions arise naturally in various fields, including probability, geometry and combinatorial optimization. The role assumed by these functions in discrete optimization is similar to that played by convexity in continuous optimization. Indeed, we can state many problems in combinatorial optimization as a problem of minimizing a submodular function over an appropriate set. Moreover, submodularity appears in many combinatorial theorems or problems and frequently plays an essencial role in a proof or an algorithm. In this dissertation, we study structural and algorithmic aspects of submodular functions. In particular, we focus on the recent advances in combinatorial algorithms for submodular function minimization. We describe in detail the first combinatorial strongly polynomial-time algorithms for this purpose, due to Schrijver and Iwata, Fleischer, and Fujishige, as well as some extensions. Some applications of submodularity in combinatorial optimization are also included in this work.
|
3 |
An Approximation Algorithm for Character Compatibility and Fast Quartet-based Phylogenetic Tree ComparisonTsang, John January 2000 (has links)
Phylogenetic analysis, or the inference of evolutionary history is done routinely by biologists and is one of the most important problems in systematic biology. In this thesis, we study two computational problems in the area. First, we study the evolutionary tree reconstruction problem under the character compatibility (CC) paradigm and give a polynomial time approximation scheme (PTAS) for a variation of the formulation called fractional character compatibility (FCC), which has been proven to be NP-hard. We also present a very simple algorithm called the Ordinal Split Method (OSM) to generate bipartitions given sequence data, which can be served as a front-end to the PTAS. The performance of the OSM and the validity of the FCC formulation are studied through simulation experiments. The second part of this thesis presents an efficient algorithm to compare evolutionary trees using the quartet metric. Different evolutionary hypothesis arises when different data sets are used or when different tree inference methods are applied to the same data set. Tree comparisons are routinely done by biologists to evaluate the quality of their tree inference experiments. The quartet metric has many desirable properties but its use has been hindered by its relatively heavy computational requirements. We address this problem by giving the first O(n^2) time algorithm to compute the quartet distance between two evolutionary trees.
|
4 |
Minimização de funções submodulares / Submodular Function MinimizationJuliana Barby Simão 09 June 2009 (has links)
Funções submodulares aparecem naturalmente em diversas áreas, tais como probabilidade, geometria e otimização combinatória. Pode-se dizer que o papel desempenhado por essas funções em otimização discreta é similar ao desempenhado por convexidade em otimização contínua. Com efeito, muitos problemas em otimização combinatória podem ser formulados como um problema de minimizar uma função submodular sobre um conjunto apropriado. Além disso, submodularidade está presente em vários teoremas ou problemas combinatórios e freqüentemente desempenha um papel essencial em uma demonstração ou na eficiência de um algoritmo. Nesta dissertação, estudamos aspectos estruturais e algorítmicos de funções submodulares, com ênfase nos recentes avanços em algoritmos combinatórios para minimização dessas funções. Descrevemos com detalhes os primeiros algoritmos combinatórios e fortemente polinomiais para esse propósito, devidos a Schrijver e Iwata, Fleischer e Fujishige, além de algumas outras extensões. Aplicações de submodularidade em otimização combinatória também estão presentes neste trabalho. / Submodular functions arise naturally in various fields, including probability, geometry and combinatorial optimization. The role assumed by these functions in discrete optimization is similar to that played by convexity in continuous optimization. Indeed, we can state many problems in combinatorial optimization as a problem of minimizing a submodular function over an appropriate set. Moreover, submodularity appears in many combinatorial theorems or problems and frequently plays an essencial role in a proof or an algorithm. In this dissertation, we study structural and algorithmic aspects of submodular functions. In particular, we focus on the recent advances in combinatorial algorithms for submodular function minimization. We describe in detail the first combinatorial strongly polynomial-time algorithms for this purpose, due to Schrijver and Iwata, Fleischer, and Fujishige, as well as some extensions. Some applications of submodularity in combinatorial optimization are also included in this work.
|
5 |
Algorithms For Haplotype Inference And Block PartitioningVijaya, Satya Ravi 01 January 2006 (has links)
The completion of the human genome project in 2003 paved the way for studies to better understand and catalog variation in the human genome. The International HapMap Project was started in 2002 with the aim of identifying genetic variation in the human genome and studying the distribution of genetic variation across populations of individuals. The information collected by the HapMap project will enable researchers in associating genetic variations with phenotypic variations. Single Nucleotide Polymorphisms (SNPs) are loci in the genome where two individuals differ in a single base. It is estimated that there are approximately ten million SNPs in the human genome. These ten million SNPS are not completely independent of each other - blocks (contiguous regions) of neighboring SNPs on the same chromosome are inherited together. The pattern of SNPs on a block of the chromosome is called a haplotype. Each block might contain a large number of SNPs, but a small subset of these SNPs are sufficient to uniquely dentify each haplotype in the block. The haplotype map or HapMap is a map of these haplotype blocks. Haplotypes, rather than individual SNP alleles are expected to effect a disease phenotype. The human genome is diploid, meaning that in each cell there are two copies of each chromosome - i.e., each individual has two haplotypes in any region of the chromosome. With the current technology, the cost associated with empirically collecting haplotype data is prohibitively expensive. Therefore, the un-ordered bi-allelic genotype data is collected experimentally. The genotype data gives the two alleles in each SNP locus in an individual, but does not give information about which allele is on which copy of the chromosome. This necessitates computational techniques for inferring haplotypes from genotype data. This computational problem is called the haplotype inference problem. Many statistical approaches have been developed for the haplotype inference problem. Some of these statistical methods have been shown to be reasonably accurate on real genotype data. However, these techniques are very computation-intensive. With the international HapMap project collecting information from nearly 10 million SNPs, and with association studies involving thousands of individuals being undertaken, there is a need for more efficient methods for haplotype inference. This dissertation is an effort to develop efficient perfect phylogeny based combinatorial algorithms for haplotype inference. The perfect phylogeny haplotyping (PPH) problem is to derive a set of haplotypes for a given set of genotypes with the condition that the haplotypes describe a perfect phylogeny. The perfect phylogeny approach to haplotype inference is applicable to the human genome due to the block structure of the human genome. An important contribution of this dissertation is an optimal O(nm) time algorithm for the PPH problem, where n is the number of genotypes and m is the number of SNPs involved. The complexity of the earlier algorithms for this problem was O(nm^2). The O(nm) complexity was achieved by applying some transformations on the input data and by making use of the FlexTree data structure that has been developed as part of this dissertation work, which represents all the possible PPH solution for a given set of genotypes. Real genotype data does not always admit a perfect phylogeny, even within a block of the human genome. Therefore, it is necessary to extend the perfect phylogeny approach to accommodate deviations from perfect phylogeny. Deviations from perfect phylogeny might occur because of recombination events and repeated or back mutations (also referred to as homoplasy events). Another contribution of this dissertation is a set of fixed-parameter tractable algorithms for constructing near-perfect phylogenies with homoplasy events. For the problem of constructing a near perfect phylogeny with q homoplasy events, the algorithm presented here takes O(nm^2+m^(n+m)) time. Empirical analysis on simulated data shows that this algorithm produces more accurate results than PHASE (a popular haplotype inference program), while being approximately 1000 times faster than phase. Another important problem while dealing real genotype or haplotype data is the presence of missing entries. The Incomplete Perfect Phylogeny (IPP) problem is to construct a perfect phylogeny on a set of haplotypes with missing entries. The Incomplete Perfect Phylogeny Haplotyping (IPPH) problem is to construct a perfect phylogeny on a set of genotypes with missing entries. Both the IPP and IPPH problems have been shown to be NP-hard. The earlier approaches for both of these problems dealt with restricted versions of the problem, where the root is either available or can be trivially re-constructed from the data, or certain assumptions were made about the data. We make some novel observations about these problems, and present efficient algorithms for unrestricted versions of these problems. The algorithms have worst-case exponential time complexity, but have been shown to be very fast on practical instances of the problem.
|
6 |
Variable Strength Covering ArraysRaaphorst, Sebastian 21 January 2013 (has links)
Recently, covering arrays have been the subject of considerable research attention as they hold both theoretical interest and practical importance due to their applications to testing. In this thesis, we perform the first comprehensive study of a generalization of covering arrays called variable strength covering arrays, where we dictate the interactions to be covered in the array by modeling them as facets of an abstract simplicial complex.
We outline the necessary background in the theory of hypergraphs, combinatorial testing, and design theory that is relevant to the study of variable strength covering arrays. We then approach questions that arise in variable strength covering arrays in a number of ways. We demonstrate their connections to hypergraph homomorphisms, and explore the properties of a particular family of abstract simplicial complexes, the qualitative independence hypergraphs. These hypergraphs are tightly linked to variable strength covering arrays, and we determine and identify several of their important properties and subhypergraphs.
We give a detailed study of constructions for variable strength covering arrays, and provide several operations and divide-and-conquer techniques that can be used in building them. In addition, we give a construction using linear feedback shift registers from primitive polynomials of degree 3 over arbitrary finite fields to find variable strength covering arrays, which we extend to strength-3 covering arrays whose sizes are smaller than many of the best known sizes of covering arrays.
We then give an algorithm for creating variable strength covering arrays over arbitrary abstract simplicial complexes, which builds the arrays one row at a time, using a density concept to guarantee that the size of the resultant array is asymptotic in the logarithm of the number of facets in the abstact simplicial complex. This algorithm is of immediate practical importance, as it can be used to create test suites for combinatorial testing.
Finally, we use the Lovasz Local Lemma to nonconstructively determine upper bounds on the sizes of arrays for a number of different families of hypergraphs. We lay out a framework that can be used for many hypergraphs, and then discuss possible strategies that can be taken in asymmetric problems.
|
7 |
Variable Strength Covering ArraysRaaphorst, Sebastian 21 January 2013 (has links)
Recently, covering arrays have been the subject of considerable research attention as they hold both theoretical interest and practical importance due to their applications to testing. In this thesis, we perform the first comprehensive study of a generalization of covering arrays called variable strength covering arrays, where we dictate the interactions to be covered in the array by modeling them as facets of an abstract simplicial complex.
We outline the necessary background in the theory of hypergraphs, combinatorial testing, and design theory that is relevant to the study of variable strength covering arrays. We then approach questions that arise in variable strength covering arrays in a number of ways. We demonstrate their connections to hypergraph homomorphisms, and explore the properties of a particular family of abstract simplicial complexes, the qualitative independence hypergraphs. These hypergraphs are tightly linked to variable strength covering arrays, and we determine and identify several of their important properties and subhypergraphs.
We give a detailed study of constructions for variable strength covering arrays, and provide several operations and divide-and-conquer techniques that can be used in building them. In addition, we give a construction using linear feedback shift registers from primitive polynomials of degree 3 over arbitrary finite fields to find variable strength covering arrays, which we extend to strength-3 covering arrays whose sizes are smaller than many of the best known sizes of covering arrays.
We then give an algorithm for creating variable strength covering arrays over arbitrary abstract simplicial complexes, which builds the arrays one row at a time, using a density concept to guarantee that the size of the resultant array is asymptotic in the logarithm of the number of facets in the abstact simplicial complex. This algorithm is of immediate practical importance, as it can be used to create test suites for combinatorial testing.
Finally, we use the Lovasz Local Lemma to nonconstructively determine upper bounds on the sizes of arrays for a number of different families of hypergraphs. We lay out a framework that can be used for many hypergraphs, and then discuss possible strategies that can be taken in asymmetric problems.
|
8 |
CAMLearn* : une architecture de système de recommandation sémantique sensible au contexte : application au domaine du m-learning / CAMLearn : a semantic context-aware recommender system architecture : application on m-learning domainSoualah Alila, Fayrouz 18 March 2015 (has links)
Au vu de l'émergence rapide des nouvelles technologies mobiles et la croissance des offres et besoins d'une société en mouvement en formation, les travaux se multiplient pour identifier de nouvelles plateformes d'apprentissage pertinentes afin d'améliorer et faciliter le processus d'apprentissage à distance. La prochaine étape de l'apprentissage à distance est naturellement le port de l'apprentissage électronique vers les nouveaux systèmes mobiles. On parle alors de m-learning (apprentissage mobile). Jusqu'à présent l'environnement d'apprentissage était soit défini par un cadre pédagogique soit imposé par le contenu d'apprentissage. Maintenant, nous cherchons, à l'inverse, à adapter le cadre pédagogique et le contenu d'apprentissage au contexte de l'apprenant.Nos travaux de recherche portent sur le développement d'une nouvelle architecture pour le m-learning. Nous proposons une approche pour un système m-learning contextuel et adaptatif intégrant des stratégies de recommandation de scénarios de formations sans risque de rupture. / Given the rapid emergence of new mobile technologies and the growth of needs of a moving society in training, works are increasing to identify new relevant educational platforms to improve distant learning. The next step in distance learning is porting e-learning to mobile systems. This is called m-learning. So far, learning environment was either defined by an educational setting, or imposed by the educational content. In our approach, in m-learning, we change the paradigm where the system recommends content and adapts learning follow to learner's context.
|
9 |
Variable Strength Covering ArraysRaaphorst, Sebastian January 2013 (has links)
Recently, covering arrays have been the subject of considerable research attention as they hold both theoretical interest and practical importance due to their applications to testing. In this thesis, we perform the first comprehensive study of a generalization of covering arrays called variable strength covering arrays, where we dictate the interactions to be covered in the array by modeling them as facets of an abstract simplicial complex.
We outline the necessary background in the theory of hypergraphs, combinatorial testing, and design theory that is relevant to the study of variable strength covering arrays. We then approach questions that arise in variable strength covering arrays in a number of ways. We demonstrate their connections to hypergraph homomorphisms, and explore the properties of a particular family of abstract simplicial complexes, the qualitative independence hypergraphs. These hypergraphs are tightly linked to variable strength covering arrays, and we determine and identify several of their important properties and subhypergraphs.
We give a detailed study of constructions for variable strength covering arrays, and provide several operations and divide-and-conquer techniques that can be used in building them. In addition, we give a construction using linear feedback shift registers from primitive polynomials of degree 3 over arbitrary finite fields to find variable strength covering arrays, which we extend to strength-3 covering arrays whose sizes are smaller than many of the best known sizes of covering arrays.
We then give an algorithm for creating variable strength covering arrays over arbitrary abstract simplicial complexes, which builds the arrays one row at a time, using a density concept to guarantee that the size of the resultant array is asymptotic in the logarithm of the number of facets in the abstact simplicial complex. This algorithm is of immediate practical importance, as it can be used to create test suites for combinatorial testing.
Finally, we use the Lovasz Local Lemma to nonconstructively determine upper bounds on the sizes of arrays for a number of different families of hypergraphs. We lay out a framework that can be used for many hypergraphs, and then discuss possible strategies that can be taken in asymmetric problems.
|
10 |
Monomino-Domino Tatami CoveringsErickson, Alejandro 03 September 2013 (has links)
We present several new results on the combinatorial properties of a locally restricted version of monomino-domino coverings of rectilinear regions. These are monomino-domino tatami coverings, and the restriction is that no four tiles may meet at any point. The global structure that the tatami restriction induces has numerous implications, and provides a powerful tool for solving enumeration problems on tatami coverings. Among these we address the enumeration of coverings of rectangles, with various parameters, and we develop algorithms for exhaustive generation of coverings, in constant amortised time per covering. We also con- sider computational complexity on two fronts; firstly, the structure shows that the space required to store a covering of the rectangle is linear in its longest dimension, and secondly, it is NP-complete to decide whether an arbitrary polyomino can be tatami-covered only with dominoes. / Graduate / 0984 / 0405 / alejandro.erickson@gmail.com
|
Page generated in 0.0678 seconds