Spelling suggestions: "subject:"queuelength"" "subject:"codelength""
1 |
Fast and accurate estimation of large-scale phylogenetic alignments and treesLiu, Kevin Jensen 06 July 2011 (has links)
Phylogenetics is the study of evolutionary relationships.
Phylogenetic trees and alignments play important roles in a wide range
of biological research, including reconstruction of the Tree of Life
- the evolutionary history of all organisms on Earth - and the
development of vaccines and antibiotics.
Today's phylogenetic studies seek to reconstruct
trees and alignments on a greater number and variety of
organisms than ever before, primarily
due to exponential
growth in affordable sequencing and computing power.
The importance of
phylogenetic trees and alignments motivates the need for
methods to reconstruct them accurately and efficiently
on large-scale datasets.
Traditionally, phylogenetic studies proceed in two phases: first, an
alignment is produced from biomolecular sequences with differing
lengths, and, second, a tree is produced using the alignment. My
dissertation presents the first empirical performance study of leading
two-phase methods on datasets with up to hundreds of thousands of
sequences. Relatively accurate alignments and trees were obtained
using methods with high computational requirements on datasets with a
few hundred sequences, but as datasets grew past 1000 sequences and up
to tens of thousands of sequences, the set of methods capable of
analyzing a dataset diminished and only the methods with the lowest
computational requirements and lowest accuracy remained.
Alternatively, methods have been developed to simultaneously estimate
phylogenetic alignments and trees. Methods optimizing the treelength
optimization problem - the most widely-used approach for simultaneous
estimation - have not been shown to return more accurate trees and alignments
than two-phase approaches. I demonstrate that treelength optimization
under a particular class of optimization criteria represents
a promising means for inferring accurate trees
and alignments.
The other methods for simultaneous estimation are not known to
support analyses of datasets with a few hundred sequences due to their
high computational requirements.
The main contribution of my dissertation is SATe,
the first fast and accurate method for simultaneous
estimation of alignments and trees on datasets with up to several
thousand nucleotide sequences. SATe improves upon the alignment and
topological accuracy of all existing methods, especially
on the most difficult-to-align datasets, while retaining
reasonable computational requirements. / text
|
2 |
Propriétés métriques des grands graphes / Metric properties of large graphsDucoffe, Guillaume 09 December 2016 (has links)
Les grands réseaux de communication sont partout, des centres de données avec des millions de serveurs jusqu’aux réseaux sociaux avec plusieurs milliards d’utilisateurs.Cette thèse est dédiée à l’étude fine de la complexité de différents problèmes combinatoires sur ces réseaux. Dans la première partie, nous nous intéressons aux propriétés des plongements des réseaux de communication dans les arbres. Ces propriétés aident à mieux comprendre divers aspects du trafic dans les réseaux (tels que la congestion). Plus précisément, nous étudions la complexité du calcul de l’hyperbolicité au sens de Gromov et de paramètres des décompositions arborescentes dans les graphes. Ces paramètres incluent la longueur arborescente (treelength) et l’épaisseur arborescente (treebreadth). Au passage, nous démontrons de nouvelles bornes sur ces paramètres dans de nombreuses classes de graphes, certaines d’entre elles ayant été utilisées dans la conception de réseaux d’interconnexion des centres de données. Le résultat principal dans cette partie est une relation entre longueur et largeur arborescentes (treewidth), qui est un autre paramètre très étudié des graphes. De ce résultat, nous obtenons une vision unifiée de la ressemblance des graphes avec un arbre, ainsi que différentes applications algorithmiques. Nous utilisons dans cette partie divers outils de la théorie des graphes et des techniques récentes de la théorie de la complexité / Large scale communication networks are everywhere, ranging from data centers withmillions of servers to social networks with billions of users. This thesis is devoted tothe fine-grained complexity analysis of combinatorial problems on these networks.In the first part, we focus on the embeddability of communication networks totree topologies. This property has been shown to be crucial in the understandingof some aspects of network traffic (such as congestion). More precisely, we studythe computational complexity of Gromov hyperbolicity and of tree decompositionparameters in graphs – including treelength and treebreadth. On the way, we givenew bounds on these parameters in several graph classes of interest, some of thembeing used in the design of data center interconnection networks. The main resultin this part is a relationship between treelength and treewidth: another well-studiedgraph parameter, that gives a unifying view of treelikeness in graphs and has algorithmicapplications. This part borrows from graph theory and recent techniques incomplexity theory. The second part of the thesis is on the modeling of two privacy concerns with social networking services. We aim at analysing information flows in these networks,represented as dynamical processes on graphs. First, a coloring game on graphs isstudied as a solution concept for the dynamic of online communities. We give afine-grained complexity analysis for computing Nash and strong Nash equilibria inthis game, thereby answering open questions from the literature. On the way, wepropose new directions in algorithmic game theory and parallel complexity, usingcoloring games as a case example
|
Page generated in 0.0495 seconds