1 |
Algebra and Phylogenetic Trees
Hansen, Michael 01 May 2007
One of the restrictions used in all work on phylogenetic invariants for group-based models has been that the group be abelian. In my thesis, I aim to generalize the method of invariants for group-based models of DNA sequence evolution to include nonabelian groups. By using a nonabelian group to act on the nucleotides, one could capture the structure of the symmetric model for DNA sequence evolution. If successful, this line of research would unify the two separate strands of active research in the area today: Allman and Rhodes’s invariants for the symmetric model and Sturmfels and Sullivant’s toric ideals of phylogenetic invariants. Furthermore, I want to look at the statistical properties of polynomial invariants to get a better understanding of how they behave when used with real, “noisy” data.
|
2 |
Constructing Phylogenetic Trees from Subsplits
Kashiwada, Akemi 01 May 2005
Phylogenetic trees represent theoretical evolutionary relationships among various species. Mathematically, they can be described as weighted binary trees whose leaves represent the taxa being compared. One major problem in mathematical biology is the reconstruction of these trees. We already know that trees on the leaf set X can be uniquely constructed from splits, which are bipartitions of X. The question I explore in this thesis is whether reconstruction of a tree is possible from subsplits, or partial split information. The major result of this work is a constructive algorithm which allows us to determine whether a given set of subsplits will realize a tree and, if so, what the tree looks like.
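The classical starting point mentioned in this abstract can be illustrated with a small sketch. For full splits of X, a standard result is that a collection of splits is realized by a tree exactly when the splits are pairwise compatible: for two splits A|B and C|D, at least one of the four intersections A∩C, A∩D, B∩C, B∩D must be empty. The function names and the five-taxon example below are illustrative assumptions, and the code covers full splits only; it is not the subsplit algorithm developed in the thesis.

```python
from itertools import combinations

def compatible(split1, split2):
    """Two splits A|B and C|D are compatible if some cross-intersection is empty."""
    (a, b), (c, d) = split1, split2
    return any(not (x & y) for x in (a, b) for y in (c, d))

def realizable_by_tree(splits):
    """For full splits of X, pairwise compatibility is necessary and sufficient."""
    return all(compatible(s, t) for s, t in combinations(splits, 2))

X = frozenset("abcde")  # hypothetical leaf set

def split(side):
    """Build the bipartition side | (X minus side)."""
    side = frozenset(side)
    return (side, X - side)

tree_like = [split("ab"), split("de")]    # ab|cde and de|abc can share a tree
conflicting = [split("ab"), split("bc")]  # ab|cde and bc|ade cannot
```

Here `realizable_by_tree(tree_like)` holds while `realizable_by_tree(conflicting)` does not, because every cross-intersection of ab|cde and bc|ade is non-empty.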
|
3 |
Large-scale analysis of phylogenetic search behavior
Park, Hyun Jung 15 May 2009
Phylogenetic analysis, the inference of evolutionary trees, is used in all branches
of biology. Applications include designing more effective drugs, tracing the transmission of
deadly viruses, and guiding conservation and biodiversity efforts. Most analyses rely
on effective heuristics for obtaining accurate trees. However, relatively little work has
been done to analyze quantitatively the behavior of phylogenetic heuristics in tree
space. This is important, because a better understanding of local search behavior
can facilitate the design of better heuristics, which ultimately leads to more accurate
depictions of the true evolutionary relationships.
In order to access and analyze the tree search space, we implement an effective
local search heuristic. Having an effective heuristic that opens up the space is
important, since no existing search heuristic in this field provides effective control
over data collection. We therefore implemented and evaluated a search heuristic,
Simple Local Search (SLS), that works reasonably well in the space.
Our investigations led to several interesting observations about the behavior of a
search heuristic and the tree search space. We studied the correlation of tree features
of search path trees, where tree features refer to the parsimony score, the
Robinson-Foulds (RF) distance and the homoplasy measure. Most importantly, the
parsimony score was highly correlated with the Robinson-Foulds distance only in trees
that lie on the search path to a local optimum. We also note that the scores of
neighborhoods along search paths improve together as a local search progresses.
Correlations of tree features of search path trees are particularly useful in
characterizing and controlling a search path. This paper proposes one possible stopping
criterion, based on the correlation between the parsimony score and the RF distance
of search path trees and tested on three biological datasets, that maximizes the tree
search results while minimizing computational time. Also, the observation that scores of
a neighborhood on a search path improve together gives us a significant amount of
flexibility in selecting the next pivot of a search without losing performance.
Our long-term goal is to develop an effective search heuristic that
can deal with large-scale tree space in reasonable time. Improved knowledge about
the tree search space and the search heuristic can provide a reasonable starting point
toward that goal.
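One of the tree features used in this abstract, the Robinson-Foulds distance, can be sketched briefly. Under the usual definition, the distance between two trees on the same leaf set is the number of nontrivial splits found in one tree but not the other. The nested-tuple tree encoding and the function names are assumptions made for illustration; this is not the SLS implementation described above.

```python
def leaf_set(tree):
    """Collect leaf names from a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def splits(tree):
    """Return the nontrivial splits (both sides larger than one leaf) of the tree."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        below = leaf_set(node)
        other = all_leaves - below
        if len(below) > 1 and len(other) > 1:
            # Store the split as an unordered pair so orientation is irrelevant.
            found.add(frozenset([below, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found

def robinson_foulds(t1, t2):
    """Size of the symmetric difference of the two split sets."""
    return len(splits(t1) ^ splits(t2))

t1 = ((("a", "b"), "c"), ("d", "e"))
t2 = ((("a", "c"), "b"), ("d", "e"))
```

The two example trees share the split abc|de and differ in ab|cde versus ac|bde, so `robinson_foulds(t1, t2)` is 2.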
|
4 |
Simulated annealing in the search for phylogenetic trees
Barker, Daniel January 2000
I investigate use of the simulated annealing heuristic to seek phylogenetic trees judged optimal according to the principle of parsimony. I begin by looking into the central data structure in phylogenetic research, the tree. I discuss why it is usually necessary to employ a heuristic, rather than an exact method, when seeking parsimonious trees. I summarise different heuristic approaches. I explain how to use the program LVB, written to use simulated annealing in the search for parsimonious trees. I use LVB, with different combinations of values for parameters controlling the annealing search, to re-analyse two DNA sequence data matrices, one of 50 objects and one of 365 objects. Equations to estimate suitable control parameters, on the basis of desired run time and quality of result, are fitted to data obtained by these analyses. Future directions of research are discussed.
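The general shape of a simulated annealing search can be sketched as follows. This is a generic illustration using the Metropolis acceptance rule and a geometric cooling schedule, with a toy integer objective standing in for a parsimony score. The parameter names (`t0`, `cooling`, `steps`) and their defaults are assumptions; they are not LVB's actual control parameters or schedule.

```python
import math
import random

def accept(delta, temperature, rng):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)

def anneal(start, score, neighbour, t0=1.0, cooling=0.95, steps=2000, seed=0):
    """Minimise `score` by proposing neighbours while the temperature decays."""
    rng = random.Random(seed)
    current, best = start, start
    temperature = t0
    for _ in range(steps):
        candidate = neighbour(current, rng)
        if accept(score(candidate) - score(current), temperature, rng):
            current = candidate
            if score(current) < score(best):
                best = current
        # Geometric cooling, floored to avoid division by zero.
        temperature = max(temperature * cooling, 1e-9)
    return best

# Toy stand-in for a tree search: minimise (x - 7)^2 over the integers.
result = anneal(50, lambda x: (x - 7) ** 2,
                lambda x, rng: x + rng.choice((-1, 1)))
```

In a phylogenetic setting `score` would be the tree length under parsimony and `neighbour` a tree rearrangement; both are left abstract here.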
|
5 |
Phylogenetic supertree methods
Swenson, Michelle Dawn 29 April 2014
The central task in phylogenetics is to infer the evolutionary relationships among a given set of species. These relationships are usually represented by a phylogenetic tree with the species of interest at the leaves and with internal vertices representing ancestral species. The amount of available molecular data is increasing exponentially and, given the continual advances in sequencing techniques and throughput, this explosive growth will likely continue. These vast amounts of available data mean that biologists are able to assemble large multi-gene datasets for use in phylogenetic analyses, which presents distinct computational challenges. Supertree methods comprise one approach to reconstructing large phylogenies, given estimated trees for overlapping subsets of the entire set of taxa. These source trees are combined into a single supertree on the full set of taxa using various algorithmic techniques. When the data allow, the competing approach is a combined analysis (also known as a “super-matrix” or “total evidence” approach), whereby the different sequence data matrices for each of the different subsets of taxa are put into a single super-matrix, and a tree is estimated on that super-matrix. In this dissertation, I present simulation software I designed to allow users to compare the relative performance of different supertree methods, as well as that of combined analysis, on more realistic data and on a larger scale than has been used up to this point. I present an extensive simulation study that uses this software to compare the performance of supertree methods and combined analysis, and that demonstrates a need for more topologically accurate supertree methods. I also introduce a new supertree method that I have developed that outperforms the most commonly used, and what until now has arguably been the most accurate, supertree method.
|
6 |
Estimating phylogenetic trees from discrete morphological data
Wright, April Marie 04 September 2015
Morphological characters have a long history of use in the estimation of phylogenetic trees. Datasets consisting of morphological characters are most often analyzed using the maximum parsimony criterion, which seeks to minimize the amount of character change across a phylogenetic tree. When combined with molecular data, characters are often analyzed using model-based methods, such as maximum likelihood or, more commonly, Bayesian estimation. The efficacy of likelihood and Bayesian methods using a common model for estimating topology from discrete morphological characters, the Mk model, is poorly explored. In Chapter One, I explore the efficacy of Bayesian estimation of phylogeny, using the Mk model, under conditions that are commonly encountered in paleontological studies. Using simulated data, I describe the relative performances of parsimony and the Mk model under a range of realistic conditions that include common scenarios of missing data and rate heterogeneity. I further examine the use of the Mk model in Chapter Two. Like any model, the Mk model makes a number of assumptions. One is that transitions between character states are symmetric (i.e., there is an equal probability of changing from state 0 to state 1 and from state 1 to state 0). Many characters, including alleged Dollo characters and extremely labile characters, may not fit this assumption. I tested methods for relaxing this assumption in a Bayesian context. Using empirical datasets, I performed model fitting to demonstrate cases in which modelling asymmetric transitions among characters is preferred. I used simulated datasets to demonstrate that choosing the best-fit model of transition state symmetry can improve model fit and phylogenetic estimation. In my final chapter, I looked at the use of partitions to model datasets more appropriately. Common in molecular studies, partitioning breaks up the dataset into pieces that evolve according to similar mechanisms. These pieces, called partitions, are then modeled separately. This practice has not been widely adopted in morphological studies. I extended the PartitionFinder software, which is used in molecular studies to score different possible partition schemes to find the one which best models the dataset. I used empirical datasets to demonstrate the effects of partitioning datasets on model likelihoods and on the phylogenetic trees estimated from those datasets.
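The symmetry assumption discussed in this abstract can be made concrete for a binary character. The sketch below uses the closed-form transition probabilities of a two-state continuous-time Markov chain with rate `q01` for 0 to 1 and `q10` for 1 to 0. Setting the two rates equal gives the symmetric case, and unequal rates give an asymmetric variant. The symbol names and the unnormalised rate parameterisation are assumptions for illustration, not the thesis's implementation.

```python
import math

def two_state_probs(q01, q10, t):
    """Transition matrix P(t) for a two-state chain with rates q01 and q10.

    P[i][j] is the probability of being in state j after time t, starting in i.
    """
    total = q01 + q10
    decay = math.exp(-total * t)
    p01 = q01 / total * (1.0 - decay)
    p10 = q10 / total * (1.0 - decay)
    return [[1.0 - p01, p01],
            [p10, 1.0 - p10]]

symmetric = two_state_probs(1.0, 1.0, 0.5)   # equal rates: P[0][1] == P[1][0]
asymmetric = two_state_probs(2.0, 0.5, 0.5)  # gains more likely than losses
```

With equal rates the off-diagonal entries coincide; with `q01 > q10` a change from 0 to 1 becomes more probable than the reverse over the same branch length.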
|
7 |
Comparative And Functional Genomics Of Actinobacteria And Archaea
Gao, Beile 12 1900
The higher taxonomic groups within Prokaryotes are presently distinguished mainly on the basis of their branching in phylogenetic trees. In most cases, no molecular, biochemical or physiological characteristics are known that are uniquely shared by species from these groups. Comparative genomic analyses are leading to the discovery of molecular characteristics that are specific for different groups of Bacteria and Archaea. These markers include conserved inserts and deletions in universal proteins and lineage-specific proteins, which provide novel means for identifying and circumscribing these groups of prokaryotes in clear molecular terms and for understanding their evolution. Because of their taxon specificity, further studies on these newly discovered molecular characteristics should lead to the discovery of novel biochemical and physiological characteristics that are unique to different groups of microbes. The focus of my project was phylogenomic studies of two large prokaryotic groups: Actinobacteria and Archaea. My goals were to a) identify molecular markers that are specific to Actinobacteria and Archaea at different taxonomic levels, which will help to understand the phylogenetic relationships within these two major groups; and b) understand the functional significance of Actinobacteria-specific proteins. Using a comparative genomics approach, a number of conserved indels in various proteins (viz. Cox1, GluRS, CTPsyn, Gft, GlyRS, TrmD, Gyrase A, SahH and SHMT) have been identified that are specific for all Actinobacteria, and additional indels were found to be unique to its major subgroups, such as Corynebacterineae, Bifidobacteriaceae, etc. In parallel, a large number of proteins were discovered to be restricted to Actinobacteria at different phylogenetic depths. These conserved indels and proteins provide, for the first time, useful markers for defining and circumscribing the Actinobacteria phylum or its subgroups in clear molecular terms.
Similar comparative genomic studies have been carried out on Archaea, and a vast number of proteins have been identified that are unique to Archaea or its various lineages. Lastly, I have performed functional studies on one of the Actinobacteria-specific proteins (ASP1). The structure of ASP1 was determined, and structural comparison indicates that the function of this protein might be novel, since it does not match any known protein with or without known function. / Thesis / Doctor of Philosophy (PhD)
|
8 |
Methods for phylogenetic analysis
Krig, Kåre January 2010
In phylogenetic analysis one studies the relationships between different species. By comparing DNA from two different species it is possible to get a numerical value representing the difference between the species. For a set of species, all pair-wise comparisons result in a dissimilarity matrix d.
In this thesis I present a few methods for constructing a phylogenetic tree from d. The common denominator for these methods is that they do not generate a tree, but instead give a connected graph. The resulting graph will be a tree in areas where the data perfectly match a tree. When d does not perfectly match a tree, the resulting graph will instead show the different possible topologies and how strongly each is supported by the data.
Finally, I have tested the methods both on real measured data and on constructed test cases.
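Whether a dissimilarity matrix d "perfectly matches a tree" can be checked with the standard four-point condition: for any four taxa i, j, k, l, the two largest of the sums d(i,j)+d(k,l), d(i,k)+d(j,l) and d(i,l)+d(j,k) must be equal. The sketch below is only this textbook test, assuming d is symmetric with a zero diagonal and allowing repeated indices so that the triangle inequality is covered as well. It is not one of the graph-building methods presented in the thesis, and the example matrices are made up.

```python
from itertools import combinations_with_replacement

def is_tree_like(d, tol=1e-9):
    """Four-point condition on a symmetric dissimilarity matrix with zero diagonal."""
    n = len(d)
    for i, j, k, l in combinations_with_replacement(range(n), 4):
        sums = sorted([d[i][j] + d[k][l],
                       d[i][k] + d[j][l],
                       d[i][l] + d[j][k]])
        if sums[2] - sums[1] > tol:  # the two largest sums must coincide
            return False
    return True

# Distances read off the tree ((a,b),(c,d)): pendant edges 1, internal edge 2.
tree_metric = [[0, 2, 4, 4],
               [2, 0, 4, 4],
               [4, 4, 0, 2],
               [4, 4, 2, 0]]

# Shortest-path distances on a 4-cycle, which no tree can reproduce.
cycle_metric = [[0, 1, 2, 1],
                [1, 0, 1, 2],
                [2, 1, 0, 1],
                [1, 2, 1, 0]]
```

The first matrix passes the test and the second fails it, which is the kind of conflicting signal that a graph rather than a tree is meant to display.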
|
9 |
Phylogenetic Methods for Testing Significant Codivergence between Host Species and their Symbionts
Speakman, Skyler 01 January 2008
Significant phylogenetic codivergence between plant or animal hosts (H) and their symbionts or parasites (P) indicates the importance of their interactions on evolutionary time scales. However, valid and realistic methods to test for codivergence are not fully developed. One of the systems where possible codivergence has been of interest involves the large subfamily of temperate grasses (Pooideae) and their endophytic fungi (epichloae). Here we introduce the MRCALink (most-recent-common-ancestor link) method and use it to investigate the possibility of grass-epichloë codivergence. MRCALink applied to ultrametric H and P trees identifies all corresponding nodes for pairwise comparisons of MRCA ages. The result is compared to the space of random H and P tree pairs estimated by a Monte Carlo method. Compared to tree reconciliation, the method is less dependent on tree topologies (which often can be misleading), and it crucially improves on phylogeny-independent methods such as ParaFit or the Mantel test by eliminating an extreme (but previously unrecognized) distortion of node-pair sampling. Analysis of 26 grass species-epichloë species symbioses did not reject random association of H and P MRCA ages. However, when five obvious host jumps were removed, the analysis significantly rejected random association and supported grass-endophyte codivergence. Interestingly, early cladogenesis events in the Pooideae corresponded to early cladogenesis events in epichloae, suggesting concomitant origins of this grass subfamily and its remarkable group of symbionts.
|