1 |
Fast and accurate estimation of large-scale phylogenetic alignments and trees
Liu, Kevin Jensen 06 July 2011
Phylogenetics is the study of evolutionary relationships.
Phylogenetic trees and alignments play important roles in a wide range
of biological research, including reconstruction of the Tree of Life
- the evolutionary history of all organisms on Earth - and the
development of vaccines and antibiotics.
Today's phylogenetic studies seek to reconstruct trees and alignments on
a greater number and variety of organisms than ever before, primarily
due to exponential growth in affordable sequencing and computing power.
The importance of phylogenetic trees and alignments motivates the need
for methods to reconstruct them accurately and efficiently on
large-scale datasets.
Traditionally, phylogenetic studies proceed in two phases: first, an
alignment is produced from biomolecular sequences with differing
lengths, and, second, a tree is produced using the alignment. My
dissertation presents the first empirical performance study of leading
two-phase methods on datasets with up to hundreds of thousands of
sequences. Relatively accurate alignments and trees were obtained
using methods with high computational requirements on datasets with a
few hundred sequences, but as datasets grew past 1000 sequences and up
to tens of thousands of sequences, the set of methods capable of
analyzing a dataset diminished and only the methods with the lowest
computational requirements and lowest accuracy remained.
Alternatively, methods have been developed to simultaneously estimate
phylogenetic alignments and trees. Methods that optimize treelength
- the most widely used approach for simultaneous estimation - have not
been shown to return more accurate trees and alignments than two-phase
approaches. I demonstrate that treelength optimization under a
particular class of optimization criteria represents a promising means
for inferring accurate trees and alignments.
The other methods for simultaneous estimation are not known to
support analyses of datasets with a few hundred sequences due to their
high computational requirements.
The main contribution of my dissertation is SATe,
the first fast and accurate method for simultaneous
estimation of alignments and trees on datasets with up to several
thousand nucleotide sequences. SATe improves upon the alignment and
topological accuracy of all existing methods, especially
on the most difficult-to-align datasets, while retaining
reasonable computational requirements.
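The treelength criterion mentioned above can be illustrated with a minimal sketch. Treelength is commonly defined as the sum, over all edges of a tree whose nodes are labeled with sequences, of the edit cost between the two sequences at the ends of each edge. The abstract does not state the cost criteria used in the dissertation, so the unit-cost edit distance below is an assumption, and the toy tree, its node names, and the function names are hypothetical. The sketch only scores a given labeled tree; it does not search for the best tree or labeling.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit (Levenshtein) distance between two sequences.

    Assumption: every substitution, insertion, and deletion costs 1.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match (0) or substitute (1)
            ))
        prev = curr
    return prev[-1]


def treelength(edges, labels) -> int:
    """Sum of edit distances across all edges of a fully labeled tree."""
    return sum(edit_distance(labels[u], labels[v]) for u, v in edges)


# Hypothetical toy tree: leaves A-D, internal nodes X and Y.
labels = {
    "A": "ACGT", "B": "ACT", "C": "AGGT", "D": "AGT",
    "X": "ACGT", "Y": "AGGT",
}
edges = [("A", "X"), ("B", "X"), ("X", "Y"), ("C", "Y"), ("D", "Y")]

total = treelength(edges, labels)
print(total)
```

A simultaneous-estimation method built on this criterion would search over tree shapes and internal-node sequences for a low total; that search is the computationally hard part and is not shown here.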
2 |
Multivariate Analysis of Accident Related Outcomes with Respect to Contemporaneous Correlation and Endogeneity: Application of Simultaneous Estimation Techniques
Kim, Do-Gyeong January 2006
Motor vehicle crashes have become an increasingly serious concern for highway safety engineers and transportation agencies over the past few decades. This concern has prompted a great deal of research, including the development of safety analysis tools, specifically crash prediction models, for the purpose of reducing crashes and enhancing highway safety.
Crash prediction models based on statistical or econometric modeling techniques are used for a variety of purposes, most commonly to estimate the expected crash frequencies for various roadway entities (highways, intersections, interstates, etc.) and to identify geometric, environmental, and operational factors that are associated with crashes. A comprehensive review of prior literature indicates that many researchers have focused mainly on the development of aggregate crash prediction models based on single equation estimation techniques to identify the influences of geometric, environmental, and traffic variables on a single counted outcome. In some cases, however, more than one dependent variable might be of interest, and hence several equations are formulated at the same time. Such a multiple equation structure may require simultaneous (or joint) estimation techniques in some situations.
This dissertation research develops simultaneous estimation approaches to account for contemporaneous correlation and endogeneity problems in crash data. Specifically, seemingly unrelated negative binomial models and simultaneous equation models are developed to account for contemporaneous correlation between the disturbance terms across crash type models and to control for the endogenous relationship between the presence of left-turn lanes and angle crashes.
Modeling crash types may offer insight into 1) the identification of high-risk sites with respect to specific types of crashes, which is not revealed through crash totals, and 2) the differences between conditions that lead to various crash types. However, the disturbance terms across crash types might be contemporaneously correlated due to unobserved common characteristics. Therefore, individual and simultaneous crash type models were estimated and the results of both models were compared. The results showed that a simultaneous estimation approach provides more efficient estimators relative to a single equation estimation technique.
The presence of left-turn lanes has been treated as exogenous in crash prediction models, but in fact left-turn lanes and crashes affect each other. This bi-directional relationship results in endogeneity. This research investigated the endogenous relationship between left-turn lanes and crashes and developed simultaneous equation models to control for the endogeneity. The findings indicated that the presence of left-turn lanes is endogenously associated with crashes and that the real effect of left-turn lanes on crashes can be obtained by controlling for endogeneity.
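The endogeneity problem described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the dissertation's method: it uses a linear model with a simple instrumental-variable estimator in place of the negative binomial and simultaneous equation models named in the abstract, and every variable name, coefficient, and the instrument itself is hypothetical. The point is only that when a treatment (such as a left-turn lane) is placed in response to unobserved risk, a single-equation estimate of its effect is biased, while an estimate that accounts for the endogeneity recovers the assumed effect.

```python
import random


def cov(xs, ys):
    """Population covariance of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


def simulate(n=20000, beta=-1.0, seed=0):
    """Generate data in which the treatment x depends on unobserved risk u.

    Assumed (hypothetical) data-generating process:
        x = z + u + noise          (treatment responds to hidden risk u)
        y = beta * x + 2u + noise  (outcome also depends on u)
    z is an instrument: it shifts x but affects y only through x.
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ui = rng.gauss(0, 1)
        xi = zi + ui + rng.gauss(0, 1)
        yi = beta * xi + 2.0 * ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


true_beta = -1.0
z, x, y = simulate(beta=true_beta)

# Single-equation slope: treats x as exogenous, so it absorbs the effect of u.
ols = cov(x, y) / cov(x, x)

# Instrumental-variable slope: uses only the variation in x driven by z.
iv = cov(z, y) / cov(z, x)

print(f"true effect {true_beta:.2f}, single-equation {ols:.2f}, IV {iv:.2f}")
```

In this setup the single-equation estimate understates the beneficial (negative) effect because sites with high hidden risk are more likely to receive the treatment, which mirrors the bi-directional relationship the abstract describes.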