11 
Nonparametric Bayesian Methods for Extracting Structure from DataMeeds, Edward 01 August 2008 (has links)
One desirable property of machine learning
algorithms is the ability to balance
the number of parameters in a model
in accordance with the amount of available data.
Incorporating nonparametric Bayesian priors into models is
one approach of automatically
adjusting model capacity to the amount of available data: with small
datasets, models are less complex
(require storing fewer parameters in memory), whereas with larger datasets, models
are implicitly more complex
(require storing more parameters in memory).
Thus, nonparametric Bayesian priors satisfy frequentist intuitions
about model complexity within a fully Bayesian framework.
This thesis presents several novel
machine learning models and applications that use
nonparametric Bayesian priors.
We introduce two novel models that use flat,
Dirichlet process priors. The first is an infinite mixture
of experts model, which builds
a fully generative, joint density model of the input and output space.
The second is a Bayesian
biclustering model, which simultaneously
organizes a
data matrix into
blockconstant biclusters.
The model capable of efficiently processing very large, sparse matrices,
enabling cluster analysis on incomplete data matrices.
We introduce binary matrix factorization,
a novel matrix factorization model that, in contrast to
classic factorization methods, such as singular value decomposition,
decomposes a matrix using latent binary matrices.
We describe two nonparametric Bayesian priors
over tree structures. The first is an infinitely exchangeable
generalization of the nested
Chinese restaurant process that generates
datavectors at a single node in the tree.
The second is a novel, finitely exchangeable
prior generates trees by first partitioning data indices into groups
and then by randomly
assigning groups to a tree.
We present two applications of the tree priors: the first
automatically learns probabilistic stickfigure models of motioncapture
data that recover
plausible structure and are robust to missing
marker data.
The second learns hierarchical
allocation models based on the latent Dirichlet allocation
topic model for document corpora,
where nodes in a topictree
are latent ``supertopics", and nodes
in a documenttree are latent
categories.
The thesis concludes
with a summary of contributions, a discussion
of the models and their limitations, and a brief outline
of potential future research
directions.

12 
Genome GraphsMedvedev, Paul 18 February 2011 (has links)
Wholegenome shotgun sequencing is an experimental technique used for obtaining information about a genome’s sequence, whereby it is broken up into many short (possibly overlapping) segments whose sequence is then determined. A longstanding use of sequencing is in genome assembly – the problem of determining the sequence of an unknown genome, which plays a central role for the sequencing of novel species. However, even within the same species, the genomes of two individuals differ, and though these variations are relatively small, they account for the observed variation in phenotypes. A large portion of these are copy number variants (CNVs), or genomic segments which appear a different number of times in different individuals.
The unifying theme of this thesis is the use of genome graphs for both CNV detection and genome assembly problems. Genome graphs, which have already been successfully used for alignment and assembly, capture the structure of a genome even when its sequence is not fully known, as with the case of sequencing data. In this thesis, we extend
their uses in several ways, culminating in a method for CNV detection that is based on a novel genome graph model. First, we demonstrate how the doublestranded nature of
DNA can be efficiently incorporated into genome graphs by using the technique of bidirected network flow. Furthermore, we show how genome graphs can be efficiently used for finding solutions that maximize the likelihood of the data, as opposed to the usual maximum parsimony approach. Finally, we show how genome graphs can be useful for
CNV detection through a novel construction called the donor graph. These extensions are combined into a method for detecting CNVs, which we use on a Yoruban human individual, showing a high degree of accuracy and improvement over previous methods.

13 
DREM: Architectural Support for Deterministic Redundant Execution of Multithreaded ProgramsKvasov, Stanislav 12 February 2010 (has links)
Recently there have been several proposals to use redundant execution
of diverse replicas to defend against attempts to exploit memory corruption vulnerabilities. However, redundant execution relies on the premise that the replicas behave deterministically, so that if inputs are replicated to both replicas, any divergences in their outputs can only be the result of an attack. Unfortunately, this assumption does not hold for multithreaded programs, which are becoming increasingly prevalent  the
nondeterministic interleaving of threads can also cause divergences in the replicas.
This thesis presents a method to eliminate concurrency related nondeterminism between replicas.
We introduce changes to the existing cache coherence hardware used in multicores to support
deterministic redundant execution. We demonstrate that our solution requires moderate hardware changes and shows modest overhead in scientific applications.

14 
Genome GraphsMedvedev, Paul 18 February 2011 (has links)
Wholegenome shotgun sequencing is an experimental technique used for obtaining information about a genome’s sequence, whereby it is broken up into many short (possibly overlapping) segments whose sequence is then determined. A longstanding use of sequencing is in genome assembly – the problem of determining the sequence of an unknown genome, which plays a central role for the sequencing of novel species. However, even within the same species, the genomes of two individuals differ, and though these variations are relatively small, they account for the observed variation in phenotypes. A large portion of these are copy number variants (CNVs), or genomic segments which appear a different number of times in different individuals.
The unifying theme of this thesis is the use of genome graphs for both CNV detection and genome assembly problems. Genome graphs, which have already been successfully used for alignment and assembly, capture the structure of a genome even when its sequence is not fully known, as with the case of sequencing data. In this thesis, we extend
their uses in several ways, culminating in a method for CNV detection that is based on a novel genome graph model. First, we demonstrate how the doublestranded nature of
DNA can be efficiently incorporated into genome graphs by using the technique of bidirected network flow. Furthermore, we show how genome graphs can be efficiently used for finding solutions that maximize the likelihood of the data, as opposed to the usual maximum parsimony approach. Finally, we show how genome graphs can be useful for
CNV detection through a novel construction called the donor graph. These extensions are combined into a method for detecting CNVs, which we use on a Yoruban human individual, showing a high degree of accuracy and improvement over previous methods.

15 
DREM: Architectural Support for Deterministic Redundant Execution of Multithreaded ProgramsKvasov, Stanislav 12 February 2010 (has links)
Recently there have been several proposals to use redundant execution
of diverse replicas to defend against attempts to exploit memory corruption vulnerabilities. However, redundant execution relies on the premise that the replicas behave deterministically, so that if inputs are replicated to both replicas, any divergences in their outputs can only be the result of an attack. Unfortunately, this assumption does not hold for multithreaded programs, which are becoming increasingly prevalent  the
nondeterministic interleaving of threads can also cause divergences in the replicas.
This thesis presents a method to eliminate concurrency related nondeterminism between replicas.
We introduce changes to the existing cache coherence hardware used in multicores to support
deterministic redundant execution. We demonstrate that our solution requires moderate hardware changes and shows modest overhead in scientific applications.

16 
Firstorder Decisiontheoretic Planning in Structured Relational EnvironmentsSanner, Scott 28 July 2008 (has links)
We consider the general framework of firstorder decisiontheoretic planning in structured relational environments. Most traditional solution approaches to these planning problems ground the relational specification w.r.t. a specific domain instantiation and apply a solution approach directly to the resulting ground Markov decision process (MDP). Unfortunately, the space and time complexity of these solution algorithms scale linearly with the domain size in the best case and exponentially in the worst case. An alternate approach to grounding a relational planning problem is to lift it to a firstorder MDP (FOMDP) specification. This FOMDP can then be solved directly, resulting in a domainindependent solution whose space and time complexity either do not scale with domain size or can scale sublinearly in the domain size. However, such generality does not come without its own set of challenges and the first purpose of this thesis is to explore exact and approximate solution techniques for practically solving FOMDPs. The second purpose of this thesis is to extend the FOMDP specification to succinctly capture factored actions and additive rewards while extending the exact and approximate solution techniques to directly exploit this structure. In addition, we provide a proof of correctness of the firstorder symbolic dynamic programming approach w.r.t. its wellstudied ground MDP counterpart.

17 
Neural Representation, Learning and Manipulation of UncertaintyNatarajan, Rama 21 April 2010 (has links)
Uncertainty is inherent in neural processing due to noise in sensation and the sensory transmission processes, the illposed nature of many perceptual tasks, and temporal dynamics of the natural environment, to name a few causes. A wealth of evidence from physiological and behavioral experiments show that these various forms of uncertainty have major effects on perceptual learning and inference. In order to use sensory inputs efficiently to make decisions and guide behavior, neural systems must represent and manipulate information about uncertainty in their computations.
In this thesis, we first consider how spiking neural populations might encode and decode information about continuous dynamic stimulus variables including the uncertainty about them. We explore the efficacy of a complex encoder that is paired with a simple decoder which allows computationally straightforward representation and manipulation of dynamically changing uncertainty. The encoder we present takes the form of a biologically plausible recurrent spiking neural network where the output population recodes its inputs to produce spikes that are independently decodeable. We show that this network can be learned in a supervised manner, by a simple, local learning rule. We also demonstrate that the coding scheme can be applied recursively to carry out meaningful uncertaintysensitive computations such as dynamic cue combination.
Next, we explore the computational principles that underlie nonlinear response characteristics such as perceptual bias and uncertainty observed in audiovisual spatial illusions that involve multisensory interactions with conflicting cues. We examine in detail, the explanatory power of one particular causal model in characterizing the impact of conflicting inputs on perception and behavior. We also attempt to understand from a computational perspective, whether and how different task instructions might modulate the interaction of information from individual (visual and auditory) senses. Our analyses reveal some new properties of the sensory likelihoods and stimulus prior which were thought to be well described by Gaussian functions. Our results conclude that taskspecific expectations can influence perception in ways that relate to a choice of inference strategy.

18 
Generation and Verification of Plans with LoopsHu, Yuxiao 22 August 2012 (has links)
This thesis studies planning problems whose solution plans are programlike structures that contain branches and loops. Such problems are a generalization of classical and conditional planning, and usually involve infinitely many cases to be handled by a single plan. This form of planning is useful in a number of applications, but meanwhile challenging to analyze and solve. As a result, it is drawing increasing interest in the AI community.
In this thesis, I will give a formal definition of planning with loops in the situation calculus framework, and propose a corresponding plan representation in the form of finitestate automata. It turns out that this definition is more general than a previous formalization that uses restricted programming structures for plans.
For the verification of plans with loops, we study a property of planning problems called finite verifiability. Such problems have the property that for any candidate plan, only a finite number of cases need to be checked in order to conclude whether the plan is correct for all the infinitely many cases. I will identify several forms of finitelyverifiable classes of planning problems, including the socalled onedimensional problems where an unknown and unbounded number of objects need independent processing. I will also show that this property is not universal, in that finite verifiability of less restricted problems would mean a solution to the Halting problem or an unresolved mathematical conjecture.
For the generation of plans with loops, I will present a novel nondeterministic algorithm which essentially searches in the space of the AND/OR execution trees of an incremental partial plan on a finite set of example instances of the planning problem. Two different implementations of the algorithm are explored for search efficiency, namely, heuristic search and randomized search with restarts. In both cases, I will show that the resulting planner generates compact plans for a dozen benchmark problems, some of which are not solved by other existing approaches, to the best of our knowledge.
Finally, I will present generalizations and applications of the framework proposed in this thesis that enables the analysis and solution of related planning problems recently proposed in the literature, namely, controller synthesis, service composition and planning programs. Notably, the latter two require possiblynonterminating execution in a dynamic environment to provide services to coming requests. I will show a generic definition and planner whose instantiation accommodates and solves all the three example applications. Interestingly, the instantiations are competitive with, and sometimes even outperform, the original tailored approaches proposed in the literature.

19 
Tree Spanners of Simple GraphsPapoutsakis, Ioannis 09 August 2013 (has links)
A tree $t$spanner $T$ of a simple graph $G$ is a spanning tree of $G$,
such that for every pair of vertices of $G$ their distance in $T$ is at
most $t$ times their distance in $G$, where $t$ is called a stretch
factor of $T$ in $G$. It has been shown that there is a linear time
algorithm to find a tree 2spanner in a graph;
it has also been proved that, for each $t>3$, determining whether a graph
admits a tree $t$spanner is an NPcomplete problem. This thesis studies
tree $t$spanners from both theoretical and algorithmic perspectives.
In particular, it is proved that a nontree graph admits a unique tree
$t$spanner for at
most one value of stretch factor $t$. As a corollary, a nontree
bipartite graph cannot admit a unique tree $t$spanner for any $t$.
But, for each $t$, there are infinitely many nontree graphs that admit
exactly one tree $t$spanner. Furthermore, for each $t$, let U($t$) be
the set of graphs being the union of two tree $t$spanners of a graph.
Although graphs in U(2) do not have cycles of length greater than 4,
graphs in U(3) may contain cycles of arbitrary length. It turns out that
any even cycle is an induced subgraph of a graph in U(3), while no graph in
U(3) contains an induced odd cycle other than a triangle; graphs in U(3)
are shown to be perfect. Also, properties of induced even cycles of graphs
in U(3) are presented. For each $t>3$, though, graphs in U($t$) may
contain induced odd cycles of any length.
Moreover, there is an efficient algorithm to recognize graphs that admit a
tree 3spanner of diameter at most 4, while it is proved that, for
each $t>3$, determining whether a graph admits a tree $t$spanner of
diameter at most $t+1$ is an NPcomplete problem. It is not known if it
is hard to recognize graphs that admit a tree 3spanner of general diameter;
however integer programming is employed to provide certificates of tree
3spanner inadmissibility for a family of graphs.

20 
Customizable Services for Applicationlayer Overlay NetworksZhao, Yu 17 July 2013 (has links)
Applicationlayer overlay networks have emerged as a powerful paradigm for providing network services. While most approaches focus on providing a predefined set of network services, we provide a mechanism for network applications to deploy customizable data delivery services. We present the design, implementation, and evaluation of applicationdefined data delivery services that are executed at overlay nodes by transmitting messages marked with service identifiers. In our approach, a data delivery services is specified as an XML specification that define a finitestate machines that respond to network events, and perform a set of network primitives. We implemented a mechanism to execute these XML specifications in the HyperCast overlay middleware, and have evaluated this mechanism quantitatively on an Emulab testbed. The experiments show that our approach is effective in realizing a variety of data delivery services without incurring unreasonable performance overhead.

Page generated in 0.0576 seconds