1. A factoring approach for probabilistic inference in belief networks
Li, Zhaoyu (8 June 1993)
Reasoning about any realistic domain always involves a degree of uncertainty.
Probabilistic inference in belief networks is one effective way of reasoning under
uncertainty. Efficiency is critical in applying this technique, and many researchers
have been working on this topic. This thesis reports our research in this
area.
This thesis contributes a new framework for probabilistic inference in belief
networks. The previously developed algorithms depend on the topological structure
of a belief network to perform inference efficiently. Those algorithms are constrained
by the way they use topological information and may not work efficiently for some
inference tasks. This thesis explores the essence of probabilistic inference, analyzes
previously developed algorithms, and presents a factoring approach for probabilistic
inference. It proposes that efficient probabilistic inference in belief networks can be
cast as an optimal factoring problem.
The optimal factoring framework provides an alternative perspective on
probabilistic inference and a quantitative measure of efficiency for an algorithm.
Using this framework, this thesis presents an optimal factoring algorithm for poly-tree
networks and for arbitrary belief networks (albeit with exponential worst-case
time complexity for non-poly-tree networks). Since the optimal factoring problem
in general is a hard problem, a heuristic algorithm, called set-factoring, is developed
for multiply-connected belief networks. Set factoring is shown to outperform
previously developed algorithms. We also apply the optimal factoring framework
to the problem of finding an instantiation of all nodes of a belief network which has
the largest probability and present an efficient algorithm for that task.
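The abstract's central idea, that inference cost is governed by the order in which factors are combined, can be illustrated with a small sketch. The code below is not the thesis's optimal factoring algorithm; it is a generic variable-elimination toy on a hypothetical four-node chain network with randomly filled tables, counting multiplications to show that two elimination orderings over the same factors have different costs.

```python
# Illustrative sketch (not the thesis's algorithm): inference as factoring.
# Computing the marginal of D in a chain A -> B -> C -> D amounts to summing
# A, B, C out of the product of the conditional probability tables (factors).
# Where the sums are pushed inside the product determines the operation
# count, which is what an optimal factoring seeks to minimize.
import itertools
import random

random.seed(0)

def random_factor(vars_):
    """A random (unnormalized) table over the given binary variables."""
    return {vals: random.random()
            for vals in itertools.product((0, 1), repeat=len(vars_))}

def multiply(f1, v1, f2, v2):
    """Pointwise product of two factors; returns (table, variable list)."""
    vars_ = list(dict.fromkeys(v1 + v2))  # union, order-preserving
    table = {}
    for vals in itertools.product((0, 1), repeat=len(vars_)):
        a = dict(zip(vars_, vals))
        table[vals] = (f1[tuple(a[v] for v in v1)] *
                       f2[tuple(a[v] for v in v2)])
    return table, vars_

def sum_out(f, vars_, var):
    """Marginalize var out of factor f."""
    keep = [v for v in vars_ if v != var]
    table = {}
    for vals, p in f.items():
        a = dict(zip(vars_, vals))
        key = tuple(a[v] for v in keep)
        table[key] = table.get(key, 0.0) + p
    return table, keep

def eliminate(factors, order):
    """Variable elimination under a given ordering; counts multiplications."""
    factors = list(factors)
    mults = 0
    for var in order:
        # Gather and multiply all factors mentioning var, then sum it out.
        touching = [(f, v) for f, v in factors if var in v]
        factors = [(f, v) for f, v in factors if var not in v]
        f, v = touching[0]
        for g, w in touching[1:]:
            f, v = multiply(f, v, g, w)
            mults += len(f)
        f, v = sum_out(f, v, var)
        factors.append((f, v))
    f, v = factors[0]
    for g, w in factors[1:]:
        f, v = multiply(f, v, g, w)
        mults += len(f)
    return f, v, mults

# Chain A -> B -> C -> D: factors P(A), P(B|A), P(C|B), P(D|C).
chain = [(random_factor(['A']), ['A']),
         (random_factor(['A', 'B']), ['A', 'B']),
         (random_factor(['B', 'C']), ['B', 'C']),
         (random_factor(['C', 'D']), ['C', 'D'])]

_, _, cost_good = eliminate(chain, ['A', 'B', 'C'])  # sums pushed inward
_, _, cost_bad = eliminate(chain, ['C', 'B', 'A'])   # sums left outside
print(cost_good, cost_bad)  # the good ordering needs fewer multiplications
```

On this chain the good ordering touches only pairwise tables, while the bad one builds an intermediate factor over three variables; on multiply-connected networks the gap between orderings grows exponentially, which is why the choice of factoring matters.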
The extensive computation required for probabilistic inference renders any
currently used exact inference algorithm intractable for large belief networks. One
way to push this boundary is to exploit parallel hardware. This thesis also
explores the issue of parallelizing probabilistic inference in belief networks. The
feasibility of parallelizing probabilistic inference is demonstrated analytically and
experimentally. Exponential-time numerical computation can be reduced by a
polynomial-time factoring heuristic. This thesis offers an insight into the effect
of the structure of a belief network on speedup and efficiency.
Graduation date: 1994
2. Partition-based Model Representation Learning
Hsu, Yayun (January 2020)
Modern machine learning draws on both classical statistics and modern computation. On the one hand, the field has become rich and fast-growing; on the other hand, the differing conventions of its different schools have become harder and harder to reconcile over time. Often the problem is not about who is absolutely right or wrong, but about the angle from which one should approach the problem. This motivates a unifying machine learning framework that can hold the different schools under the same umbrella. We propose one such framework and call it ``representation learning''.
Representations are built for the data and are almost identical to statistical models. Philosophically, however, we distinguish them from classical statistical modeling in that (1) representations are interpretable to the scientist, (2) representations convey the pre-existing subjective view that the scientist holds toward his/her data before seeing it (in other words, representations may not align with the true data-generating process), and (3) representations are task-oriented.
To build such a representation, we propose partition-based models. Partition-based models are easy to interpret and useful for uncovering interactions between variables. The major challenge, however, lies in computation, since the number of partitions can grow exponentially with the number of variables. To address this, we need a model/representation selection method over different partition models; we propose the I-Score with the backward dropping algorithm.
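As a rough sketch of how these pieces fit together: the code below is a generic rendering of an influence-score criterion with a greedy backward dropping step, run on a made-up XOR toy dataset. It is not the thesis's implementation, and normalization conventions for the I-Score vary across the literature; the function and variable names here are illustrative only.

```python
# A minimal sketch of partition-based variable selection. A subset S of
# discrete variables partitions the samples into cells; the score rewards
# subsets whose within-cell means of y deviate strongly from the overall
# mean. Backward dropping greedily deletes one variable at a time.
from collections import defaultdict

def i_score(X, y, subset):
    """I = sum_j n_j^2 (ybar_j - ybar)^2 over the cells of the partition
    induced by `subset` (up to a normalizing constant)."""
    ybar = sum(y) / len(y)
    cells = defaultdict(list)
    for row, yi in zip(X, y):
        cells[tuple(row[v] for v in subset)].append(yi)
    return sum(len(c) ** 2 * (sum(c) / len(c) - ybar) ** 2
               for c in cells.values())

def backward_drop(X, y, subset):
    """Greedily drop the variable whose removal maximizes the score;
    return the best-scoring subset seen along the way."""
    subset = list(subset)
    best_subset, best_score = list(subset), i_score(X, y, subset)
    while len(subset) > 1:
        # Score every one-variable deletion and commit to the best one.
        scored = [(i_score(X, y, [v for v in subset if v != u]), u)
                  for u in subset]
        score, drop = max(scored)
        subset.remove(drop)
        if score > best_score:
            best_subset, best_score = list(subset), score
    return best_subset, best_score

# Toy data: y is the XOR of variables 0 and 1; variable 2 is pure noise.
# Neither variable 0 nor 1 is marginally informative, but jointly they
# determine y exactly, which a partition-based score detects.
X = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 4
y = [a ^ b for a, b, _ in X]
subset, score = backward_drop(X, y, [0, 1, 2])
print(subset)  # -> [0, 1]: the noise variable is dropped
```

The XOR example also shows why the search is over subsets rather than single variables: coordinate-wise screening would score every variable near zero, while the joint partition over variables 0 and 1 separates y perfectly.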
In this work, we explore the connection between the I-Score variable selection methodology and other existing methods, and extend the idea to develop other objective functions for other applications. We apply our ideas to three datasets: a genome-wide association study (GWAS), the New York City Vision Zero data, and the MNIST handwritten digit database.
In these applications, we show that the interpretability of the representations can be useful in practice, giving practitioners much more intuition in explaining their results. We also show a novel way to look at causal inference problems from the viewpoint of partition-based models.
We hope this work serves as an initiative for people to start approaching problems from a different angle and to take interpretability into consideration when building a model, so that the model can more easily be used to communicate with people from other fields.