Return to search

Learning with Degree-Based Subgraph Estimation

Networks and their topologies are critical to nearly every aspect of modern life, with social networks governing human interactions and computer networks governing global information-flow. Network behavior is inherently structural, and thus modeling data from networks benefits from explicitly modeling structure. This thesis covers methods for and analysis of machine learning from network data while explicitly modeling one important measure of structure: degree. Central to this work is a procedure for exact maximum likelihood estimation of a distribution over graph structure, where the distribution factorizes into edge-likelihoods for each pair of nodes and degree-likelihoods for each node. This thesis provides a novel method for exact estimation of the maximum likelihood edge structure under the distribution. The algorithm solves the optimization by constructing an augmented graph containing, in addition to the original nodes, auxiliary nodes whose edges encode the degree potentials. The exact solution is then recoverable by finding the maximum weight b-matching on the augmented graph, a well-studied combinatorial optimization. To solve the combinatorial optimization, this thesis focuses in particular on a belief propagation-based approach to finding the optimal b-matching and provides a novel proof of convergence for belief propagation on the loopy graphical model representing the b-matching objective. Additionally, this thesis describes new algorithmic techniques to improve the scalability of the b-matching solver. In addition to various applications of node degree in machine learning, including classification and collaborative filtering, this thesis proposes a learning algorithm for learning the parameters of the distribution from network data consisting of node attributes and network connectivity, using strategies similar to maximum-margin structured prediction. The main methods and results in this thesis represent a deep exploration of exact degree-based estimation for machine learning from network data, and furthermore lead to various extensions and applications of the main idea described within.

Identiferoai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/D8XG9Z41
Date January 2011
CreatorsHuang, Bert
Source SetsColumbia University
LanguageEnglish
Detected LanguageEnglish
TypeTheses

Page generated in 0.0018 seconds