Data-Dependent Analysis of Learning Algorithms

This thesis studies the generalization ability of machine learning algorithms in a statistical setting. It focuses on the data-dependent analysis of the generalization performance of learning algorithms in order to make full use of the potential of the actual training sample from which these algorithms learn.

First, we propose an extension of the standard framework for the derivation of generalization bounds for algorithms taking their hypotheses from random classes of functions. This approach is motivated by the fact that the function produced by a learning algorithm based on a random sample of data depends on this sample and is therefore a random function. It avoids the detour through worst-case uniform bounds taken in the standard approach. We show that the mechanism which allows one to obtain generalization bounds for random classes in our framework is based on a “small complexity” of certain random coordinate projections. We demonstrate how this notion of complexity relates to learnability and how one can exploit geometric properties of these projections in order to derive estimates of rates of convergence and good confidence interval estimates for the expected risk. We then demonstrate the generality of our new approach by presenting a range of examples, among them the algorithm-dependent compression schemes and the data-dependent luckiness frameworks, which fall into our random subclass framework.

Second, we study in more detail generalization bounds for a specific algorithm which is of central importance in learning theory, namely the Empirical Risk Minimization algorithm (ERM). Recent results show that one can significantly improve the high-probability estimates for the convergence rates for empirical minimizers by a direct analysis of the ERM algorithm. These results are based on a new localized notion of complexity of subsets of hypothesis functions with identical expected errors and are therefore dependent on the underlying unknown distribution. We investigate the extent to which one can estimate these high-probability convergence rates in a data-dependent manner. We provide an algorithm which computes a data-dependent upper bound for the expected error of empirical minimizers in terms of the “complexity” of data-dependent local subsets. These subsets are sets of functions whose empirical errors lie in a given range and can be determined based solely on empirical data. We then show that recent direct estimates, which are essentially sharp estimates on the high-probability convergence rate for the ERM algorithm, cannot be recovered universally from empirical data.
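
As a rough illustration of what a "data-dependent local subset" and its complexity might look like in practice, the following sketch selects the hypotheses of a finite class whose empirical error lies in a given range and estimates the empirical Rademacher average of that subset by Monte Carlo sampling. It is not the algorithm of the thesis: the finite class, the 0/1 loss matrix, and the names `empirical_errors`, `local_subset`, and `rademacher_average` are all assumptions made for the example.

```python
import random

def empirical_errors(losses):
    """losses[h][i] is the 0/1 loss of hypothesis h on sample point i (assumed layout)."""
    return [sum(row) / len(row) for row in losses]

def local_subset(losses, lo, hi):
    """Indices of hypotheses whose empirical error lies in [lo, hi]."""
    errs = empirical_errors(losses)
    return [h for h, e in enumerate(errs) if lo <= e <= hi]

def rademacher_average(losses, subset, n_draws=2000, seed=0):
    """Monte Carlo estimate of E_sigma sup_{h in subset} (1/n) sum_i sigma_i * loss_h(x_i).

    Returns 0.0 for an empty subset (a convention chosen for this sketch).
    """
    if not subset:
        return 0.0
    rng = random.Random(seed)
    n = len(losses[0])
    total = 0.0
    for _ in range(n_draws):
        # One draw of independent uniform random signs, one per sample point.
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        total += max(
            sum(s * l for s, l in zip(sigma, losses[h])) / n for h in subset
        )
    return total / n_draws

# Toy loss matrix: 3 hypothetical hypotheses evaluated on 4 sample points.
toy_losses = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
near_minimizers = local_subset(toy_losses, 0.0, 0.3)
complexity = rademacher_average(toy_losses, near_minimizers)
```

Both quantities are computed from the sample alone, which is the sense in which such subsets can be "determined based solely on empirical data"; how a complexity of this kind translates into a bound on the expected error is the subject of the thesis and is not reproduced here.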

http://thesis.anu.edu.au./public/adt-ANU20050901.204523

statistical learning theory

generalization bounds

data-dependent complexity

machine learning algorithms

empirical risk minimization

empirical process theory

concentration inequalities

Rademacher averages

localized complexities

Identifier	oai:union.ndltd.org:ADTP/216795
Date	January 2005
Creators	Philips, Petra Camilla, petra.philips@gmail.com
Publisher	The Australian National University. Research School of Information Sciences and Engineering
Source Sets	Australasian Digital Theses Program
Language	English
Detected Language	English
Rights	http://www.anu.edu.au/legal/copyrit.html, Copyright Petra Camilla Philips
