Recent work has shown that AdaBoost can be viewed as an algorithm that
maximizes the margin on the training data via functional gradient descent. Under
this interpretation, the weight that AdaBoost computes for each generated hypothesis
can be viewed as a step-size parameter in a gradient descent search. Friedman
has suggested that shrinking these step sizes could produce improved generalization
and reduce overfitting. In a series of experiments, he showed that very small
step sizes did indeed reduce overfitting and improve generalization for three variants
of Gradient_Boost, his generic functional gradient descent algorithm. For this
report, we tested whether reduced learning rates can also improve generalization in
AdaBoost. We tested AdaBoost (applied to C4.5 decision trees) with reduced learning
rates on 28 benchmark datasets. The results show that reduced learning rates
provide no statistically significant improvement on these datasets. We conclude that
reduced learning rates cannot be recommended for use with boosted decision trees
on datasets similar to these benchmark datasets.

Graduation date: 2002
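The following is a minimal sketch, not the thesis's experimental code, of the idea the abstract describes: discrete AdaBoost in which each hypothesis weight (the step size alpha_t) is scaled by a shrinkage factor. The function and parameter names (boost_with_shrinkage, nu, n_rounds) are illustrative assumptions, and scikit-learn decision trees stand in for C4.5.

```python
# Sketch of AdaBoost with a reduced learning rate (shrinkage factor nu).
# Assumes binary labels in {-1, +1}; scikit-learn trees stand in for C4.5.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_with_shrinkage(X, y, n_rounds=100, nu=0.1, max_depth=3):
    """Discrete AdaBoost where each step size alpha_t is multiplied by nu."""
    n = len(y)
    w = np.full(n, 1.0 / n)                       # example weights D_t(i)
    hypotheses, alphas = [], []
    for _ in range(n_rounds):
        h = DecisionTreeClassifier(max_depth=max_depth)
        h.fit(X, y, sample_weight=w)
        pred = h.predict(X)
        err = np.sum(w * (pred != y)) / np.sum(w)
        if err >= 0.5 or err == 0.0:              # weak learner failed or is perfect
            break
        alpha = 0.5 * np.log((1.0 - err) / err)   # usual AdaBoost step size
        alpha *= nu                               # shrink the step (reduced learning rate)
        w *= np.exp(-alpha * y * pred)            # upweight misclassified examples
        w /= w.sum()
        hypotheses.append(h)
        alphas.append(alpha)
    return hypotheses, alphas

def predict(hypotheses, alphas, X):
    """Sign of the weighted vote over all hypotheses."""
    score = sum(a * h.predict(X) for h, a in zip(hypotheses, alphas))
    return np.sign(score)
```

Setting nu=1.0 recovers standard AdaBoost; values well below 1 correspond to the very small step sizes Friedman studied for Gradient_Boost.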
Identifier | oai:union.ndltd.org:ORGSU/oai:ir.library.oregonstate.edu:1957/30992 |
Date | 30 November 2001 |
Creators | Forrest, Daniel L. K. |
Contributors | Dietterich, Thomas G. |
Source Sets | Oregon State University |
Language | en_US |
Detected Language | English |
Type | Thesis/Dissertation |