A New Perspective on Classification

The idea of voting multiple decision rules was introduced into statistics by Breiman. He used bootstrap samples to build different decision rules and then aggregated them by majority voting (bagging). In regression, bagging yields improved predictors by reducing the variance (random variation) while leaving the bias (systematic error) unchanged. Breiman introduced the idea of bias and variance for classification to explain how bagging works. However, Friedman showed that in the two-class situation, bias and variance influence the classification error in a very different way than they do in regression.
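
To make the bagging recipe concrete, here is a minimal Python sketch (the dataset, base learner, and all parameters are illustrative and assume scikit-learn; this is not the dissertation's exact setup): decision trees are fit on bootstrap samples of the training set and aggregated by majority vote.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Illustrative two-class data; any binary classification set would do.
    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    rng = np.random.default_rng(0)
    n_rules, n = 25, len(X_train)

    # Bagging: fit one decision rule per bootstrap sample of the training set.
    rules = []
    for _ in range(n_rules):
        idx = rng.integers(0, n, size=n)  # bootstrap: n indices drawn with replacement
        rules.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

    # Aggregate the rules by majority vote (labels are 0/1, n_rules is odd).
    votes = np.stack([r.predict(X_test) for r in rules])
    majority = (votes.mean(axis=0) > 0.5).astype(int)
    print("bagged accuracy:", (majority == y_test).mean())
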
In the first part of the dissertation, we build a theoretical framework for ensemble classifiers. Ensemble classifiers are currently the best off-the-shelf classifiers available, and they are the subject of much current research in classification. Our main theoretical results are two theorems about voting iid (independent and identically distributed) decision rules. The bias consistency theorem guarantees that voting does not change the Bias set, and the convergence theorem gives an explicit rate of convergence. Together, the two theorems explain exactly how ensemble classifiers work. We also introduce the concept of weak consistency, as opposed to the usual strong consistency. A boosting theorem is derived for a distribution-specific situation with iid voting.
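
The theorems themselves are stated and proved in the dissertation; as a toy illustration of both ideas at a single query point, suppose each iid rule is correct there with probability p. The majority vote of k rules then errs with a binomial tail probability that shrinks toward 0 when p > 1/2 and grows toward 1 when p < 1/2, so voting sharpens whichever side of 1/2 the point is on but never moves a point into or out of the Bias set. A short SciPy computation (this two-valued model of p is an assumption for illustration only, not the theorems' setting):

    from scipy.stats import binom

    # Each of k iid rules is correct at the query point with probability p.
    # The majority vote errs when more than k/2 rules err: a Binomial(k, 1-p)
    # tail. It tends to 0 for p > 1/2 and to 1 for p < 1/2 as k grows.
    for p in (0.6, 0.4):
        for k in (1, 11, 101, 1001):  # odd k, so no ties
            err = binom.sf(k // 2, k, 1 - p)  # P(more than k/2 rules wrong)
            print(f"p={p}, k={k:4d}: vote error = {err:.4f}")
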
In the second part of this dissertation, we discuss a special ensemble classifier called PERT. PERT is a voted ensemble of random trees, each of which classifies every training example correctly. PERT is shown to work surprisingly well. We discuss its consistency properties and then compare its behavior to the NN (nearest neighbor) method and to boosted C4.5, both of which also classify every training example correctly. We call such classifiers "oversensitive" methods. We show that one reason PERT works is its "squeezing effect."
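
This abstract does not spell out PERT's tree construction, so the sketch below uses scikit-learn's extremely randomized trees, grown to purity on the full training set (no bootstrap), as a stand-in: like PERT, each randomized tree classifies every training example correctly (assuming no two identical inputs carry different labels), and the trees are aggregated by voting.

    from sklearn.datasets import make_moons
    from sklearn.ensemble import ExtraTreesClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Randomized trees grown to purity on the full training set, then voted:
    # an "oversensitive" ensemble in the abstract's sense.
    ens = ExtraTreesClassifier(n_estimators=100, bootstrap=False, random_state=0)
    ens.fit(X_tr, y_tr)

    # Each individual tree should score 1.0 on the training set.
    print("single-tree training accuracy:", ens.estimators_[0].score(X_tr, y_tr))
    print("ensemble test accuracy:", ens.score(X_te, y_te))
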
In the third part of this dissertation, we design simulation studies to investigate why boosting methods work. The outlier effect of PERT is discussed and compared with boosted and bagged tree methods. We introduce a new criterion (Bayes deviance) that measures the efficiency of a classification method, and we design simulation studies to compare the efficiency of several common classification methods, including NN, PERT, and boosted trees.
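
The Bayes deviance criterion is defined in the dissertation rather than in this abstract, so the sketch below only illustrates the general shape of such a study: draw data from a distribution whose Bayes rule is known (here two unit-variance Gaussians at means -1 and +1 with equal priors, so the Bayes risk is Phi(-1)) and measure each method's excess test error over the Bayes risk. The data model and the choice of methods are assumptions for illustration.

    import numpy as np
    from scipy.stats import norm
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)

    def sample(n):
        # Class 0 ~ N(-1, 1), class 1 ~ N(+1, 1), equal priors;
        # the Bayes rule is sign(x) and the Bayes risk is Phi(-1).
        y = rng.integers(0, 2, n)
        x = rng.normal(2 * y - 1, 1.0).reshape(-1, 1)
        return x, y

    X_tr, y_tr = sample(1000)
    X_te, y_te = sample(20000)
    bayes_risk = norm.cdf(-1)

    for name, clf in [("1-NN", KNeighborsClassifier(n_neighbors=1)),
                      ("boosted stumps", AdaBoostClassifier(n_estimators=100))]:
        err = 1 - clf.fit(X_tr, y_tr).score(X_te, y_te)
        print(f"{name}: excess error over Bayes risk = {err - bayes_risk:.4f}")
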

Identifier: oai:union.ndltd.org:UTAHS/oai:digitalcommons.usu.edu:etd-8219
Date: 01 May 2000
Creators: Zhao, Guohua
Publisher: DigitalCommons@USU
Source Sets: Utah State University
Detected Language: English
Type: text
Format: application/pdf
Source: All Graduate Theses and Dissertations
Rights: Copyright for this work is held by the author. Transmission or reproduction of materials protected by copyright beyond that allowed by fair use requires the written permission of the copyright owners. Works not in the public domain cannot be commercially exploited without permission of the copyright owner. Responsibility for any use rests exclusively with the user. For more information contact digitalcommons@usu.edu.
