Implementation and Experimentation with C4.5 Decision Trees

C4.5 is a decision tree learning algorithm developed by Ross Quinlan as a successor to his earlier ID3 algorithm. It is one of the most popular algorithms for solving classification problems, which arise in a variety of disciplines. C4.5 is a supervised learning algorithm: it uses a set of training patterns to build a decision tree, analyzing the individual attributes of those patterns to partition the pattern data. Its popularity stems from the fact that it handles both continuous and categorical attributes and can deal with missing attribute values, while producing answers that are easy to interpret. This thesis has two objectives. The first is to implement C4.5 in C++ within a generic architecture that allows additional modules to be added. The second is to use this generic architecture to implement an innovative post-induction phase that adjusts splits to minimize the error of the C4.5 tree. The C4.5 code and the post-induction phase are compiled into a MEX DLL for use as functions within MATLAB. Experiments are performed in MATLAB to verify the advantages of this post-induction phase.
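
As an illustration of how a continuous attribute can be used to partition training patterns, the following C++ sketch scores candidate thresholds with the gain ratio criterion associated with C4.5. It is not the thesis code: the names `Sample`, `Split`, `entropy`, and `bestThresholdSplit` are hypothetical, the threshold is taken as the midpoint between adjacent distinct values (a simplification), and missing values, categorical attributes, and the post-induction split adjustment are not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical type: one training pattern reduced to a single
// continuous attribute value and an integer class label.
struct Sample {
    double value;
    int label;
};

// Hypothetical result type: the chosen threshold and its gain ratio.
// `valid` is false when no threshold separates the samples.
struct Split {
    double threshold;
    double gainRatio;
    bool valid;
};

// Shannon entropy (in bits) of the class labels in samples[first, last).
double entropy(const std::vector<Sample>& samples,
               std::size_t first, std::size_t last) {
    assert(first <= last && last <= samples.size());
    const double n = static_cast<double>(last - first);
    if (n == 0.0) return 0.0;

    std::map<int, std::size_t> counts;
    for (std::size_t i = first; i < last; ++i) ++counts[samples[i].label];

    double h = 0.0;
    for (const auto& kv : counts) {
        const double p = static_cast<double>(kv.second) / n;
        h -= p * std::log2(p);
    }
    return h;
}

// Assumption: candidate thresholds are midpoints between adjacent distinct
// values; each is scored by gain ratio = information gain / split information.
Split bestThresholdSplit(std::vector<Sample> samples) {
    std::sort(samples.begin(), samples.end(),
              [](const Sample& a, const Sample& b) { return a.value < b.value; });

    Split best{0.0, 0.0, false};
    const std::size_t n = samples.size();
    if (n < 2) return best;

    const double baseEntropy = entropy(samples, 0, n);
    for (std::size_t i = 1; i < n; ++i) {
        // Equal neighbouring values cannot be separated by a threshold.
        if (samples[i].value == samples[i - 1].value) continue;

        const double pLeft = static_cast<double>(i) / static_cast<double>(n);
        const double pRight = 1.0 - pLeft;
        const double gain = baseEntropy
                            - pLeft * entropy(samples, 0, i)
                            - pRight * entropy(samples, i, n);
        // Split information is positive here because 0 < pLeft < 1.
        const double splitInfo = -pLeft * std::log2(pLeft)
                                 - pRight * std::log2(pRight);
        const double ratio = gain / splitInfo;

        if (!best.valid || ratio > best.gainRatio) {
            best.threshold = 0.5 * (samples[i - 1].value + samples[i].value);
            best.gainRatio = ratio;
            best.valid = true;
        }
    }
    return best;
}
```

For four samples with values 1, 2, 3, 4 and labels 0, 0, 1, 1, calling `bestThresholdSplit` selects the threshold 2.5, which separates the two classes perfectly. A full tree builder would apply such a scoring step to every attribute at every node and recurse on the resulting partitions.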

Identifier: oai:union.ndltd.org:ucf.edu/oai:stars.library.ucf.edu:honorstheses1990-2015-1669
Date: 01 January 2007
Creators: Beck, Jason
Publisher: STARS
Source Sets: University of Central Florida
Language: English
Detected Language: English
Type: text
Source: HIM 1990-2015