Benchmarking Open-Source Tree Learners in R/RWeka

The two most popular classification tree algorithms in machine learning and statistics - C4.5 and CART - are compared in a benchmark experiment together with two other more recent constant-fit tree learners from the statistics literature (QUEST, conditional inference trees). The study assesses both misclassification error and model complexity on bootstrap replications of 18 different benchmark datasets. It is carried out in the R system for statistical computing, made possible by means of the RWeka package, which interfaces R to the open-source machine learning toolbox Weka. C4.5 and CART are both found to be competitive in terms of misclassification error, with the performance difference clearly varying across datasets. However, C4.5 tends to grow larger and thus more complex trees. (author's abstract)

Series: Research Report Series / Department of Statistics and Mathematics

http://epub.wu.ac.at/1496/1/document.pdf

Identifier	oai:union.ndltd.org:VIENNA/oai:epub.wu-wien.ac.at:epub-wu-01_bd8
Date	January 2007
Creators	Schauerhuber, Michael; Zeileis, Achim; Meyer, David; Hornik, Kurt
Publisher	Department of Statistics and Mathematics, WU Vienna University of Economics and Business
Source Sets	Wirtschaftsuniversität Wien
Language	English
Detected Language	English
Type	Paper, NonPeerReviewed
Format	application/pdf
Relation	http://epub.wu.ac.at/1496/
