This work presents an enhancement to the classification tree algorithm that forms the basis for Random Forests. Unlike classical tree-based methods, which separate observations using one variable at a time, the new algorithm searches for the best split in two-dimensional space using a linear combination of variables. Besides classification, the method can be used to identify variable interactions and perform feature extraction. Theoretical investigation and numerical simulations were used to analyze the properties and performance of the new approach. The method was compared with other popular classification methods on simulated and real data examples. The algorithm was implemented as an extension package for the statistical computing environment R and is available for free download under the GNU General Public License.
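To make the idea of a split on a linear combination of two variables concrete, here is a minimal sketch in Python. It is not the thesis's algorithm or its R package: the function names `gini` and `best_oblique_split`, the grid search over angles, and the use of Gini impurity as the split criterion are all assumptions chosen for illustration. For each candidate direction it projects the two variables onto a line and looks for the threshold that gives the lowest weighted impurity.

```python
import math


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_oblique_split(x1, x2, y, n_angles=36):
    """Hypothetical helper: grid-search splits of the form a*x1 + b*x2 <= t.

    Directions are (a, b) = (cos(theta), sin(theta)) for n_angles values of
    theta in [0, pi). Returns (weighted_impurity, theta, threshold).
    """
    n = len(y)
    best = (float("inf"), None, None)
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        a, b = math.cos(theta), math.sin(theta)
        # Project every observation onto the candidate direction and sort.
        proj = sorted(
            ((a * u + b * v, label) for u, v, label in zip(x1, x2, y)),
            key=lambda p: p[0],
        )
        for i in range(1, n):
            if proj[i][0] == proj[i - 1][0]:
                continue  # no threshold can separate tied projections
            left = [label for _, label in proj[:i]]
            right = [label for _, label in proj[i:]]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[0]:
                threshold = (proj[i - 1][0] + proj[i][0]) / 2
                best = (score, theta, threshold)
    return best


# Toy data (made up for this sketch): the classes overlap along x1 alone,
# but a combination of x1 and x2 separates them.
x1 = [0.0, 1.0, 0.5, 1.0, 2.0, 1.5]
x2 = [1.0, 0.0, 0.5, 2.0, 1.0, 1.5]
y = [0, 0, 0, 1, 1, 1]

oblique_impurity, theta, threshold = best_oblique_split(x1, x2, y)
axis_impurity, _, _ = best_oblique_split(x1, x2, y, n_angles=1)  # theta = 0: x1 only
```

On this toy data the search restricted to `x1` cannot reach a pure split, while the search over linear combinations can. The exhaustive scan over angles and thresholds is kept for readability rather than efficiency.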
Identifier | oai:union.ndltd.org:UTAHS/oai:digitalcommons.usu.edu:etd-2540 |
Date | 01 May 2013 |
Creators | Parfionovas, Andrejus |
Publisher | DigitalCommons@USU |
Source Sets | Utah State University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | All Graduate Theses and Dissertations |
Rights | Copyright for this work is held by the author. Transmission or reproduction of materials protected by copyright beyond that allowed by fair use requires the written permission of the copyright owners. Works not in the public domain cannot be commercially exploited without permission of the copyright owner. Responsibility for any use rests exclusively with the user. For more information contact Andrew Wesolek (andrew.wesolek@usu.edu). |