Attribute Interaction Effects in Rule Induction

Rule induction is a popular technique for knowledge acquisition and data mining. Many techniques, such as the tree induction methods ID3, C4.5, and CART, as well as artificial neural networks, have been developed and widely used. However, most of them assess the importance of input variables using either a categorical or a numerical mechanism, which may not produce the optimal rules when the data contain a mixture of variable types.
In 1992, Liang proposed a composite approach called CRIS that uses different methods to analyze different types of data when inducing rules for binary classification. Yang conducted follow-up research to extend the original algorithm to multiple categories. However, neither method takes variable interaction into consideration.
The purpose of this research is to extend the previous approaches by including second-order interaction. We also take into consideration the kurtosis and skewness of the data for numerical variables. For categorical data, we adopt the ID3 algorithm to handle classes with low representation in the sample. To evaluate this technique, we develop a prototype, CRIS 3.0, and compare it with existing techniques, namely multi-category-CRIS, CART, and C4.5, as benchmarks. The results show that CRIS 3.0 has the highest probability of producing the highest prediction accuracy.

Identifier: oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0728108-144440
Date: 28 July 2008
Creators: Yang, Chi-hsien
Contributors: Chih-ping Wei, Ting-peng Liang, Deng-neng Chen
Publisher: NSYSU
Source Sets: NSYSU Electronic Thesis and Dissertation Archive
Language: Cholon
Detected Language: English
Type: text
Format: application/pdf
Source: http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0728108-144440
Rights: campus_withheld, Copyright information available at source archive
