Artificial neural networks (ANNs) are essentially a 'black box' technology. The lack of an explanation component prevents the full exploitation of this form of machine learning. During the mid-1990s the field of 'rule extraction' emerged. Rule extraction techniques attempt to derive a human-comprehensible explanation structure from a trained ANN. Andrews et al. (1995) proposed the following reasons for extending the ANN paradigm to include a rule extraction facility:

* provision of a user explanation capability
* extension of the ANN paradigm to 'safety critical' problem domains
* software verification and debugging of ANN components in software systems
* improving the generalization of ANN solutions
* data exploration and induction of scientific theories
* knowledge acquisition for symbolic AI systems

An allied area of research is that of 'rule refinement'. In rule refinement an initial rule base (i.e. what may be termed 'prior knowledge') is inserted into an ANN by prestructuring some or all of the network architecture, weights, activation functions, learning rates, etc. The rule refinement process then proceeds in the same way as normal rule extraction, viz. (1) train the network on the available data set(s); and (2) extract the 'refined' rules. Very few ANN techniques have the capability to act as a true rule refinement system. Existing techniques, such as KBANN (Towell & Shavlik, 1993), are limited in that the rule base used to initialize the network must be nearly complete and the refinement process is limited to modifying antecedents. These limitations severely restrict their applicability to real-world problem domains. Ideally, a rule refinement technique should be able to deal with incomplete initial rule bases, modify antecedents, remove inaccurate rules, and add new knowledge by generating new rules. The motivation for this research project was to develop such a rule refinement system and to investigate its efficacy when applied to both nearly complete and incomplete problem domains. The premise behind rule refinement is that the refined rules represent the actual domain theory better than the initial domain theory used to initialize the network. The hypotheses tested in this research are that the utilization of prior domain knowledge will:

* speed up network training,
* produce smaller trained networks,
* produce more accurate trained networks, and
* bias the learning phase towards a solution that 'makes sense' in the problem domain.

Geva, Malmstrom & Sitte (1998) described the Local Cluster (LC) Neural Net and showed that the LC network was able to learn and approximate complex functions to a high degree of accuracy. The hidden layer of the LC network is comprised of basis functions (the local cluster units) that are composed of sigmoid-based 'ridge' functions. In the general form of the LC network the ridge functions can be oriented in any direction. We describe RULEX, a technique designed to provide an explanation component for its underlying Local Cluster ANN through the extraction of symbolic rules from the weights of the local cluster units of the trained ANN. RULEX exploits a feature of the Restricted Local Cluster (Geva, Andrews & Geva, 2002), namely its axis-parallel ridge functions, that allows hyper-rectangular rules of the form

IF ∀ 1 ≤ i ≤ n : x_i ∈ [x_i_lower, x_i_upper] THEN pattern belongs to the target class

to be easily extracted from the local functions that comprise the hidden layer of the LC network.
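The correspondence between axis-parallel ridge functions and hyper-rectangular rules can be sketched as follows. This is a minimal illustration only, assuming a local cluster unit modelled as a product of paired opposed sigmoids per input dimension; the names local_cluster_unit, extract_rule, centres, widths and steepness are illustrative placeholders rather than the thesis' actual formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_cluster_unit(x, centres, widths, steepness=50.0):
    """Illustrative axis-parallel local cluster unit.

    For each input dimension i, a pair of opposed sigmoid 'ridges' forms a
    bump around centres[i] with half-width widths[i]; the product over all
    dimensions gives a localised, roughly hyper-rectangular response.
    """
    response = 1.0
    for xi, ci, wi in zip(x, centres, widths):
        rise = sigmoid(steepness * (xi - (ci - wi)))   # rises at the lower edge
        fall = sigmoid(steepness * ((ci + wi) - xi))   # falls at the upper edge
        response *= rise * fall
    return response

def extract_rule(centres, widths):
    """RULEX-style reading of a hyper-rectangular rule from unit parameters:
    IF all x_i in [lower_i, upper_i] THEN pattern belongs to the target class.
    """
    return [(ci - wi, ci + wi) for ci, wi in zip(centres, widths)]

# Example: a 2-D unit centred at (0.5, 0.2) with half-widths (0.1, 0.3).
centres, widths = [0.5, 0.2], [0.1, 0.3]
print(extract_rule(centres, widths))                    # ≈ [(0.4, 0.6), (-0.1, 0.5)]
print(local_cluster_unit([0.5, 0.2], centres, widths))  # high (≈ 1) inside the box
print(local_cluster_unit([0.9, 0.9], centres, widths))  # ≈ 0 outside the box
```

Under these assumptions, reading the per-dimension intervals off a trained unit is exactly the step that turns a local cluster function into a symbolic rule.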
RULEX is tested on 14 applications available in the public domain. Its results are compared with those of a leading machine learning technique, See5, with RULEX generally performing as well as See5 and in some cases outperforming it in predictive accuracy.

We describe RULEIN, a rule refinement technique that allows symbolic rules to be converted into the parameters that define local cluster functions. RULEIN allows existing domain knowledge to be captured in the architecture of an LC ANN, thus facilitating the first phase of the rule refinement paradigm. RULEIN is tested on a variety of artificial and real-world problems. Experimental results indicate that RULEIN satisfies the first requirement of a rule refinement technique by correctly translating a set of symbolic rules into an LC ANN that has the same predictive behaviour as the set of rules from which it was constructed. Experimental results also show that, in cases where a strong domain theory exists, initializing an LC network using RULEIN generally speeds up network training and produces smaller, more accurate trained networks, with the trained network properly representing the underlying domain theory. In cases where only a weak domain theory exists, the same results are not always apparent.

Experiments with the RULEIN / LC / RULEX rule refinement method show that it is able to remove inaccurate rules from the initial knowledge base, modify rules in the initial knowledge base that are only partially correct, and learn new rules not present in the initial knowledge base. The combination of RULEIN / LC / RULEX is thus shown to be an effective rule refinement technique for use with a Restricted Local Cluster network.
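As a companion to the extraction sketch above, a minimal illustration of the opposite direction of the mapping, assuming the same centre/half-width parameterisation, shows how an interval rule could seed a local cluster unit before training; the name rule_to_unit and the representation are assumptions, not RULEIN's actual parameterisation.

```python
def rule_to_unit(intervals):
    """RULEIN-style sketch: map a hyper-rectangular rule, given as per-dimension
    (lower, upper) intervals, onto the centre/half-width parameters of an
    axis-parallel local cluster unit, so prior knowledge can initialise the
    network before training refines it.
    """
    centres = [(lo + hi) / 2.0 for lo, hi in intervals]
    widths = [(hi - lo) / 2.0 for lo, hi in intervals]
    return centres, widths

# Initial rule: IF x1 in [0.4, 0.6] AND x2 in [-0.1, 0.5] THEN target class.
centres, widths = rule_to_unit([(0.4, 0.6), (-0.1, 0.5)])
print(centres, widths)   # ≈ [0.5, 0.2] [0.1, 0.3]
```

Training the LC network then adjusts these centres and half-widths, and a RULEX-style extraction of the resulting intervals yields the refined rule set.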
Identifier | oai:union.ndltd.org:ADTP/264781 |
Date | January 2003 |
Creators | Andrews, Robert |
Publisher | Queensland University of Technology |
Source Sets | Australasian Digital Theses Program |
Detected Language | English |
Rights | Copyright Robert Andrews |