
Cost sensitive meta-learning

Classification is one of the primary tasks of data mining and aims to assign a class label to unseen examples by using a model learned from a training dataset. Most established classifiers are designed to minimize the error rate, but in practice data mining involves costs, such as the cost of acquiring the data and the cost of making an error. Hence the following question arises: among all the available classification algorithms, and given a specific type of data and cost, which is the best algorithm for my problem? It is well known in the machine learning community that no single algorithm performs best in all domains. This observation motivates the development of an “algorithm selector” that automates the process of choosing between different algorithms for a specific application domain. Thus, this research develops a new meta-learning system for recommending cost-sensitive classification methods. The system is based on the idea of applying machine learning to discover knowledge about the performance of different data mining algorithms. It includes components that repeatedly apply different classification methods to data sets and measure their performance. The characteristics of the data sets, combined with the algorithm and its performance, provide the training examples. A decision tree algorithm is applied to the training examples to induce knowledge that can then be used to recommend algorithms for new data sets. Active learning is then used to automate the choice of the most informative data set to enter the learning process. This thesis contributes to the fields of both meta-learning and cost-sensitive learning in that it develops a new meta-learning approach for recommending cost-sensitive methods.
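To make the difference between error rate and misclassification cost concrete, the following minimal Python sketch shows how one meta-level training example might be assembled from data set characteristics and measured performance. It is an illustration only: the confusion matrices, the cost matrix, the meta-feature names, and the algorithm labels `algo_a` and `algo_b` are invented here and are not taken from the thesis, which is implemented in WEKA.

```python
from typing import Dict, List

Matrix = List[List[float]]  # rows: actual class, columns: predicted class


def total_cost(confusion: Matrix, cost: Matrix) -> float:
    """Sum of confusion[i][j] * cost[i][j] over all cells."""
    return sum(
        confusion[i][j] * cost[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )


def error_rate(confusion: Matrix) -> float:
    """Fraction of examples that fall off the diagonal."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


def build_meta_example(
    meta_features: Dict[str, float],
    results: Dict[str, Matrix],
    cost: Matrix,
) -> Dict[str, object]:
    """Label a data set's meta-features with its cheapest algorithm."""
    costs = {name: total_cost(cm, cost) for name, cm in results.items()}
    best = min(costs, key=costs.get)
    example: Dict[str, object] = dict(meta_features)
    example["best_algorithm"] = best
    return example


# Hypothetical inputs (assumptions for illustration, not thesis data).
# Misclassifying class 1 as class 0 is ten times as costly as the reverse.
COST = [[0.0, 1.0], [10.0, 0.0]]
RESULTS = {
    "algo_a": [[90, 0], [8, 2]],   # few errors, but all are expensive ones
    "algo_b": [[75, 15], [1, 9]],  # more errors, mostly cheap ones
}
META_FEATURES = {"n_instances": 100, "minority_fraction": 0.10, "cost_ratio": 10.0}

if __name__ == "__main__":
    for name, cm in RESULTS.items():
        print(name, error_rate(cm), total_cost(cm, COST))
    print(build_meta_example(META_FEATURES, RESULTS, COST))
```

In this made-up case `algo_a` has the lower error rate (0.08 against 0.16) but the higher total cost (80 against 25), so the meta-example is labelled `algo_b`. A collection of such rows, one per data set, is the kind of table a decision tree learner could be trained on.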
Although meta-learning is not new, the task of accelerating the learning process remains an open problem. The thesis therefore develops a novel active learning strategy based on clustering that gives the learner the ability to choose which data to learn from and, accordingly, speeds up the meta-learning process. Both the meta-learning system and the use of active learning are implemented in the WEKA system and evaluated by applying them to different datasets and comparing the results with existing studies in the literature. The results show that the meta-learning system produces better results than METAL, a well-known meta-learning system, and that the use of clustering and active learning has a positive effect on accelerating the meta-learning process, with all tested datasets showing a 75% decrease in the error rate of prediction.
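The abstract does not describe the clustering-based active learning strategy in detail, so the sketch below shows only a generic cluster-based sampling heuristic, not the thesis's method. It groups candidate data sets by their meta-feature vectors with a small k-means routine and selects the data set nearest each centroid as the next one to evaluate. The data set names, the two-dimensional meta-feature vectors, and the function names are all hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points: List[Point]) -> Point:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points: List[Point], k: int, iters: int = 20) -> Tuple[List[Point], List[int]]:
    """Plain k-means with a deterministic start (the first k points)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean(members)
    return centroids, labels


def pick_representatives(pool: Dict[str, Point], k: int) -> List[str]:
    """Return, per cluster, the name of the data set closest to the centroid."""
    names = list(pool)
    points = [pool[n] for n in names]
    centroids, labels = kmeans(points, k)
    chosen = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            best = min(members, key=lambda i: sq_dist(points[i], centroids[c]))
            chosen.append(names[best])
    return chosen


# Hypothetical pool of not-yet-evaluated data sets (assumed meta-features).
POOL = {
    "d1": (0.0, 0.0), "d2": (0.0, 1.0), "d3": (1.0, 0.0),
    "d4": (10.0, 10.0), "d5": (10.0, 11.0), "d6": (11.0, 10.0),
}

if __name__ == "__main__":
    print(pick_representatives(POOL, k=2))
```

The intent of such a heuristic is to spend the expensive step, running every candidate classifier on a data set, on a few representative data sets rather than on all of them. In practice the meta-features would need to be scaled before distances are computed, and a better initialisation than the first k points would usually be preferable.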

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:674954
Date: January 2015
Creators: Shilbayeh, S. A.
Publisher: University of Salford
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://usir.salford.ac.uk/36278/
