An approach to boosting from positive-only data
Mitchell, Andrew, Computer Science & Engineering, Faculty of Engineering, UNSW January 2004
Ensemble techniques have recently been used to enhance the performance of machine learning methods. However, current ensemble techniques for classification require both positive and negative data to produce a result that is both meaningful and useful. Yet negative data is sometimes difficult, expensive or impossible to access. In this thesis a learning framework is described that has a very close relationship to boosting. Within this framework a method is described that bears remarkable similarities to boosting stumps and that does not rely on negative examples. This is surprising, since learning from positive-only data has traditionally been difficult. An empirical methodology is described and deployed for testing positive-only learning systems using commonly available multiclass datasets, so that these learning systems can be compared with each other and with multiclass learning systems. Empirical results show that our positive-only boosting-like method, using stumps as a base learner, learns successfully from positive data only, and in the process does not pay too heavy a price in accuracy compared to learners that have access to both positive and negative data. We also describe methods of using positive-only learners on multiclass learning tasks and vice versa, and empirically demonstrate the superiority of our method of learning in a boosting-like fashion from positive-only data over a traditional multiclass learner converted to learn from positive-only data. Finally, we examine some alternative frameworks, such as when additional unlabelled training examples are given. Some theoretical justifications of the results and methods are also provided.
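The abstract does not spell out the testing methodology, so the following is only a minimal sketch of one plausible reading: each class of a multiclass dataset is treated in turn as the positive class, a learner is trained on that class's examples alone, and the resulting predictor is scored against the full labelled test set. The function names, the toy data and the interval learner are all hypothetical illustrations. In particular, the interval learner is a stand-in and is not the boosting-like method of the thesis.

```python
from collections import defaultdict


def positive_only_tasks(examples, labels):
    """Group a multiclass dataset into one positive-only training set per class."""
    by_class = defaultdict(list)
    for x, y in zip(examples, labels):
        by_class[y].append(x)
    return dict(by_class)


def fit_interval(positives):
    """Toy positive-only learner (an assumption, not the thesis's method):
    accept any value inside the tightest interval covering the positives."""
    low, high = min(positives), max(positives)
    return lambda x: low <= x <= high


def accuracy(accepts, examples, labels, target):
    """Score a one-class predictor against the full multiclass test set."""
    hits = sum(accepts(x) == (y == target) for x, y in zip(examples, labels))
    return hits / len(examples)


# Hypothetical one-feature data with three classes.
train_x = [1.0, 1.5, 2.0, 5.0, 5.5, 9.0, 9.5]
train_y = ["a", "a", "a", "b", "b", "c", "c"]
test_x = [1.2, 5.2, 9.2, 3.0]
test_y = ["a", "b", "c", "a"]

# Train one predictor per class from that class's examples only,
# then evaluate it on test data drawn from every class.
tasks = positive_only_tasks(train_x, train_y)
scores = {
    target: accuracy(fit_interval(positives), test_x, test_y, target)
    for target, positives in tasks.items()
}
```

Negative labels appear only at evaluation time here; the learner never sees them during training, which is the property the abstract emphasises.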