We study diversity in classifier ensembles from a broader perspective than the 0/1 loss function. The main reason is that the bias-variance decomposition of the 0/1 loss function is not unique, so the relationship between ensemble accuracy and diversity remains unclear. In the parallel field of regression ensembles, where the loss function of interest is the mean squared error, this decomposition not only exists, but it has been shown that diversity can be managed via the Negative Correlation (NC) framework. In the field of probabilistic modelling, the expected value of the negative log-likelihood loss function is given by its conditional entropy; this result suggests that interaction information might provide some insight into the trade-off between accuracy and diversity. Our objective is to improve our understanding of classifier diversity by focusing on two different loss functions: the mean squared error and the negative log-likelihood. In a study of mean squared error functions, we reformulate the Tumer & Ghosh model for the classification error as a regression problem, and we show how the NC learning framework can be deployed to manage diversity in classification problems. In an empirical study of classifiers that minimise the negative log-likelihood loss function, we discuss model diversity, as opposed to error diversity, in ensembles of Naive Bayes classifiers. We observe that diversity in low-variance classifiers has to be structurally inferred. We apply interaction information to the problem of monitoring diversity in classifier ensembles. We present empirical evidence that interaction information can capture the trade-off between accuracy and diversity, and that diversity occurs at different levels of interaction between base classifiers. We use the properties of interaction information to build ensembles of structurally diverse averaged Augmented Naive Bayes classifiers.
Our empirical study shows that this novel ensemble approach is computationally more efficient than an accuracy-based approach, without negatively affecting the ensemble classification performance.
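The abstract does not state the mean squared error decomposition it refers to. As a minimal sketch, assuming the standard ambiguity decomposition for a uniformly averaged regression ensemble (squared ensemble error equals the average member error minus the average spread of the members around the ensemble mean), the link between accuracy and diversity can be checked numerically. The function name `ambiguity_decomposition` and the sample values are hypothetical and are not taken from the thesis.

```python
def ambiguity_decomposition(preds, target):
    """Split the squared error of a uniformly averaged ensemble at one point.

    preds  : list of member predictions (floats) for a single input
    target : the true value for that input
    Returns (ensemble_error, average_member_error, ambiguity).
    """
    m = len(preds)
    ensemble = sum(preds) / m                                   # uniform average
    ensemble_err = (ensemble - target) ** 2                     # error of the ensemble
    avg_member_err = sum((f - target) ** 2 for f in preds) / m  # mean individual error
    ambiguity = sum((f - ensemble) ** 2 for f in preds) / m     # diversity term
    return ensemble_err, avg_member_err, ambiguity


# Illustrative values only (assumed, not from the thesis).
member_preds = [0.9, 1.4, 0.4]
ens_err, avg_err, amb = ambiguity_decomposition(member_preds, target=1.0)

# The identity: ensemble error = average member error - ambiguity.
print(abs(ens_err - (avg_err - amb)) < 1e-12)
```

Because the ambiguity term is never negative, the ensemble error cannot exceed the average member error under this decomposition, which is why diversity is a quantity that can be traded against individual accuracy in the squared error setting.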
Identifier | oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:525926 |
Date | January 2010 |
Creators | Zanda, Manuela |
Contributors | Brown, Gavin |
Publisher | University of Manchester |
Source Sets | Ethos UK |
Detected Language | English |
Type | Electronic Thesis or Dissertation |
Source | https://www.research.manchester.ac.uk/portal/en/theses/a-probabilistic-perspective-on-ensemble-diversity(06296f74-806a-42dc-a65f-f7607f67d9f5).html |