The impact of training set size and feature dimensionality on supervised object-based classification : a comparison of three classifiers

Thesis (MSc)--Stellenbosch University, 2012. / ENGLISH ABSTRACT: Supervised classifiers are commonly used in remote sensing to extract land cover information.
They are, however, limited in their ability to cost-effectively produce sufficiently accurate
land cover maps. Various factors affect the accuracy of supervised classifiers. Notably, the
number of available training samples is known to significantly influence classifier
performance, and obtaining a sufficient number of samples is not always practical. The support
vector machine (SVM) performs well with a limited number of training samples, but little
research has been done to evaluate SVM’s performance for geographical object-based image
analysis (GEOBIA). GEOBIA also allows the easy integration of additional features into the
classification process, a factor which may significantly influence classification accuracies. As
such, two experiments were developed and implemented in this research. The first compared
the performances of object-based SVM, maximum likelihood (ML) and nearest neighbour
(NN) classifiers using varying training set sizes. The effect of feature dimensionality on
classifier accuracy was investigated in the second experiment.
A SPOT 5 subscene and a four-class classification scheme were used. For the first
experiment, training set sizes ranging from 4 to 20 samples per land cover class were tested. The
performance of all the classifiers improved significantly as the training set size was increased.
The ML classifier performed poorly when few (<10 per class) training samples were used and
the NN classifier performed poorly compared to SVM throughout the experiment. SVM was
the superior classifier for all training set sizes although ML achieved competitive results for
sets of 12 or more training samples per class. Training sets were kept constant (20 and 10
samples per class) for the second experiment while an increasing number of features (1 to 22)
was included. SVM consistently produced superior classification results. SVM and NN were
not significantly (negatively) affected by an increase in feature dimensionality, but ML’s
ability to perform under conditions of large feature dimensionalities and few training areas
was limited.
Further investigations using a variety of imagery types, classification schemes and additional
features; finding optimal combinations of training set size and number of features; and
determining the effect of specific features should prove valuable in developing more cost-effective
ways to process large volumes of satellite imagery.
KEYWORDS
Supervised classification, land cover, support vector machine, nearest neighbour classification,
maximum likelihood classification, geographic object-based image analysis / AFRIKAANS SUMMARY: Supervised classifiers are regularly applied in remote sensing to extract information about
land cover. Such classifiers, however, have limited ability to produce accurate
land cover maps cost-effectively. Various factors affect the
accuracy of supervised classifiers. In particular, the number of available
training samples is known to have a significant influence on classifier accuracy, and it is not always
practical to obtain sufficient numbers. The support vector machine (SVM) works well with
limited numbers of training samples. Little research has, however, been done to evaluate SVM's performance for
geographic object-based image analysis (GEOBIA). GEOBIA facilitates the
integration of additional features into the classification process, a factor that can
considerably influence classification accuracies. Two experiments were consequently developed and
implemented in this research. The first experiment compared the performances of object-based SVM,
maximum likelihood (ML) and nearest neighbour (NN) classifiers
using different training set sizes. The effect of
feature dimensionality was investigated in the second experiment.
A SPOT 5 subscene and a four-class classification scheme were used. Training set sizes
of 4 to 20 per land cover class were tested in the first experiment. The performance of the
classifiers improved significantly with an increase in training set size.
ML performed poorly when few (<10 per class) training samples were used, and NN performed
poorly throughout in comparison with SVM. SVM performed best for all training set
sizes, although ML was competitive for sets of 12 or more
training samples per class. Training set sizes were kept constant (20 and 10
samples per class) in the second experiment, in which an increasing number of features (1 to 22)
was added. SVM consistently delivered better classification results. SVM and NN were not
significantly (negatively) affected by an increase in feature dimensionality, but ML's
ability to perform under conditions of large feature dimensionalities and few
training areas was limited.
Further investigations with a variety of images, classification schemes and additional
features; finding optimal combinations of training set size and number of features; and
determining the effect of specific features will be valuable in developing
more cost-effective methods for processing large volumes of satellite imagery.
KEYWORDS
Supervised classification, land cover, support vector machine, nearest neighbour classification,
maximum likelihood classification, geographic object-based image analysis

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/71655
Date: 12 1900
Creators: Myburgh, Gerhard
Contributors: Van Niekerk, Adriaan, Stellenbosch University. Faculty of Science. Dept. of Geography and Environmental Studies.
Publisher: Stellenbosch : Stellenbosch University
Source Sets: South African National ETD Portal
Detected Language: English
Type: Thesis
Format: 74 p. : ill., maps
Rights: Stellenbosch University
