
Support vector machine prediction of HIV-1 drug resistance using The Viral Nucleotide patterns

Student Number: 0213068F
MSc Dissertation
School of Computer Science
Faculty of Science

Drug resistance in HIV, driven by its fast replication and error-prone mutation, is a key factor
in the failure to combat the HIV epidemic. For this reason, performing pre-therapy drug
resistance testing and administering appropriate drugs or combinations of drugs accordingly is
very useful. There are two approaches to HIV drug resistance testing: phenotypic (clinical)
and genotypic (based on the particular virus’s DNA). Genotyping tests HIV drug resistance by
detecting specific mutations known to confer drug resistance. It is cheaper and can be computerised.
However, it requires being able to know or learn what mutations confer drug resistance.
Previous research using pattern recognition techniques has been promising, but the performance
needs to be improved. It is also important to have techniques that can quickly learn new rules when
faced with new mutations or drugs.
A relatively recent addition to these techniques is the Support Vector Machine (SVM).
SVMs have proved very successful in many benchmark applications, such as face recognition and
text recognition, and have also performed well in many computational biology problems where
the number of features targeted is large compared to the number of available samples. This
paper explores the use of SVMs in predicting the drug resistance of an HIV strain extracted
from a patient, based on the genetic sequence of those parts of the viral DNA encoding either of
the two enzymes, Reverse Transcriptase or Protease, which are critical for the replication of
HIV. In particular, the aim of this research is to design the model without incorporating
the biological knowledge at hand, so that the resulting classifier can accommodate new drugs and
mutations.
To evaluate the performance of SVMs, we used cross-validation to obtain an
unbiased estimate on 2045 data points. The classification accuracy and the area under the receiver
operating characteristic curve (AUC) were used as performance measures. Furthermore,
to compare the performance of our SVM model, we also developed other prediction models
based on popular classification algorithms, namely neural networks, decision trees and logistic
regression.
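The evaluation measures named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the thesis's code: the number of folds, the fold assignment scheme and all function names are assumptions, and AUC is computed here through its pairwise-ranking interpretation (the probability that a randomly chosen resistant sample is scored above a randomly chosen susceptible one).

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists so that each sample is tested exactly once."""
    # Assumed scheme: sample i goes to fold i % k (no shuffling or stratification).
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


if __name__ == "__main__":
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
    for train, test in k_fold_indices(6, 3):
        print(train, test)
```

In a cross-validated comparison, each candidate model would be trained on the `train` indices of every fold and scored on the matching `test` indices, with accuracy and AUC then averaged across folds.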
The results show that SVMs are highly successful classifiers and outperform the other techniques,
with performance ranging from 94.13% to 96.33% accuracy and from 81.26% to 97.49%
AUC. Decision trees were rated second and logistic regression performed the worst.

Identifier: oai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:wits/oai:wiredspace.wits.ac.za:10539/2104
Date: 23 February 2007
Creators: Araya, Seare Tesfamichael
Source Sets: South African National ETD Portal
Language: English
Detected Language: English
Type: Thesis
Format: 1692909 bytes, application/pdf
