This work explores the use of a Multi-Objective Genetic Algorithm (MOGA) for both, feature selection and cluster count optimization, for an unsupervised machine learning technique, K-Means, applied to encrypted traffic identification (SSH). The performance of the proposed model is benchmarked against other unsupervised learning techniques existing in the literature: Basic K-Means, semi-supervised K-Means, DBSCAN, and EM. Results show that the proposed MOGA, not only outperforms the other models, but also provides a good trade off in terms of detection rate, false positive rate, and time to build and run the model. A hierarchical version of the proposed model is also implemented, to observe the gains, if any, obtained by increasing cluster purity by means of a second layer of clusters. Results show that with the hierarchical MOGA, significant gains are observed in terms of the classification performances of the system.
Identifer | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:NSHD.ca#10222/13016 |
Date | 10 August 2010 |
Creators | Bacquet, Carlos |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Thesis |
Page generated in 0.0014 seconds