Sample entropy and random forests: a methodology for anomaly-based intrusion detection and classification of low-bandwidth malware attacks

Sample Entropy examines changes in the normal distribution of network traffic to identify anomalies. Normalized Information examines the overall probability distribution in a data set. Random Forests is a supervised learning algorithm that is efficient at classifying highly imbalanced data. Anomalies are exceedingly rare compared to the overall volume of network traffic. The combination of these methods enables low-bandwidth anomalies to be identified easily in high-bandwidth network traffic. Using only low-dimensional network information allows for near real-time identification of anomalies. The data set was collected from the 1999 DARPA intrusion detection evaluation data set. The experiments compare a baseline f-score to the observed entropy and normalized information of the network. Anomalies that are disguised in network flow analysis were detected. Random Forests prove to be capable of classifying anomalies using the sample entropy and normalized information. Our experiment divided the data set into five-minute time slices and found that sample entropy and normalized information metrics were successful in classifying bad traffic, with a recall of 0.99 and an f-score of 0.50, which was 185% better than our baseline.
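The per-slice features described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes that sample entropy means the Shannon entropy of the empirical distribution of one traffic feature (for example, destination port) within a time slice, and that normalized information is that entropy divided by its maximum, log2 of the number of distinct values. The function names and the (timestamp, value) record layout are hypothetical.

```python
import math
from collections import Counter, defaultdict

SLICE_SECONDS = 300  # five-minute time slices, as stated in the abstract


def entropy_bits(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    # p * log2(1/p) summed over observed values; an empty input yields 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def normalized_information(values):
    """Entropy divided by log2(k), k = number of distinct values.

    This definition is an assumption made for illustration.
    """
    k = len(set(values))
    if k <= 1:
        return 0.0  # a single distinct value carries no uncertainty
    return entropy_bits(values) / math.log2(k)


def slice_features(records):
    """Group (timestamp_seconds, feature_value) pairs into time slices.

    Returns {slice_index: (entropy, normalized_information)}.
    """
    slices = defaultdict(list)
    for ts, value in records:
        slices[int(ts // SLICE_SECONDS)].append(value)
    return {
        i: (entropy_bits(v), normalized_information(v))
        for i, v in sorted(slices.items())
    }


if __name__ == "__main__":
    # Hypothetical records: (seconds since capture start, destination port)
    demo = [(0, 80), (10, 443), (299, 80), (299, 443), (300, 22)]
    for index, (h, ni) in slice_features(demo).items():
        print(index, round(h, 3), round(ni, 3))
```

Feature rows produced this way, one per time slice, could then be labeled and passed to a supervised classifier such as a random forest; that step is omitted here because the abstract does not specify the features or parameters used.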

Identifier: oai:union.ndltd.org:nps.edu/oai:calhoun.nps.edu:10945/2633
Date: 09 1900
Creators: Hyla, Bret M.
Contributors: Martell, Craig, Squire, Kevin, Naval Postgraduate School (U.S.)., Computer Science
Publisher: Monterey, California. Naval Postgraduate School
Source Sets: Naval Postgraduate School
Detected Language: English
Type: Thesis
Format: xvi, 62 p.; application/pdf
Rights: Approved for public release, distribution unlimited