Sample entropy examines changes in the distribution of network traffic to identify anomalies, while normalized information measures the overall probability distribution of a data set. Random Forests is a supervised learning algorithm that is efficient at classifying highly imbalanced data; anomalies are exceedingly rare compared to the overall volume of network traffic. Combining these methods enables low-bandwidth anomalies to be identified within high-bandwidth network traffic, and using only low-dimensional network information allows near-real-time identification of anomalies. The data were drawn from the 1999 DARPA intrusion detection evaluation data set. The experiments compare a baseline f-score against the observed entropy and normalized information of the network, and they detected anomalies that remain disguised under network flow analysis. Random Forests proved capable of classifying anomalies using the sample entropy and normalized information features. Our experiment divided the data set into five-minute time slices and found that the sample entropy and normalized information metrics successfully classified bad traffic with a recall of 0.99 and an f-score of 0.50, which was 185% better than our baseline.
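As a minimal sketch of the two metrics named in the abstract: sample entropy here can be read as the Shannon entropy of the empirical distribution of some traffic feature (for example, destination ports observed in a five-minute slice), and normalized information as that entropy divided by its maximum possible value. The choice of destination ports as the feature is an illustrative assumption, not taken from the thesis itself.

```python
import math
from collections import Counter

def sample_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items,
    e.g. destination ports seen in one five-minute traffic slice."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def normalized_information(items):
    """Entropy divided by its maximum, log2 of the number of distinct
    values, giving a scale-free score in [0, 1]."""
    n_distinct = len(set(items))
    if n_distinct <= 1:
        return 0.0  # a constant stream carries no information
    return sample_entropy(items) / math.log2(n_distinct)

# Example: a uniform spread over four ports is maximally "surprising"
ports = ["80", "443", "22", "53"]
print(sample_entropy(ports), normalized_information(ports))
```

These per-slice scores could then serve as low-dimensional features for a classifier such as a Random Forest, in the spirit of the approach the abstract describes.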
Identifier | oai:union.ndltd.org:nps.edu/oai:calhoun.nps.edu:10945/2633 |
Date | 09 1900 |
Creators | Hyla, Bret M. |
Contributors | Martell, Craig, Squire, Kevin, Naval Postgraduate School (U.S.)., Computer Science |
Publisher | Monterey, California. Naval Postgraduate School |
Source Sets | Naval Postgraduate School |
Detected Language | English |
Type | Thesis |
Format | xvi, 62 p.; application/pdf |
Rights | Approved for public release, distribution unlimited |