The United States increasingly relies on cyber-physical systems to conduct military and commercial
operations. Attacks on these systems have increased dramatically around the globe. Attackers constantly
change their methods, making state-of-the-art commercial and military intrusion detection systems ineffective.
In this paper, we present a model that identifies the functional behavior of network devices from NetFlow traces. Our
model includes two innovations. First, we define novel features for a host IP by detecting application
graph patterns in the IP’s host graph, which is constructed from packet flows aggregated over 5-minute intervals. Second, we present the
first application, to the best of our knowledge, of Graph Semi-Supervised Learning (GSSL) to the space of IP
behavior classification. Using a cyber-attack dataset collected from NetFlow packet traces, we show that
GSSL trained with only 20% of the data achieves higher attack detection rates than Support Vector Machines
(SVM) and Naïve Bayes (NB) classifiers trained with 80% of the data. We also show how to improve
detection quality by filtering out web browsing data, and we conclude with a discussion of future research
directions.