KEYWORDS: Artificial intelligence, Data modeling, Decision making, Risk assessment, Data fusion, Systems modeling, Network security, Sensors, Safety, Information fusion
Many techniques have been developed for sensor and information fusion, machine and deep learning, as well as data and machine analytics. Currently, many groups are exploring methods for human-machine teaming using saliency and heat maps, explainable and interpretable artificial intelligence, as well as user-defined interfaces. However, there is still a need for standard metrics for test and evaluation of systems utilizing artificial intelligence (AI), such as deep learning (DL), to support the AI principles. In this paper, we explore the elements associated with the opportunities and challenges emerging from designing, testing, and evaluating such future systems. The paper highlights the MAST (multi-attribute scorecard table), and more specifically the MAST criterion "analysis of alternatives," by measuring the risk associated with an evidential DL-based decision. The concept of risk includes the probability of a decision as well as the severity of the choice; from this follows a need for an uncertainty bound on the decision choice, which the paper postulates as a risk bound. Notional analysis for a cyber networked system is presented to guide an interactive process for test and evaluation to support the certification of AI systems as to the decision risk for a human-machine system that includes analysis from both the DL method and a user.
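As a rough illustration of the risk concept described above, the following Python sketch computes a per-class risk and a risk bound from evidential outputs. It assumes the common evidential DL formulation in which non-negative evidence e_k gives Dirichlet parameters alpha_k = e_k + 1, belief b_k = e_k / S, and uncertainty mass u = K / S, where S is the sum of the alpha_k. The function name `evidential_risk`, the severity values, and the choice to bound risk by [b_k * severity, (b_k + u) * severity] are assumptions made for illustration, not the paper's definition.

```python
def evidential_risk(evidence, severity):
    """Return (uncertainty, rows), where each row is
    (expected_risk, lower_bound, upper_bound) for one decision class.

    evidence: non-negative evidence per class (hypothetical network output).
    severity: cost of choosing each class, on any consistent scale (assumed).
    """
    if len(evidence) != len(severity) or not evidence:
        raise ValueError("evidence and severity must be non-empty and equal length")
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")

    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]      # Dirichlet parameters
    strength = sum(alpha)                    # Dirichlet strength S
    uncertainty = num_classes / strength     # uncertainty mass u = K / S

    rows = []
    for e, a, s in zip(evidence, alpha, severity):
        belief = e / strength                # belief mass b_k
        prob = a / strength                  # expected class probability
        # Assumed bound: probability lies between b_k and b_k + u.
        rows.append((prob * s, belief * s, (belief + uncertainty) * s))
    return uncertainty, rows


if __name__ == "__main__":
    u, rows = evidential_risk([8.0, 1.0, 0.0], [0.2, 0.5, 1.0])
    print("uncertainty mass:", round(u, 3))
    for k, (risk, low, high) in enumerate(rows):
        print(f"class {k}: risk={risk:.3f} bound=[{low:.3f}, {high:.3f}]")
```

In this sketch, low total evidence widens every bound, which is one way a user could see when the DL decision is too uncertain to act on without further analysis.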
Machine learning (ML) requires both quantity and variety of examples in order to learn generalizable patterns. In cybersecurity, labeling network packets is a tedious and difficult task. This leads to insufficient labeled datasets of network packets for training ML-based Network Intrusion Detection Systems (NIDS) to detect malicious intrusions. Furthermore, benign network traffic and malicious cyber attacks are always evolving and changing, meaning that the existing datasets quickly become obsolete. We investigate generative ML modeling for network packet synthetic data generation/augmentation to improve NIDS detection of novel, but similar, cyber attacks by generating well-labeled synthetic network traffic. We develop a Cyber Creative Generative Adversarial Network (CCGAN), inspired by previous generative modeling to create new art styles from existing art images, trained on existing NIDS datasets in order to generate new synthetic network packets. The goal is to create network packet payloads that appear malicious but come from different distributions than the original cyber attack classes. We use these new synthetic malicious payloads to augment the training of an ML-based NIDS to evaluate whether it is better at correctly identifying whole classes of real malicious packet payloads that were held out during classifier training. Results show that data augmentation from CCGAN can increase a NIDS's baseline accuracy on a novel malicious class from 79% to 97% with a minimal degradation in accuracy on benign classes (98.9% to 98.7%).
Detecting malicious activity using a network intrusion detection system (NIDS) is an ongoing battle for the cyber defender. Increasingly, cyber-attacks are sophisticated and occur rapidly, necessitating the use of machine/deep learning (ML/DL) techniques for network intrusion detection. Traditional ML/DL techniques for NIDS classifiers, however, are often unable to sufficiently find context-driven similarities between the various network flows and/or packet captures. In this work, we leverage graph representation learning (GRL) techniques to successfully detect adversarial intrusions by exploiting the graph structure of NIDS data to derive context awareness, as graphs are a universal language for describing entities and their relationships. We explore several methods for NIDS data graph representation at both the network flow and packet level utilizing the CIC-IDS2017 dataset. We leverage graph neural networks and graph embedding algorithms to create a context-aware network intrusion detection system. Results indicate that adding context derived from GRL improves performance for detecting attacks. Our highest-scoring classifier incorporated both GNN embeddings and flow-level features and achieved an accuracy of 99.9%. Adding GRL methods to augment the flow/packet features improved accuracy by as much as 52.41%.
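One simple way to picture how graph structure can add context to flow records is sketched below. Hosts become nodes, flows become edges, and each flow's feature list is extended with the number of distinct peers of its source and of its destination. Node degree is used here only as a stand-in for the learned GNN or graph embeddings mentioned in the abstract; the record keys `src`, `dst`, and `features` and the function `add_graph_context` are hypothetical.

```python
from collections import defaultdict


def add_graph_context(flows):
    """Append source and destination node degree to each flow's features.

    flows: list of dicts with hypothetical keys "src", "dst", "features".
    Returns a new list of feature lists; the input records are not modified.
    """
    # Build an undirected host graph: one node per host, one edge per
    # communicating pair of hosts.
    neighbors = defaultdict(set)
    for flow in flows:
        neighbors[flow["src"]].add(flow["dst"])
        neighbors[flow["dst"]].add(flow["src"])

    # Augment the per-flow features with context from the graph.
    augmented = []
    for flow in flows:
        context = [len(neighbors[flow["src"]]), len(neighbors[flow["dst"]])]
        augmented.append(list(flow["features"]) + context)
    return augmented


if __name__ == "__main__":
    example = [
        {"src": "10.0.0.1", "dst": "10.0.0.2", "features": [0.4, 12.0]},
        {"src": "10.0.0.1", "dst": "10.0.0.3", "features": [0.1, 3.0]},
    ]
    for row in add_graph_context(example):
        print(row)
```

A host that contacts many distinct peers, as in a scan, gets a high degree, which is context that a classifier looking at one flow in isolation would not see.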
Traditional machine learning (ML) models used for enterprise network intrusion detection systems (NIDS) typically rely on vast amounts of centralized data with expertly engineered features. Previous work, however, has shown the feasibility of using deep learning (DL) to detect malicious activity on raw network traffic payloads rather than engineered features at the edge, which is necessary for tactical military environments. In the future Internet of Battlefield Things (IoBT), the military will find itself in multiple environments with disconnected networks spread across the battlefield. These resource-constrained, data-limited networks require distributed and collaborative ML/DL models for inference that are continually trained both locally, using data from each separate tactical edge network, and then globally in order to learn and detect malicious activity represented across the multiple networks in a collaborative fashion. Federated Learning (FL), a collaborative paradigm that updates and distributes a global model through local model weight aggregation, provides a solution to train ML/DL models in NIDS utilizing learning from multiple edge devices from the disparate networks without the sharing of raw data. We develop and experiment with a data-efficient FL framework for IoBT settings for intrusion detection using only raw network traffic in restricted, resource-limited environments. Our results indicate that regardless of the DL model architecture used on edge devices, the Federated Averaging FL algorithm achieved over 93% accuracy in detecting malicious payloads after only five episodes of FL training.
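The weight-aggregation step at the core of Federated Averaging can be sketched in a few lines of Python. This is a minimal illustration, assuming each edge device reports its model as a flat list of floats together with its local sample count; the function name `federated_average` and the flat-list representation are assumptions for illustration and do not describe the framework used in the paper.

```python
def federated_average(client_weights, client_sizes):
    """Combine client models into a global model, weighting each client
    by the number of local training samples it used.

    client_weights: list of flat weight lists, one per edge device.
    client_sizes: number of local samples behind each weight list.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    num_params = len(client_weights[0])
    global_weights = [0.0] * num_params
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != num_params:
            raise ValueError("all client models must have the same shape")
        share = size / total                 # this client's contribution
        for i, w in enumerate(weights):
            global_weights[i] += share * w
    return global_weights


if __name__ == "__main__":
    # Two hypothetical edge networks; only weights leave each device,
    # never the raw traffic.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In one round of training, each device would train locally, send its weights and sample count, and receive the averaged model back; repeating this is what the abstract refers to as episodes of FL training.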
In this work, we aim to develop novel cybersecurity playbooks by exploiting dynamic reinforcement learning (RL) methods to close holes in the attack surface left open by the traditional signature-based approach to Defensive Cyber Operations (DCO). A useful first proof-of-concept is provided by the problem of training a scanning defense agent using RL; as a first line of defense, it is important to protect sensitive networks from network mapping tools. To address this challenge, we developed a hierarchical, Monte Carlo-based RL framework for the training of an autonomous agent which detects and reports the presence of Nmap scans in near real-time, efficiently and with near-perfect accuracy. Our algorithm is powered by a reduction of the state space given by a transformer, CLAPBAC, an anomaly detection tool which applies natural language processing to cybersecurity in a manner consistent with the state of the art. In a realistic scenario emulated in CyberVAN, our approach generates optimized playbooks for effective defense against malicious insiders inappropriately probing sensitive networks.
As methods and access to gene synthesis and genetic engineering have become more advanced, the fear that malicious viruses and bacteria will be designed with the express intention of causing harm to humans has received increased attention. In the event that such biological weapons are deployed, the security community needs tools to rapidly recognize the threat and identify responsible parties. Therefore, a key question is whether or not a biological threat is manmade. Currently, experts are capable of qualitatively assessing whether specific genetic sequences are natural or man-made, but few objective criteria exist for characterizing the degree to which a sequence has been engineered. Additionally, progress has recently been made on the task of attributing an engineered gene sequence to a lab-of-origin using machine learning. However, the task of analyzing naturally occurring genetic sequences so as to automatically detect outliers that may have been genetically engineered has received comparatively little attention. This work proposes a method for generating a dataset of natural and engineered sequences that can be used as an input for training machine learning classifiers to perform automatic detection of human engineering in gene sequence data.
Increasingly, cyber-attacks are sophisticated and occur rapidly, necessitating the use of machine learning techniques for detection at machine speed. However, the use of machine learning techniques in cybersecurity requires the extraction of features from the raw network traffic. Thus, subject matter expertise is essential to analyze the network traffic and extract optimum features to detect a cyber-attack.
Consequently, we propose a novel machine learning algorithm for malicious network traffic detection using only the bytes of the raw network traffic. The feature vector in our machine learning method is a structure containing the headers and a variable number of payload bytes. We propose a 1D-Convolutional Neural Network (1D-CNN) and Feed Forward Network for detection of malicious packets using raw network bytes.
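To make the feature-vector idea concrete, the sketch below turns one packet's raw bytes into a fixed-length numeric vector of the kind a 1D-CNN or feed-forward network could consume. The header length, the number of payload bytes kept, the zero padding, and the scaling to [0, 1] are all assumptions chosen for illustration, as is the function name `packet_to_features`; the abstract does not specify these details.

```python
def packet_to_features(header, payload, header_len=40, payload_len=8):
    """Build a fixed-length feature vector from raw packet bytes.

    header, payload: bytes objects taken from one packet.
    header_len, payload_len: hypothetical sizes; payload_len is the
    tunable "number of payload bytes" included in the vector.
    """
    def fit(data, length):
        # Truncate to `length` bytes, then right-pad with zero bytes.
        clipped = data[:length]
        return clipped + bytes(length - len(clipped))

    raw = fit(header, header_len) + fit(payload, payload_len)
    # Scale each byte to [0, 1] so all inputs share one numeric range.
    return [b / 255.0 for b in raw]


if __name__ == "__main__":
    vector = packet_to_features(b"\x45\x00\x00\x28", b"GET /", payload_len=16)
    print(len(vector), vector[:4])
```

Because no field is parsed or hand-selected, the same function applies to any protocol, which is the sense in which the approach avoids expert feature engineering.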
In order to scale for speed, technology often builds upon the earliest proven systems and architectures. As the context changes from a civilian application domain to a military application domain, the priority of functional requirements can, and often does, change. The hardware, software, and language development environment set the foundation for the constraints and potential of a system. This, along with the fact that the information technology revolution since early 2000 has been driven primarily by the commercial sector, requires engineers to consider whether nontraditional, less well-known architectures may have a role in the Multi-Domain Operations (MDO) application space. This paper will highlight features inherent to traditional architectures, the challenges associated with these architectural features, and how the Erlang VM represents an opportunity to develop an architectural foundation suitable to the MDO application domain. Finally, this paper will highlight a future technology concept integrating demonstrated neural interface technology with an Erlang VM-supported architecture. This foundation will help enable human-machine teaming by empowering a human agent to interact with sensors and AI-enabled autonomous systems through a dynamic user interface, allowing the human agent to accomplish MDO applications. The great potential of the concept depends on a fault-tolerant, distributed system, permitted by the Erlang VM, to flexibly integrate the capabilities required to address the diverse challenges of a complex operating environment.