SAC-AP: Soft Actor Critic based Deep Reinforcement Learning for Alert Prioritization

Lalitha Chavali; Tanay Gupta; Paresh Saxena

doi:10.1109/cec55065.2022.9870423

JOURNAL ARTICLE

Year: 2022
Journal: 2022 IEEE Congress on Evolutionary Computation (CEC)
Pages: 1-8

Abstract

Intrusion detection systems (IDS) generate a large number of false alerts, which makes it difficult to identify and inspect the true positives. Alert prioritization therefore plays a crucial role in deciding which of the enormous number of IDS-generated alerts to investigate. Recently, a deep reinforcement learning (DRL) approach based on the off-policy deep deterministic policy gradient (DDPG) method has been shown to achieve better results for alert prioritization than other state-of-the-art methods. However, DDPG is prone to overfitting. It also has poor exploration capability, which makes it unsuitable for problems with a stochastic environment. To address these limitations, we present SAC-AP, a soft actor-critic based DRL algorithm for alert prioritization. SAC-AP is an off-policy method based on the maximum entropy reinforcement learning framework, which aims to maximize the expected reward while also maximizing the entropy of the policy. Further, the interaction between an adversary and a defender is modeled as a zero-sum game, and a double oracle framework is used to obtain the approximate mixed strategy Nash equilibrium (MSNE). SAC-AP finds robust alert investigation policies and computes a pure strategy best response against the opponent's mixed strategy. We present the overall design of SAC-AP and evaluate its performance against other state-of-the-art alert prioritization methods. The performance metric is the defender's loss, i.e., the defender's inability to investigate the alerts that are triggered by attacks. Our results show that SAC-AP achieves up to a 30% decrease in the defender's loss compared to the DDPG-based alert prioritization method and hence provides better protection against intrusions. Moreover, the benefits are even higher when SAC-AP is compared to traditional alert prioritization methods, including Uniform, GAIN, RIO, and Suricata.
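The maximum entropy objective mentioned in the abstract can be illustrated with a toy, single-step example. The sketch below is not the SAC-AP implementation: it assumes a discrete choice among three hypothetical alert types, made-up reward values, and a temperature parameter named `alpha`, none of which come from the paper. It only shows how adding an entropy bonus to the expected reward favors a stochastic policy over a purely greedy one.

```python
import math

def entropy(policy):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in policy if p > 0.0)

def soft_objective(policy, rewards, alpha):
    """Expected one-step reward plus alpha times the policy entropy."""
    expected_reward = sum(p * r for p, r in zip(policy, rewards))
    return expected_reward + alpha * entropy(policy)

def soft_optimal_policy(rewards, alpha):
    """Single-step maximizer of soft_objective: softmax(rewards / alpha)."""
    m = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - m) / alpha) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical rewards for investigating three alert types (assumed values).
rewards = [1.0, 0.9, 0.2]
alpha = 0.5  # assumed temperature; larger values weight entropy more heavily

greedy = [1.0, 0.0, 0.0]                    # always pick the best-looking alert type
soft = soft_optimal_policy(rewards, alpha)  # spreads probability across alert types

print("greedy objective:", soft_objective(greedy, rewards, alpha))
print("soft objective:  ", soft_objective(soft, rewards, alpha))
```

In this toy setting the greedy policy has zero entropy, so its objective equals its expected reward, while the softmax policy trades a little expected reward for a larger entropy term. As `alpha` approaches zero, the softmax policy concentrates on the highest-reward action and the two objectives coincide.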

Keywords:

Reinforcement learning; Computer science; Oracle; Prioritization; Intrusion detection system; Artificial intelligence; Markov decision process; Machine learning; Overfitting; Nash equilibrium; Entropy (arrow of time); False positive paradox; Mathematical optimization; Markov process; Artificial neural network; Engineering; Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 2.74

Citation Normalized Percentile: 0.91

Topics

Network Security and Intrusion Detection

Physical Sciences → Computer Science → Computer Networks and Communications

Advanced Malware Detection Techniques

Physical Sciences → Computer Science → Signal Processing

Smart Grid Security and Resilience

Physical Sciences → Engineering → Control and Systems Engineering

Related Documents

SAC-ABR: Soft Actor-Critic based deep reinforcement learning for Adaptive BitRate streaming

SAC-FACT: Soft Actor-Critic Reinforcement Learning for Counterfactual Explanations

Averaged Soft Actor‐Critic for Deep Reinforcement Learning

Soft Actor-Critic Deep Reinforcement Learning Based Interference Resource Allocation

Off-policy actor-critic deep reinforcement learning methods for alert prioritization in intrusion detection systems