Existing deep learning models for log anomaly detection assume that user logs are collected on a central server, which exposes the data collection process to the risk of leaking sensitive information. In addition, uploading large volumes of raw log data consumes substantial bandwidth. We propose a federated learning framework for multi-domain environments in which participating nodes hold datasets obtained from different log domains. On the server side, an embedding transformation method learns a cross-domain embedding transformation model that distills the relationship between user embeddings across domains. Specifically, we present a Privacy-Preserving Federated Cross-Domain Anomaly Detection (CD-FAD) technique that uses a relatively information-rich source domain to boost detection performance in the data-sparse target domain, and that comprehensively analyzes all aspects of log messages, including health logs, to identify abnormalities arising from unusual parameter patterns. Extensive experiments on real-world logs show that the proposed solution adequately preserves user privacy while achieving performance comparable to that of existing detection systems.
Shiyao Ma, Jiangtian Nie, Jiawen Kang, Lingjuan Lyu, Ryan Wen Liu, Ruihui Zhao, Ziyao Liu, Dusit Niyato
Mengwei Yang, Shuqi Liu, Jie Xu, Guozhen Tan, Congduan Li, Linqi Song
Shenglin Zhang, Ting Xu, Jun Zhu, Yongqian Sun, Pengxiang Jin, Binpeng Shi, Dan Pei
Xiaojun ZHANG, Xingpeng LI, Wei TANG, Yunpu HAO, Jingting XUE