In this presentation, I will highlight the value of attention-based transformer models for analyzing particle clouds. Specifically, I will discuss a multi-modal transformer model that combines self-attention and cross-attention mechanisms to analyze inputs at different scales: the fine-grained local substructure of jets as well as the high-level reconstructed kinematics. I will also present interpretation techniques, such as attention maps and Grad-CAM, that provide insight into the network's outputs. The network architecture is based on arXiv:2401.00452, and the public code is available at https://github.com/AHamamd150/Multi-Scale-Transformer-encoder.
A. Hammad, Stefano Moretti, Mihoko M. Nojiri
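To make the attention mechanisms concrete, here is a minimal NumPy sketch of scaled dot-product cross-attention, where queries come from one modality (e.g. high-level kinematic tokens) and keys/values from the other (e.g. particle-cloud tokens). The function names and tensor shapes are illustrative assumptions for this sketch, not taken from the linked repository; the actual implementation there uses full multi-head transformer layers.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d_k):
    """Single-head scaled dot-product cross-attention (illustrative sketch).

    queries:     (n_q, d_k)  tokens of one modality (e.g. kinematics)
    keys_values: (n_kv, d_k) tokens of the other (e.g. particle cloud)
    Returns the attended output and the attention weights; the weight
    matrix is what an attention-map visualization would display.
    """
    scores = queries @ keys_values.T / np.sqrt(d_k)   # (n_q, n_kv)
    weights = softmax(scores, axis=-1)                # rows sum to 1
    return weights @ keys_values, weights

# Self-attention is the special case where both modalities coincide:
# cross_attention(x, x, d_k)
```

Passing the same token set as both arguments recovers self-attention, which is how the two mechanisms relate in a multi-scale encoder: self-attention mixes tokens within one scale, while cross-attention exchanges information between scales.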