JOURNAL ARTICLE

Convolutional Block Attention Module–Multimodal Feature-Fusion Action Recognition: Enabling Miner Unsafe Action Recognition

Yu Wang, Xiaoqing Chen, Jiaoqun Li, Zengxiang Lu

Year: 2024 Journal: Sensors Vol: 24 (14) Pages: 4557 Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Unsafe actions by miners are one of the main causes of mine accidents. Computer-vision-based recognition of these actions enables relatively accurate, real-time monitoring of underground miners. A dataset of unsafe actions of underground miners (UAUM), covering ten action categories, was constructed. Underground images were enhanced using spatial- and frequency-domain enhancement algorithms. The YOLOX object detection algorithm was combined with the Lite-HRNet human key-point detection algorithm to obtain skeleton modal data. The CBAM-PoseC3D model, a skeleton-modality action-recognition model incorporating the CBAM attention module, was proposed and combined with the RGB-modality feature-extraction model CBAM-SlowOnly to form the Convolutional Block Attention Module–Multimodal Feature-Fusion Action Recognition (CBAM-MFFAR) model for recognizing unsafe actions of underground miners. The CBAM-MFFAR model achieved a recognition accuracy of 95.8% on the NTU60 RGB+D public dataset under the X-Sub benchmark, an improvement of 2, 2.7, 7.3, and 14.3 percentage points over the CBAM-PoseC3D, PoseC3D, 2S-AGCN, and ST-GCN models, respectively. On the UAUM dataset, the CBAM-MFFAR model achieved a recognition accuracy of 94.6%, an improvement of 2.6, 4, 12, and 17.3 percentage points over the same four models, respectively. In field validation at mining sites, the CBAM-MFFAR model accurately recognized similar unsafe actions and multiple simultaneous unsafe actions among underground miners.
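The abstract reports only the CBAM-MFFAR accuracies and the gains over each baseline, so the baseline accuracies themselves have to be inferred. The sketch below does that arithmetic, assuming the reported gains are absolute percentage points rather than relative improvements; the names `REPORTED`, `GAINS`, and `implied_baselines` are illustrative and do not come from the paper.

```python
# Accuracies (%) reported in the abstract for CBAM-MFFAR.
REPORTED = {"NTU60 RGB+D (X-Sub)": 95.8, "UAUM": 94.6}

# Reported gains over each baseline, read here as absolute percentage
# points (an assumption; the abstract does not state this explicitly).
GAINS = {
    "NTU60 RGB+D (X-Sub)": {
        "CBAM-PoseC3D": 2.0, "PoseC3D": 2.7, "2S-AGCN": 7.3, "ST-GCN": 14.3,
    },
    "UAUM": {
        "CBAM-PoseC3D": 2.6, "PoseC3D": 4.0, "2S-AGCN": 12.0, "ST-GCN": 17.3,
    },
}


def implied_baselines(reported, gains):
    """Return reported accuracy minus gain, per dataset and baseline model."""
    return {
        dataset: {
            model: round(reported[dataset] - gain, 1)
            for model, gain in models.items()
        }
        for dataset, models in gains.items()
    }


BASELINES = implied_baselines(REPORTED, GAINS)

if __name__ == "__main__":
    for dataset, models in BASELINES.items():
        for model, accuracy in models.items():
            print(f"{dataset:22s} {model:13s} {accuracy:.1f}%")
```

Under this reading, the implied baseline accuracies are 93.8%, 93.1%, 88.5%, and 81.5% on NTU60 RGB+D (X-Sub) and 92.0%, 90.6%, 82.6%, and 77.3% on UAUM, for CBAM-PoseC3D, PoseC3D, 2S-AGCN, and ST-GCN, respectively.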

Keywords:
Feature (linguistics), RGB color model, Block (permutation group theory), Artificial intelligence, Benchmark (surveying), Pattern recognition (psychology), Modal, Computer science, Engineering, Mathematics

Metrics

Cited By: 11
FWCI (Field Weighted Citation Impact): 9.69
Refs: 32
Citation Normalized Percentile: 0.97
Is in top 1%
Is in top 10%

Topics

Occupational Health and Safety Research
Health Sciences →  Health Professions →  Radiological and Ultrasound Technology
Hand Gesture Recognition Systems
Physical Sciences →  Computer Science →  Human-Computer Interaction
Gait Recognition and Analysis
Physical Sciences →  Engineering →  Biomedical Engineering

Related Documents

JOURNAL ARTICLE

Attention mechanism based multimodal feature fusion network for human action recognition

Zhao Xu, Chao Tang, Huosheng Hu, Wenjian Wang, Shuo Qiao, Anyang Tong

Journal: Journal of Visual Communication and Image Representation Year: 2025 Vol: 110 Pages: 104459
JOURNAL ARTICLE

Facial expression recognition based on convolutional block attention module and multi-feature fusion

Man Jiang, Shoulin Yin

Journal: International Journal of Computational Vision and Robotics Year: 2022 Vol: 13 (1) Pages: 21