Convolutional Block Attention Module–Multimodal Feature-Fusion Action Recognition: Enabling Miner Unsafe Action Recognition

Yu Wang; Xiaoqing Chen; Jiaoqun Li; Zengxiang Lu

doi:10.3390/s24144557

JOURNAL ARTICLE

Year: 2024; Journal: Sensors; Vol: 24 (14); Article: 4557; Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Unsafe actions by miners are one of the main causes of mine accidents. Computer-vision-based recognition of these actions enables relatively accurate, real-time monitoring of underground miners. A dataset called unsafe actions of underground miners (UAUM) was constructed, covering ten categories of such actions. Underground images were enhanced using spatial- and frequency-domain enhancement algorithms. The YOLOX object detection algorithm and the Lite-HRNet human key-point detection algorithm were combined to obtain skeleton modal data. The CBAM-PoseC3D model, a skeleton modal action-recognition model incorporating the CBAM attention module, was proposed and combined with the RGB modal feature-extraction model CBAM-SlowOnly to form the Convolutional Block Attention Module–Multimodal Feature-Fusion Action Recognition (CBAM-MFFAR) model for recognizing unsafe actions of underground miners. The CBAM-MFFAR model achieved a recognition accuracy of 95.8% on the NTU60 RGB+D public dataset under the X-Sub benchmark, an improvement of 2%, 2.7%, 7.3%, and 14.3% over the CBAM-PoseC3D, PoseC3D, 2S-AGCN, and ST-GCN models, respectively. On the UAUM dataset, the CBAM-MFFAR model achieved a recognition accuracy of 94.6%, an improvement of 2.6%, 4%, 12%, and 17.3% over the CBAM-PoseC3D, PoseC3D, 2S-AGCN, and ST-GCN models, respectively. In field validation at mining sites, the CBAM-MFFAR model accurately recognized similar and multiple unsafe actions among underground miners.
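The reported gains can be turned back into the baseline accuracies they imply. The following is a minimal sketch, assuming each improvement is an absolute percentage-point difference rather than a relative gain; the names `implied_baselines`, `NTU60_GAINS`, and `UAUM_GAINS` are illustrative and do not come from the paper.

```python
# Assumption: each reported improvement is an absolute percentage-point gap
# between CBAM-MFFAR and the named baseline model.

def implied_baselines(model_acc, gains):
    """Return the baseline accuracy (in %) implied by each reported gain."""
    return {name: round(model_acc - gain, 1) for name, gain in gains.items()}

# Gains as listed in the abstract, keyed by baseline model.
NTU60_GAINS = {"CBAM-PoseC3D": 2.0, "PoseC3D": 2.7, "2S-AGCN": 7.3, "ST-GCN": 14.3}
UAUM_GAINS = {"CBAM-PoseC3D": 2.6, "PoseC3D": 4.0, "2S-AGCN": 12.0, "ST-GCN": 17.3}

ntu60 = implied_baselines(95.8, NTU60_GAINS)  # CBAM-MFFAR: 95.8% on NTU60 X-Sub
uaum = implied_baselines(94.6, UAUM_GAINS)    # CBAM-MFFAR: 94.6% on UAUM

for dataset, table in (("NTU60 X-Sub", ntu60), ("UAUM", uaum)):
    for name, acc in table.items():
        print(f"{dataset:12s} {name:13s} {acc:.1f}%")
```

Under this assumption, the implied baseline figures are only as reliable as the rounded gains quoted in the abstract.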

Keywords:

Feature (linguistics); RGB color model; Block (permutation group theory); Artificial intelligence; Benchmark (surveying); Pattern recognition (psychology); Modal; Computer science; Engineering; Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 9.69

Citation Normalized Percentile: 0.97

Topics

Occupational Health and Safety Research

Health Sciences → Health Professions → Radiological and Ultrasound Technology

Hand Gesture Recognition Systems

Physical Sciences → Computer Science → Human-Computer Interaction

Gait Recognition and Analysis

Physical Sciences → Engineering → Biomedical Engineering

Related Documents

Attention mechanism based multimodal feature fusion network for human action recognition

Facial expression recognition based on convolutional block attention module and multi-feature fusion

Microexpression Recognition Method Based on ADP-DSTN Feature Fusion and Convolutional Block Attention Module

Multimodal Feature Fusion Model for RGB-D Action Recognition