JOURNAL ARTICLE

Structure-Aware Progressive Multi-Modal Fusion Network for RGB-T Crack Segmentation

Zhiqiang Yuan, Xin Ding, Xin Xia, Yibin He, Hui Fang, Bo Yang, Wei Fu

Journal: Journal of Imaging · Year: 2025 · Vol: 11 (11) · Pages: 384 · Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Crack segmentation in images plays a pivotal role in the monitoring of structural surfaces, serving as a fundamental technique for assessing structural integrity. However, existing methods that rely solely on RGB images exhibit high sensitivity to light conditions, which significantly restricts their adaptability in complex environmental scenarios. To address this, we propose a structure-aware progressive multi-modal fusion network (SPMFNet) for RGB-thermal (RGB-T) crack segmentation. The main idea is to integrate complementary information from RGB and thermal images and incorporate structural priors (edge information) to achieve accurate segmentation. Here, to better fuse multi-layer features from different modalities, a progressive multi-modal fusion strategy is designed. In the shallow encoder layers, two gate control attention (GCA) modules are introduced to dynamically regulate the fusion process through a gating mechanism, allowing the network to adaptively integrate modality-specific structural details based on the input. In the deeper layers, two attention feature fusion (AFF) modules are employed to enhance semantic consistency by leveraging both local and global attention, thereby facilitating the effective interaction and complementarity of high-level multi-modal features. In addition, edge prior information is introduced to encourage the predicted crack regions to preserve structural integrity, which is constrained by a joint loss of edge-guided loss, multi-scale focal loss, and adaptive fusion loss. Experimental results on publicly available RGB-T crack detection datasets demonstrate that the proposed method outperforms both classical and advanced approaches, verifying the effectiveness of the progressive fusion strategy and the utilization of the structural prior.
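The gate control attention (GCA) idea described above — a learned gate that decides, per location, how much of each modality's shallow features to pass through — can be illustrated with a minimal NumPy sketch. This is a hypothetical simplification for intuition only: the paper's actual GCA modules are learned, attention-based components inside a deep network, whereas here the gate is a single sigmoid-activated linear map over the concatenated RGB and thermal features.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(rgb_feat, thermal_feat, w_gate, b_gate):
    """Toy gate-controlled fusion (hypothetical stand-in for GCA):
    a sigmoid gate computed from the concatenated modalities weighs,
    per element, how much of each modality's feature to keep."""
    stacked = np.concatenate([rgb_feat, thermal_feat], axis=-1)
    gate = sigmoid(stacked @ w_gate + b_gate)  # values in (0, 1)
    # Convex combination: the fused feature always lies between
    # the two modality features, elementwise.
    return gate * rgb_feat + (1.0 - gate) * thermal_feat

rng = np.random.default_rng(0)
C = 8                                    # toy channel count
rgb = rng.standard_normal((4, C))        # 4 spatial positions
thermal = rng.standard_normal((4, C))
w = rng.standard_normal((2 * C, C)) * 0.1
b = np.zeros(C)
fused = gated_fusion(rgb, thermal, w, b)
print(fused.shape)  # (4, 8)
```

Because the gate is input-dependent, a well-lit region can lean on RGB detail while a low-light region leans on thermal contrast, which is the adaptive behavior the abstract attributes to the gating mechanism.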

Metrics: Cited by 1 · FWCI (Field Weighted Citation Impact) 0.00 · References 43

Related Documents


Complementarity-aware cross-modal feature fusion network for RGB-T semantic segmentation

Wei Wu, Tao Chu, Qiong Liu

Journal: Pattern Recognition · Year: 2022 · Vol: 131 · Pages: 108881

RGB-T tracking network based on multi-modal feature fusion

Jing Jin, Jian-Qin Liu, Fengwen Zhai

Journal: Optics and Precision Engineering · Year: 2025 · Vol: 33 (12) · Pages: 1940-1954

Multi-modal neural networks with multi-scale RGB-T fusion for semantic segmentation

Yangxintong Lyu, Ionut Schiopu, Adrian Munteanu

Journal: Electronics Letters · Year: 2020 · Vol: 56 (18) · Pages: 920-923