Structure-Aware Progressive Multi-Modal Fusion Network for RGB-T Crack Segmentation

Zhiqiang Yuan; Xin Ding; Xin Xia; Yibin He; Hui Fang; Bo Yang; Wei Fu

doi:10.3390/jimaging11110384

JOURNAL ARTICLE

Year: 2025; Journal: Journal of Imaging; Vol: 11 (11); Pages: 384-384; Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Crack segmentation in images plays a pivotal role in monitoring structural surfaces and is a fundamental technique for assessing structural integrity. However, existing methods that rely solely on RGB images are highly sensitive to lighting conditions, which significantly restricts their adaptability in complex environments. To address this, we propose a structure-aware progressive multi-modal fusion network (SPMFNet) for RGB-thermal (RGB-T) crack segmentation. The main idea is to integrate complementary information from RGB and thermal images and to incorporate structural priors (edge information) to achieve accurate segmentation. To better fuse multi-layer features from the two modalities, a progressive multi-modal fusion strategy is designed. In the shallow encoder layers, two gate control attention (GCA) modules dynamically regulate the fusion process through a gating mechanism, allowing the network to adaptively integrate modality-specific structural details based on the input. In the deeper layers, two attention feature fusion (AFF) modules enhance semantic consistency by leveraging both local and global attention, facilitating effective interaction and complementarity between high-level multi-modal features. In addition, edge prior information is introduced to encourage the predicted crack regions to preserve structural integrity; this is enforced by a joint loss that combines an edge-guided loss, a multi-scale focal loss, and an adaptive fusion loss. Experimental results on publicly available RGB-T crack detection datasets demonstrate that the proposed method outperforms both classical and advanced approaches, verifying the effectiveness of the progressive fusion strategy and of the structural prior.
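The abstract names a gating mechanism for shallow-layer fusion and a focal loss term but gives no equations. The sketch below is only a generic illustration of those two ideas, not the authors' GCA module or their joint loss. The function names, the scalar weights `w_rgb`, `w_thermal`, and `bias`, and the defaults `alpha=0.25` and `gamma=2.0` are all assumptions made for the example. A sigmoid gate mixes an RGB feature and a thermal feature element by element, and a standard binary focal loss down-weights pixels that are already classified well.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(rgb, thermal, w_rgb=1.0, w_thermal=1.0, bias=0.0):
    """Element-wise gated fusion of two equal-length feature vectors.

    Hypothetical illustration: the gate g is computed from both inputs,
    and the output is the convex combination g * rgb + (1 - g) * thermal.
    The scalar weights stand in for learned parameters.
    """
    fused = []
    for r, t in zip(rgb, thermal):
        g = sigmoid(w_rgb * r + w_thermal * t + bias)
        fused.append(g * r + (1.0 - g) * t)
    return fused


def binary_focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    probs are predicted crack probabilities, targets are 0/1 labels.
    alpha and gamma are common default values, assumed here.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    print(gated_fuse([1.0, 0.2], [-1.0, 0.8]))
    print(binary_focal_loss([0.9, 0.1], [1, 1]))
```

Because the gate lies strictly between 0 and 1, each fused value stays between the two input values. With `gamma=0` the focal loss reduces to an alpha-weighted cross-entropy, which makes the effect of the focusing term easy to isolate.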

Related Documents

Complementarity-aware cross-modal feature fusion network for RGB-T semantic segmentation

Depth-aware RGB-D concrete crack segmentation and quantification using progressive cross-modal attention

Edge-Supervised Attention-Aware Fusion Network for RGB-T Semantic Segmentation

RGB-T tracking network based on multi-modal feature fusion

Multi-modal neural networks with multi-scale RGB-T fusion for semantic segmentation