Wujie Zhou, Shaohua Dong, Meixin Fang, Lu Yu
Color–thermal (RGB-T) urban scene parsing has recently attracted widespread interest. However, most existing approaches to RGB-T urban scene parsing do not deeply explore the complementarity between RGB and thermal features. In this study, we propose a cross-modal attention-cascaded fusion network (CACFNet) that fully exploits cross-modal complementary information. In our design, a cross-modal attention fusion module mines complementary information from the two modalities. Subsequently, a cascaded fusion module decodes the multi-level features in a top-down manner. Noting that each pixel is labeled with the category of the region to which it belongs, we present a region-based module that explores the relationship between pixels and regions. Moreover, in contrast to previous methods that employ only the cross-entropy loss to penalize pixel-wise predictions, we propose an additional loss to learn pixel–pixel relationships. Extensive experiments on two datasets demonstrate that the proposed CACFNet achieves state-of-the-art performance in RGB-T urban scene parsing.
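For illustration, below is a minimal PyTorch sketch of a cross-modal attention fusion step of the kind the abstract describes. The paper's exact module design is not given here, so the layer choices (cross-modal channel-attention gates and a 1x1 merge convolution) and the name CrossModalAttentionFusion are assumptions for exposition, not CACFNet's actual implementation.

```python
# Hedged sketch: fusing RGB and thermal feature maps so that each modality
# recalibrates the other via channel attention. All design details here are
# assumptions, not the published CACFNet module.
import torch
import torch.nn as nn

class CrossModalAttentionFusion(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # global channel statistics
        # Attention gate driven by RGB statistics (applied to thermal features).
        self.rgb_gate = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Attention gate driven by thermal statistics (applied to RGB features).
        self.thermal_gate = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # 1x1 convolution merging the two recalibrated streams.
        self.merge = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
        # Cross gating: attention weights computed from one modality
        # modulate the features of the other, mining complementary cues.
        rgb_out = rgb * self.thermal_gate(self.pool(thermal))
        thermal_out = thermal * self.rgb_gate(self.pool(rgb))
        return self.merge(torch.cat([rgb_out, thermal_out], dim=1))

# Usage: fuse same-shape encoder stage features, e.g. (B, 256, 32, 32).
fuse = CrossModalAttentionFusion(256)
fused = fuse(torch.randn(2, 256, 32, 32), torch.randn(2, 256, 32, 32))
```

In a cascaded decoder, one such fused map per encoder stage would then be combined top-down with the next-shallower stage; that composition is likewise only sketched here, not the paper's specification.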