JOURNAL ARTICLE

RT-CBAM: Refined Transformer Combined with Convolutional Block Attention Module for Underwater Image Restoration

Renchuan Ye, Yuqiang Qian, Xinming Huang

Year: 2024   Journal: Sensors   Vol: 24 (18)   Pages: 5893   Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Recently, transformers have demonstrated notable improvements in advanced visual tasks on natural images. In computer vision, transformer networks are beginning to supplant conventional convolutional neural networks (CNNs) owing to their global receptive field and adaptability. Although transformers excel at capturing global features, they lag behind CNNs in handling fine local features, especially in underwater images that contain complex and delicate structures. To address this challenge, we propose a refined transformer model whose improved feature blocks (dilated transformer blocks) compute attention weights more accurately, enhancing the capture of both local and global features. A self-supervised method (a local and global blind-patch network) is then embedded in the bottleneck layer to aggregate local and global information, which enhances detail recovery and improves texture restoration quality. Additionally, we introduce a multi-scale convolutional block attention module (MSCBAM) to connect encoder and decoder features; this module enhances the feature representation of color channels, aiding the restoration of color information in images. Our model is named the refined transformer combined with convolutional block attention module (RT-CBAM). We plan to deploy this deep learning model on the sensors of underwater robots for real-world underwater image processing and ocean exploration tasks. In a comparison with two traditional methods and six deep learning methods, our approach achieved the best results in detail processing and color restoration.
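
For readers unfamiliar with the convolutional block attention idea that MSCBAM builds on, the following is a minimal NumPy sketch of a generic, single-scale CBAM-style block: channel attention followed by spatial attention. It is not the authors' MSCBAM, which the abstract does not specify in detail. The function names, the reduction ratio `r`, the 7x7 kernel size, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Reweight channels of feat (C, H, W) using a shared two-layer MLP.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); both are
    hypothetical weights standing in for learned parameters.
    """
    avg = feat.mean(axis=(1, 2))  # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))    # (C,) max-pooled descriptor

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    weights = sigmoid(mlp(avg) + mlp(mx))    # (C,) values in (0, 1)
    return feat * weights[:, None, None]

def spatial_attention(feat, kernel):
    """Reweight spatial positions using a (2, k, k) filter, k odd."""
    # Pool across channels, then stack the two maps: shape (2, H, W).
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return feat * sigmoid(out)[None, :, :]

def cbam(feat, w1, w2, kernel):
    """Channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2), kernel)

# Illustrative usage with random data (all sizes are assumptions).
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
out = cbam(feat, w1, w2, kernel)
```

Because both attention maps pass through a sigmoid, the block only rescales the input features and preserves their shape. A multi-scale variant would presumably apply such attention at several resolutions or kernel sizes, but how RT-CBAM does so is not stated in the abstract.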

Keywords:
Computer science; Convolutional neural network; Artificial intelligence; Transformer; Underwater; Encoder; Computer vision; Feature learning; Deep learning; Pattern recognition (psychology); Engineering; Voltage; Electrical engineering

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 1.59
Refs: 33
Citation Normalized Percentile: 0.76

Topics

Image Enhancement Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image and Signal Denoising Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image Fusion Techniques
Physical Sciences →  Engineering →  Media Technology
