JOURNAL ARTICLE

Arbitrary Style Transfer With Fused Convolutional Block Attention Modules

Xin Hai-tao, Li Li

Year: 2023
Journal: IEEE Access
Vol: 11
Pages: 44977-44988
Publisher: Institute of Electrical and Electronics Engineers

Abstract

Advances in deep learning have made image style transfer an increasingly complex research topic. Current methods struggle to retain the object contours of the content image, and their stylized results often show blurred boundaries and mismatched colors. To address these limitations, this work introduces an arbitrary style transfer network built on the attention mechanism. The network comprises an encoder-decoder module, a convolutional block attention module (CBAM), and an adaptive attention normalization network (AdaAttN) module. CBAM is added as an extension of the AdaAttN network so that, by exploiting long-range dependencies in the image, the stylized output is coordinated in both global and local style. A new loss function, the structural similarity (SSIM) loss, is proposed to keep the generated images consistent with the underlying content structure. A new local feature loss is also introduced to further improve the visual quality of the stylized images at the local level. The network was trained for style transfer on a dataset of 82,783 real images and 81,446 artistic images. For testing, a set of 1,000 result images was generated from 100 real photos and 10 artistic portraits. The results are compared with four contemporary style transfer techniques, and ablation experiments demonstrate the contribution of the CBAM module and the SSIM loss. The experiments show that the proposed network adapts effectively to local style and matches semantically close style features to the content features, thereby preserving better spatial consistency.
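
As a rough illustration of the structural similarity loss mentioned in the abstract, the sketch below computes 1 - SSIM between a content image and a stylized image. The abstract does not give the paper's formulation, so several things here are assumptions: the function names `global_ssim` and `ssim_loss` are hypothetical, the statistics are taken over the whole image rather than over sliding windows, and the constants `k1 = 0.01` and `k2 = 0.03` are the values commonly used with SSIM, not values confirmed by the source.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """SSIM from whole-image statistics (a single window).

    Assumption: practical SSIM implementations usually average the
    index over local windows; this sketch uses one global window
    to keep the formula visible.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")

    # Stabilizing constants (commonly used defaults, assumed here).
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(content, stylized, data_range=1.0):
    """Loss that is 0 for identical images and grows as structure diverges."""
    return 1.0 - global_ssim(content, stylized, data_range)


if __name__ == "__main__":
    content = np.linspace(0.0, 1.0, 64).reshape(8, 8)
    print(ssim_loss(content, content))        # identical structure
    print(ssim_loss(content, 1.0 - content))  # inverted structure
```

In a training setting, a term like this would typically be weighted and added to the style and content losses; the weighting used in the paper is not stated in the abstract.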

Keywords:
Computer science; Artificial intelligence; Encoder; Computer vision; Feature (linguistics); Block (permutation group theory); Similarity (geometry); Image (mathematics); Pattern recognition (psychology); Mathematics

Metrics

Cited By: 11
FWCI (Field Weighted Citation Impact): 2.00
Refs: 26
Citation Normalized Percentile: 0.84

Topics

Generative Adversarial Networks and Image Synthesis
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Enhancement Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image Processing Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition