JOURNAL ARTICLE

A CNN-Transformer Approach for Image-Text Multimodal Classification with Cross-Modal Feature Fusion

Keywords:
Computer science, Modal, Artificial intelligence, Pattern recognition (psychology), Transformer, Image fusion, Feature extraction, Feature (linguistics), Fusion, Image (mathematics), Computer vision, Voltage, Engineering, Electrical engineering, Materials science

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 23.87
Refs: 23
Citation Normalized Percentile: 0.98
Is in top 1%
Is in top 10%

Topics

Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence