UTNetPara: A Hybrid CNN-Transformer Architecture with Multi-Scale Fusion for Whole-Slide Image Segmentation

Boqiang Huang; Jiayu Ying; Ruizhi Lyu; Nadine S. Schaadt; Barbara M. Klinkhammer; Peter Boor; Johannes Lotz; Friedrich Feuerhake; Dorit Merhof

doi:10.1109/isbi56570.2024.10635778

JOURNAL ARTICLE

Year: 2024 Pages: 1-5

Abstract

Convolutional Neural Networks (CNNs) have become an efficient and successful solution for medical image segmentation, but they are limited in explicitly modeling long-range dependencies. The Transformer has recently demonstrated its capabilities in image segmentation, but it requires a large amount of training data. In this study, we present a hybrid architecture, UTNetPara, that integrates the Transformer into a U-shaped CNN to improve segmentation accuracy on a medium-sized dataset. Self-attention modules are applied in both the encoder and decoder to capture long-range dependencies at different scales. Efficient self-attention mechanisms with relative position encoding are employed to reduce the computational cost. A fully annotated dataset of whole-slide images scanned from periodic acid-Schiff stained mouse kidney tissue is used for evaluation. The proposed method is trained to segment the main renal structures: glomerular tuft, glomerulus including Bowman's capsule, tubules, arteries, arterial lumina, and veins. Our experimental results indicate that UTNetPara achieves better segmentation performance than other state-of-the-art models.
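
The abstract does not say how the efficient self-attention reduces cost. As a rough illustration only, the sketch below assumes one common design in which keys and values are spatially downsampled to a small fixed grid before attention is computed. The function name `attention_cost` and the parameter `reduced_side` are hypothetical and are not taken from the paper.

```python
def attention_cost(height, width, channels, reduced_side=None):
    """Approximate multiply-accumulate count for one self-attention head.

    Only the two matrix products (Q @ K^T and weights @ V) are counted;
    linear projections and the softmax are ignored.
    """
    n_queries = height * width
    # Assumption (not from the paper): keys and values are downsampled
    # to a reduced_side x reduced_side grid; None means full attention.
    n_keys = n_queries if reduced_side is None else reduced_side * reduced_side
    scores = n_queries * n_keys * channels     # Q @ K^T
    aggregate = n_queries * n_keys * channels  # softmax(scores) @ V
    return scores + aggregate


# A 64 x 64 feature map with 32 channels per head.
full = attention_cost(64, 64, 32)
reduced = attention_cost(64, 64, 32, reduced_side=8)
print(full, reduced, full // reduced)
```

Under this assumption the cost grows linearly with the number of query positions instead of quadratically. For the 64 x 64 map above, the saving factor is the ratio of key counts, 4096 / 64 = 64.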

Keywords:

Computer science; Architecture; Artificial intelligence; Computer vision; Transformer; Image segmentation; Segmentation; Fusion; Pattern recognition (psychology); Engineering; Electrical engineering; Geography

Metrics

FWCI (Field Weighted Citation Impact): 1.06

Citation Normalized Percentile: 0.68

Topics

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Medical Image Segmentation Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

TransUMobileNet: Integrating multi-channel attention fusion with hybrid CNN-Transformer architecture for medical image segmentation

ZoomISEG: Interactive Multi-Scale Fusion for Histopathology Whole Slide Image Segmentation

An effective multi-scale interactive fusion network with hybrid Transformer and CNN for smoke image segmentation

Multi-scale Prototypical Transformer for Whole Slide Image Classification

CNN–Transformer Hybrid Architecture for Underwater Sonar Image Segmentation