Abstract

Twin-to-twin transfusion syndrome (TTTS) is a complex prenatal condition in which monochorionic twins experience an imbalance in blood flow due to abnormal vascular connections in the shared placenta. Fetoscopic laser photocoagulation is the first-line treatment for TTTS, aimed at coagulating these abnormal connections. However, the procedure is complicated by a limited field of view, occlusions, poor-quality endoscopic images, and artifact-induced distortions. To optimize the visualization of placental vessels during surgery, we propose Hybrid-MedNet, a novel hybrid CNN-transformer network that incorporates multi-dimensional deep feature learning techniques. The network introduces a BiPath tokenization module that enhances vessel boundary detection by capturing both channel dependencies and spatial features through parallel attention mechanisms. A context-aware transformer block addresses the weak inductive bias of traditional transformers while preserving the spatial relationships crucial for accurate vessel identification in distorted fetoscopic images. Furthermore, we develop a multi-scale trifusion module that integrates multi-dimensional features to capture rich vascular representations from the encoder and facilitate precise vessel information transfer to the decoder for improved segmentation accuracy. Experimental results show that our approach achieves a Dice score of 95.40% on fetoscopic images, outperforming ten state-of-the-art segmentation methods. Consistent superior performance across four segmentation tasks and ten distinct datasets confirms the robustness and effectiveness of our method for diverse and complex medical imaging applications.
Zihan Li, Dihan Li, Cangbai Xu, Weice Wang, Qingqi Hong, Qingde Li, Jie Tian
Sijing Cai, Yukun Jiang, Yuwei Xiao, Jian Zeng, Guangming Zhou
Ismayl Labbihi, Othmane El Meslouhi, Zouhair Elamrani Abou Elassad, Mohamed Benaddy, Mustapha Kardouchi, Moulay A. Akhloufi
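To make the dual-path idea concrete, the following is a minimal NumPy sketch of a BiPath-style tokenization: one path reweights channels by their global context and a parallel path reweights spatial locations by their cross-channel response, after which the fused map is flattened into tokens. This is an illustrative sketch under assumed mechanics, not the authors' implementation; the function name `bipath_tokenize` and the specific pooling/softmax weighting are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bipath_tokenize(feat):
    """Illustrative BiPath-style tokenization (hypothetical, not the paper's code).

    feat: (C, H, W) feature map. Two parallel attention paths:
      - channel path: weights each channel by its globally pooled response,
      - spatial path: weights each location by its mean cross-channel response.
    The two reweighted maps are fused and flattened into (H*W, C) tokens.
    """
    C, H, W = feat.shape
    # Channel path: global average pooling -> softmax channel weights
    chan_w = softmax(feat.mean(axis=(1, 2)))                        # (C,)
    chan_out = feat * chan_w[:, None, None]
    # Spatial path: per-location mean over channels -> softmax spatial weights
    spat_w = softmax(feat.mean(axis=0).reshape(-1)).reshape(H, W)   # (H, W)
    spat_out = feat * spat_w[None, :, :]
    # Fuse the parallel paths and flatten into a token sequence
    fused = chan_out + spat_out
    return fused.reshape(C, H * W).T                                # (H*W, C)
```

In a full hybrid network, these tokens would then feed the transformer branch, while the CNN branch keeps the unflattened feature map.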