Multi-Modal Point Cloud Completion with Interleaved Attention Enhanced Transformer

Chenghao Fang; Jianqing Liang; Jiye Liang; Hangkun Wang; Kaixuan Yao; Feilong Cao

doi:10.24963/ijcai.2025/108

Year: 2025 Pages: 963-971

Abstract

Multi-modal point cloud completion, which utilizes a complete image and a partial point cloud as input, is a crucial task in 3D computer vision. Previous methods commonly employ a cross-attention mechanism to fuse point clouds and images. However, these approaches often fail to fully leverage image information and overlook the intrinsic geometric details of point clouds that could complement the image modality. To address these challenges, we propose an interleaved attention enhanced Transformer (IAET) with three main components, i.e., token embedding, bidirectional token supplement, and coarse-to-fine decoding. IAET incorporates a novel interleaved attention mechanism to enable bidirectional information supplementation between the point cloud and image modalities. Additionally, to maximize the use of the supplemented image information, we introduce a view-guided upsampling module that leverages image tokens as queries to guide the generation of detailed point cloud structures. Extensive experiments demonstrate the effectiveness of IAET, highlighting its state-of-the-art performance on multi-modal point cloud completion benchmarks in various scenarios. The source code is freely accessible at https://github.com/doldolOuO/IAET.
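The bidirectional supplementation described in the abstract can be illustrated with a minimal sketch, assuming plain single-head scaled dot-product cross-attention with no learned projections. This is not the IAET implementation (see the authors' repository for that); the function names `cross_attention` and `bidirectional_supplement`, the residual update, and the toy token values are all hypothetical, and the paper's interleaved mechanism may order or combine the two directions differently.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Each query token attends over the context tokens.

    Assumption: keys and values are the raw context tokens
    (identity projections), which keeps the sketch dependency-free.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, context)) for j in range(dim)]
        )
    return attended


def bidirectional_supplement(point_tokens, image_tokens):
    """Hypothetical two-way exchange: each modality queries the other.

    Both directions read the original tokens, and the attended result
    is added back as a residual (an assumption, not taken from the paper).
    """
    from_image = cross_attention(point_tokens, image_tokens)
    from_points = cross_attention(image_tokens, point_tokens)
    new_points = [[a + b for a, b in zip(t, u)]
                  for t, u in zip(point_tokens, from_image)]
    new_image = [[a + b for a, b in zip(t, u)]
                 for t, u in zip(image_tokens, from_points)]
    return new_points, new_image


if __name__ == "__main__":
    # Toy tokens: 3 point-cloud tokens and 2 image tokens, feature dim 2.
    points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    image = [[0.5, 0.5], [2.0, 0.0]]
    new_points, new_image = bidirectional_supplement(points, image)
    print(len(new_points), len(new_image))
```

Because the output of `cross_attention` has one row per query, passing image tokens as queries yields one output token per image token. That is the general property the abstract relies on when it describes image tokens guiding the upsampling stage, although the actual view-guided upsampling module is more involved than this sketch.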

Topics

3D Shape Modeling and Analysis

Physical Sciences → Engineering → Computational Mechanics

Optical measurement and interference techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

3D Surveying and Cultural Heritage

Physical Sciences → Earth and Planetary Sciences → Geology

Related Documents

Cross-Modal Transformer for Point Cloud Completion

Enhanced Cross-Modal Point Cloud Completion Framework

PMP-Net++: Point Cloud Completion by Transformer-Enhanced Multi-Step Point Moving Paths

Multi-scale Transformer based point cloud completion network