Recent Transformer-based 3D object detectors learn point cloud features from either point- or voxel-based representations. However, the former requires time-consuming sampling while the latter introduces quantization errors. In this paper, we present a novel Point-Voxel Transformer for single-stage 3D detection (PVT-SSD) that takes advantage of both representations. Specifically, we first use voxel-based sparse convolutions for efficient feature encoding. Then, we propose a Point-Voxel Transformer (PVT) module that cheaply obtains long-range contexts from voxels while attaining accurate positions from points. The key to associating the two different representations is our input-dependent Query Initialization module, which efficiently generates reference points and content queries. PVT then adaptively fuses long-range contextual and local geometric information around the reference points into the content queries. Further, to quickly find the neighboring points of reference points, we design a Virtual Range Image module that generalizes the native range image to multi-sensor and multi-frame settings. Experiments on several autonomous driving benchmarks verify the effectiveness and efficiency of the proposed method. Code will be available.