Shuo Gu, Jiacheng Lu, Jian Yang, Chengzhong Xu, Hui Kong
Dense semantic scene understanding of the surrounding environment in the top view is a crucial task for autonomous vehicles. Recent LiDAR-based semantic perception works mainly focus on point-wise predictions over the LiDAR points rather than dense predictions of the environment, which makes them less suitable for path-planning tasks. Pillar and voxel representations can produce dense predictions, but generating and processing these representations is usually time-consuming. In this article, we propose a top-view semantic completion network that produces accurate, dense grid-wise predictions in real time. Specifically, we propose an online distillation strategy consisting of two parts: a student model using 2D range-view and top-view representations, and a teacher model using range-view, top-view, and voxel representations. To transfer information between the different representations, we propose a cross-view association (CVA) module that converts range-view features and 3D voxel features into top-view features. The proposed method avoids the difficulty of direct dense semantic segmentation in the top view by letting the point-wise sparse semantic segmentation module guide the dense grid-wise semantic completion. It also alleviates the computational complexity by confining the voxel representation and 3D convolutions to the teacher model. Experimental results on the SemanticKITTI (46.4% mIoU) and nuScenes-LidarSeg (47.3% mIoU) datasets demonstrate the effectiveness of the proposed sparse-guidance and online distillation strategies.
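To illustrate the general idea behind converting point-wise range-view features into a top-view (BEV) grid, the following is a minimal sketch, not the paper's actual CVA module: it assumes per-point features have already been extracted by some range-view backbone, and scatters them into a dense top-view grid with mean pooling. The grid extent, resolution, and pooling rule are all assumptions for illustration.

```python
import numpy as np

def range_to_top_view(points_xyz, point_feats, x_range=(-50.0, 50.0),
                      y_range=(-50.0, 50.0), resolution=0.5):
    """Scatter per-point features into a dense top-view grid (mean pooling).

    points_xyz:  (N, 3) LiDAR point coordinates.
    point_feats: (N, C) features per point, e.g. from a range-view backbone.
    Returns a (nx, ny, C) top-view feature map; empty cells stay zero.
    """
    nx = int((x_range[1] - x_range[0]) / resolution)
    ny = int((y_range[1] - y_range[0]) / resolution)
    c = point_feats.shape[1]

    # Map each point to a grid cell and drop points outside the extent.
    ix = ((points_xyz[:, 0] - x_range[0]) / resolution).astype(int)
    iy = ((points_xyz[:, 1] - y_range[0]) / resolution).astype(int)
    valid = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    ix, iy, feats = ix[valid], iy[valid], point_feats[valid]

    # Accumulate features and counts per cell, then average.
    flat = ix * ny + iy
    grid = np.zeros((nx * ny, c), dtype=np.float32)
    counts = np.zeros(nx * ny, dtype=np.float32)
    np.add.at(grid, flat, feats)
    np.add.at(counts, flat, 1.0)
    grid[counts > 0] /= counts[counts > 0, None]
    return grid.reshape(nx, ny, c)

# Toy example: two points fall in the same cell, one in another.
pts = np.array([[0.0, 0.0, 0.0], [0.1, 0.1, 0.0], [10.0, -5.0, 1.0]])
feats = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]])
bev = range_to_top_view(pts, feats)
print(bev.shape)  # (200, 200, 2)
```

In the actual network this projection would be followed by 2D convolutions in the top view, and learned association (rather than fixed pooling) would align the range-view and voxel branches.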
Jianbiao Mei, Yu Yang, Mengmeng Wang, Junyu Zhu, J.B. Ra, Yukai Ma, Laijian Li, Yong Liu
Maximilian Jaritz, Raoul de Charette, Émilie Wirbel, Xavier Perrotton, Fawzi Nashashibi
Meng Wang, Huilong Pi, Ruihui Li, Yunchuan Qin, Zhuo Tang, Kenli Li
Jiacheng Lu, Shuo Gu, Chengzhong Xu, Hui Kong