PointNAT: Large-Scale Point Cloud Semantic Segmentation via Neighbor Aggregation With Transformer

Ziyin Zeng; Huan Qiu; Jian Zhou; Zhen Dong; Jinsheng Xiao; Bijun Li

doi:10.1109/tgrs.2024.3407761

JOURNAL ARTICLE

Year: 2024
Journal: IEEE Transactions on Geoscience and Remote Sensing
Vol: 62
Pages: 1-18
Publisher: Institute of Electrical and Electronics Engineers

Abstract

Given the prominence of 3D sensors in recent years, 3D point clouds merit further investigation for environment perception and scene understanding. Learning accurate local and global contexts in point clouds is pivotal for semantic segmentation, and neighbor aggregation and Transformers have achieved notable success in local and global perception in point cloud analysis, respectively. Nevertheless, studying each independently is far from optimal for comprehensive feature learning. To address this, we take a novel step towards investigating and integrating the structures of neighbor aggregation and Transformers. In this paper, we introduce Point Neighbor Aggregation with Transformer (PointNAT), a conceptually straightforward and effective approach that aims to improve 3D point cloud semantic segmentation. PointNAT consists of a Neighbor Aggregation Block (NAB) for local perception, a Point Transformer Block (PTB) for global modeling, and a Hybrid Block to connect NABs and PTBs. NABs learn complex local features at varying scales through an improved neighbor aggregation operation and a multi-head mechanism. PTBs perform global attention efficiently using a small set of learnable key points. Hybrid Blocks serve as hybridizers of high- and low-frequency signals, merging the strengths of these two blocks by adaptively assigning hybrid weights to local and global contexts. We evaluate PointNAT against state-of-the-art networks on several benchmarks, including S3DIS, Toronto3D, and SensatUrban. PointNAT achieves mIoU scores of 77.8%, 84.7%, and 65.2% on these three datasets, respectively. Furthermore, it outperforms the baseline approach PointNeXt by 3.0%, 1.3%, and 4.2%, respectively, while using only 59.9% of the parameters and 15.2% of the FLOPs. The results demonstrate PointNAT's superior ability to accurately segment large-scale 3D point cloud scenes, emphasizing its potential to advance environment perception and scene understanding. Our code is available at https://github.com/zeng-ziyin/PointNAT.
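To make the three roles described in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a plain k-nearest-neighbor max-pooling step for the local branch, single-head attention from every point to a small set of key tokens for the global branch, and a sigmoid gate for the hybrid weights. All function names, weight shapes, and sizes (`neighbor_aggregate`, `key_point_attention`, `hybrid_fuse`, `N`, `M`, `D`) are hypothetical, and the random weights stand in for parameters that would be learned.

```python
import numpy as np

def knn_indices(xyz, k):
    # Brute-force pairwise squared distances (N, N); acceptable for a toy
    # example only, large scenes would need a spatial index.
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(axis=-1)
    return np.argsort(d2, axis=1)[:, :k]  # (N, k), each point's nearest k

def neighbor_aggregate(xyz, feats, k, w_local):
    # Assumed local branch: shared linear layer + ReLU on
    # [relative position, neighbor feature], then max-pool over neighbors.
    idx = knn_indices(xyz, k)                              # (N, k)
    rel = xyz[idx] - xyz[:, None, :]                       # (N, k, 3)
    grouped = np.concatenate([rel, feats[idx]], axis=-1)   # (N, k, 3 + C)
    hidden = np.maximum(grouped @ w_local, 0.0)            # (N, k, D)
    return hidden.max(axis=1)                              # (N, D)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def key_point_attention(feats, key_tokens, w_q, w_k, w_v):
    # Assumed global branch: every point attends to M key tokens, so the
    # attention map is (N, M) rather than (N, N).
    q = feats @ w_q                                        # (N, D)
    k = key_tokens @ w_k                                   # (M, D)
    v = key_tokens @ w_v                                   # (M, D)
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))         # (N, M)
    return attn @ v                                        # (N, D)

def hybrid_fuse(local, glob, w_gate):
    # Assumed hybrid step: a per-channel sigmoid gate in (0, 1) decides how
    # much of the local versus the global context each point keeps.
    logits = np.concatenate([local, glob], axis=-1) @ w_gate
    gate = 1.0 / (1.0 + np.exp(-logits))
    return gate * local + (1.0 - gate) * glob

# Toy usage with random data and random (untrained) weights.
rng = np.random.default_rng(0)
N, C, D, M, K = 256, 8, 16, 4, 16
xyz = rng.normal(size=(N, 3))
feats = rng.normal(size=(N, C))
w_local = rng.normal(size=(3 + C, D)) * 0.1
key_tokens = rng.normal(size=(M, D))
w_q, w_k, w_v = (rng.normal(size=(D, D)) * 0.1 for _ in range(3))
w_gate = rng.normal(size=(2 * D, D)) * 0.1

local = neighbor_aggregate(xyz, feats, K, w_local)
glob = key_point_attention(local, key_tokens, w_q, w_k, w_v)
fused = hybrid_fuse(local, glob, w_gate)
print(fused.shape)  # (256, 16)
```

In this sketch the global branch costs on the order of N × M attention scores instead of N × N, which is one plausible reading of why attending to a small set of key points is efficient. The actual NAB, PTB, and Hybrid Block designs are described in the paper and the linked repository.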

Keywords:

Point cloud; Computer science; Segmentation; k-nearest neighbors algorithm; Artificial intelligence; Transformer; Cloud computing; Block (permutation group theory); Data mining; Pattern recognition (psychology); Machine learning; Mathematics; Engineering

Metrics

Cited By

FWCI (Field Weighted Citation Impact): 15.14

Refs: 107

Citation Normalized Percentile: 0.99

Is in top 1%

Is in top 10%

Topics

3D Shape Modeling and Analysis

Physical Sciences → Engineering → Computational Mechanics

3D Surveying and Cultural Heritage

Physical Sciences → Earth and Planetary Sciences → Geology

Remote Sensing and LiDAR Applications

Physical Sciences → Environmental Science → Environmental Engineering

Related Documents

Urban-scale point cloud semantic segmentation with transformer

NeiEA-NET: Semantic segmentation of large-scale point cloud scene via neighbor enhancement and aggregation

Radial Transformer for Large-Scale Outdoor LiDAR Point Cloud Semantic Segmentation

MPT-Net: Mask Point Transformer Network for Large Scale Point Cloud Semantic Segmentation

DCNet: Large-Scale Point Cloud Semantic Segmentation With Discriminative and Efficient Feature Aggregation