Neighborhood construction plays a key role in point cloud processing. However, existing models use only a single neighborhood construction method to extract neighborhood features, which limits their scene understanding ability. In this paper, we propose a learnable Dual-Neighborhood Feature Aggregation (DNFA) module, embedded in the encoder, that builds and aggregates comprehensive knowledge of each point's surroundings. In this module, we first construct two kinds of neighborhoods and design a corresponding feature enhancement block for each: a Basic Local Structure Encoding (BLSE) block and an Extended Context Encoding (ECE) block. These blocks mine structural and contextual cues, respectively, to enhance neighborhood features. Second, we propose a Geometry-Aware Compound Aggregation (GACA) block, which introduces a functionally complementary compound pooling strategy to aggregate richer neighborhood features. To fully learn the neighborhood distribution, we incorporate geometric location information into the aggregation process. The proposed module is integrated into an MLP-based large-scale 3D processing architecture, yielding a 3D semantic segmentation network called DNFA-Net. Extensive experiments on public datasets containing indoor and outdoor scenes validate the superiority of DNFA-Net.
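To make the dual-neighborhood idea concrete, the following is a minimal, non-learned sketch in plain Python. It is not the DNFA module itself: the abstract does not specify how the two neighborhoods are built or which pooling operators are combined, so this sketch assumes a k-nearest-neighbor set for the basic neighborhood, a dilated k-nearest-neighbor set for the extended one, and max plus mean pooling as the compound aggregation. All function names (`knn`, `dilated_knn`, `compound_pool`, `aggregate`) are hypothetical, and relative coordinate offsets stand in for the geometric location information.

```python
import math

def sorted_neighbors(points, i):
    """Indices of all points sorted by Euclidean distance to points[i] (self first)."""
    return sorted(range(len(points)), key=lambda j: math.dist(points[i], points[j]))

def knn(points, i, k):
    """Assumed basic neighborhood: the k nearest points, including the point itself."""
    return sorted_neighbors(points, i)[:k]

def dilated_knn(points, i, k, dilation):
    """Assumed extended neighborhood: every `dilation`-th of the k*dilation nearest points."""
    return sorted_neighbors(points, i)[:k * dilation:dilation]

def compound_pool(rows):
    """Assumed compound pooling: concatenate channel-wise max and mean over feature rows."""
    cols = list(zip(*rows))
    return [max(c) for c in cols] + [sum(c) / len(c) for c in cols]

def aggregate(points, feats, i, k=2, dilation=2):
    """Pool both neighborhoods, appending relative offsets to each row as a geometric cue."""
    out = []
    for idx in (knn(points, i, k), dilated_knn(points, i, k, dilation)):
        rows = [feats[j] + [a - b for a, b in zip(points[j], points[i])] for j in idx]
        out += compound_pool(rows)
    return out
```

With C input channels and 3D points, `aggregate` returns a vector of length 4 * (C + 3): a max and a mean descriptor for each of the two neighborhoods. In a learned network, these hand-written steps would be replaced by trainable layers.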
Changhong Liu, Zhihui Liu, Xinyu Wang
Dawei Li, Guoliang Shi, Yuhao Wu, Yanping Yang, Mingbo Zhao