Dilong Li, Shenghong Zheng, Ziyi Chen, Jonathan Li, Lanying Wang, Ji‐Xiang Du
Transformer networks have demonstrated remarkable performance in point cloud processing tasks. However, balancing local feature aggregation with long-range dependency modeling remains challenging. In this work, we present a local enhanced Transformer network (LETNet) for land cover classification with multispectral LiDAR data. Specifically, we first rethink position encoding in 3D Transformers and design a novel feature encoding module that embeds comprehensive geometric and semantic information, serving the role of an explicit position encoding. Then, the proposed local enhanced Transformer module captures accurate global attention weights and refines the features. Finally, to effectively extract and integrate global features across scales, an attention-based pooling module is introduced: it extracts global features from each encoder and decoder layer and constructs a feature pyramid to fuse these multi-scale global features. Both quantitative assessments and comparative analyses demonstrate the competitive capability and advanced performance of LETNet in land cover classification tasks.
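The attention-based pooling described above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: the function names, the feature shapes, and the single learnable scoring vector `w` are all assumptions made for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(features, w):
    """Attention-based global pooling over one layer's point features.

    features: (N, C) per-point features from an encoder/decoder layer
    w: (C,) scoring vector (learnable in practice; fixed here)

    Returns a single (C,) global descriptor: a convex combination of
    the point features, weighted by softmax-normalized attention scores.
    """
    scores = features @ w        # (N,) one attention score per point
    alpha = softmax(scores)      # attention weights, sum to 1
    return alpha @ features      # (C,) pooled global feature

# Toy usage with hypothetical sizes: pool 5 points with 4-dim features.
rng = np.random.default_rng(0)
feats = rng.standard_normal((5, 4))
w = rng.standard_normal(4)
g = attention_pool(feats, w)
print(g.shape)
```

In the full model, one such pooled descriptor per encoder/decoder layer would be collected into a feature pyramid and fused into a multi-scale global representation.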