JOURNAL ARTICLE

Spatio-Temporal Contextual Learning for Single Object Tracking on Point Clouds

Jiantao Gao, Xu Yan, Weibing Zhao, Zhen Lyu, Yinghong Liao, Chaoda Zheng

Year: 2023 Journal: IEEE Transactions on Neural Networks and Learning Systems Vol: 35 (7) Pages: 9470-9482 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Single object tracking (SOT) is one of the most active research directions in computer vision. Compared with 2-D image-based SOT, which has already been well studied, SOT on 3-D point clouds is a relatively new research field. In this article, a novel approach, the contextual-aware tracker (CAT), is investigated to achieve superior 3-D SOT through spatial and temporal contextual learning from the LiDAR sequence. More precisely, in contrast to previous 3-D SOT methods, which use only the point clouds inside the target bounding box as the template, CAT generates templates by adaptively including the surroundings outside the target box to exploit available ambient cues. This template generation strategy is more effective and rational than the previous fixed-area one, especially when the object has only a small number of points. Moreover, LiDAR point clouds in 3-D scenes are often incomplete and vary significantly from one frame to another, which makes the learning process more difficult. To address this, a novel cross-frame aggregation (CFA) module is proposed to enhance the feature representation of the template by aggregating features from a historical reference frame. Together, these schemes enable CAT to achieve robust performance, even in the case of extremely sparse point clouds. The experiments confirm that the proposed CAT outperforms the state-of-the-art methods on both the KITTI and NuScenes benchmarks, achieving 3.9% and 5.6% improvements in terms of precision, respectively.
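
The adaptive template idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the rule CAT uses to include surroundings, so the growth rule below (enlarge the crop until it holds enough points) and the names `points_in_box`, `adaptive_template`, `min_points`, `step`, and `max_margin` are assumptions made purely for illustration. The box is also simplified to be axis-aligned.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def points_in_box(points: Sequence[Point], center: Point,
                  half_size: Point) -> List[Point]:
    """Return the points inside an axis-aligned box.

    Simplification: real tracking boxes also carry a heading angle.
    """
    return [
        p for p in points
        if all(abs(p[i] - center[i]) <= half_size[i] for i in range(3))
    ]


def adaptive_template(points: Sequence[Point], center: Point,
                      half_size: Point, min_points: int = 4,
                      step: float = 0.5,
                      max_margin: float = 2.0) -> Tuple[List[Point], float]:
    """Grow the crop region beyond the target box until it holds at least
    `min_points` points or the margin reaches `max_margin`.

    All parameters are hypothetical, not values from the paper.
    """
    margin = 0.0
    while True:
        grown = tuple(h + margin for h in half_size)
        template = points_in_box(points, center, grown)
        if len(template) >= min_points or margin >= max_margin:
            return template, margin
        # Sparse target: widen the crop to pull in surrounding points.
        margin = min(margin + step, max_margin)
```

Under these assumptions, a densely sampled target yields a template cropped at the box itself (margin 0), while a sparse target pulls in nearby context points, which is the behavior the abstract contrasts with a fixed-area crop.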

Keywords:
Point cloud; Computer science; Minimum bounding box; Artificial intelligence; Computer vision; Bounding overwatch; Frame (networking); Lidar; Feature (linguistics); Object (grammar); Tracking (education); Process (computing); Point (geometry); Field (mathematics); Representation (politics); Image (mathematics); Geography; Mathematics; Remote sensing

Metrics

Cited By: 15
FWCI (Field Weighted Citation Impact): 2.73
Refs: 85
Citation Normalized Percentile: 0.88

Topics

Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Spatio-Temporal Point Process for Multiple Object Tracking

Tao Wang, Kean Chen, Weiyao Lin, John See, Zenghui Zhang, Qian Xu, Xia Jia

Journal: IEEE Transactions on Neural Networks and Learning Systems Year: 2020 Vol: 34 (4) Pages: 1777-1788

JOURNAL ARTICLE

Single object tracking based on Spatio-Temporal information

Lixin Wei, Yun Luo, Rongzhe Zhu, Xin Li

Journal: Signal Processing Image Communication Year: 2025 Vol: 142 Pages: 117463-117463

JOURNAL ARTICLE

Learning Spatio-Temporal Information for Multi-Object Tracking

Jian Wei, Mei Yang, Feng Liu

Journal: IEEE Access Year: 2017 Vol: 5 Pages: 3869-3877