JOURNAL ARTICLE

Pose Relation Transformer Refine Occlusions for Human Pose Estimation

Abstract

Accurately estimating the human pose is an essential task for many applications in robotics. However, existing pose estimation methods perform poorly when occlusion occurs. Recent advances in NLP have been very successful in predicting missing words conditioned on visible words. We draw upon this sentence completion analogy to address occlusions in pose estimation: our model reconstructs occluded joints from the visible joints, exploiting joint correlations by capturing the implicit joint connectivity through the attention mechanism. We propose a POse Relation Transformer (PORT) that captures the global context of the pose using self-attention and the local context by aggregating adjacent joint features. To supervise PORT in learning joint correlations, we guide it to reconstruct randomly masked joints, which we call Masked Joint Modeling (MJM). PORT trained with MJM can be added to existing keypoint detection methods and successfully refines occluded joints. Notably, PORT is a model-agnostic plug-and-play module for pose refinement under occlusion that can be plugged into any keypoint detector at low computational cost. We conducted extensive experiments demonstrating that PORT mitigates the effect of occlusion on hand and body pose. PORT improves the pose estimation accuracy of existing human pose estimation methods by up to 16% with only 5% additional parameters. The code is publicly available at https://github.com/stnoah1/PORT.
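
The masking-and-reconstruction idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). The function names, the zero placeholder used in place of a learned mask token, the 30% mask ratio, the 21-joint 2D pose, and the single untrained attention head are all assumptions made for illustration.

```python
import numpy as np


def mask_joints(joints, mask_ratio, rng):
    """Hide a random subset of joints (hypothetical helper, not from the paper).

    Returns the masked copy and a boolean mask marking the hidden joints.
    """
    num_joints = joints.shape[0]
    num_masked = max(1, int(round(mask_ratio * num_joints)))
    masked_idx = rng.choice(num_joints, size=num_masked, replace=False)
    mask = np.zeros(num_joints, dtype=bool)
    mask[masked_idx] = True
    masked = joints.copy()
    masked[mask] = 0.0  # assumption: zeros stand in for a learned mask token
    return masked, mask


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over joint tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


def masked_reconstruction_loss(pred, target, mask):
    """Mean squared error computed only on the masked joints."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))


# Toy usage: 21 joints in 2D (an assumed layout), random untrained weights.
rng = np.random.default_rng(0)
joints = rng.uniform(0.0, 1.0, size=(21, 2))
masked, mask = mask_joints(joints, mask_ratio=0.3, rng=rng)
w_q, w_k, w_v = (rng.normal(size=(2, 2)) for _ in range(3))
refined, weights = self_attention(masked, w_q, w_k, w_v)
loss = masked_reconstruction_loss(refined, joints, mask)
```

In a trained model the projection weights would be learned by minimizing this loss, so that each masked joint is predicted from the joints it attends to. Here the weights are random and the loss value carries no meaning beyond showing the data flow.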

Keywords:
Computer science, Pose, Artificial intelligence, Transformer, Context (archaeology), Computer vision, Sentence, Task (project management), Machine learning, Voltage, Engineering

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.91
Refs: 72
Citation Normalized Percentile: 0.70


Topics

Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Hand Gesture Recognition Systems
Physical Sciences →  Computer Science →  Human-Computer Interaction
Robot Manipulation and Learning
Physical Sciences →  Engineering →  Control and Systems Engineering

Related Documents

JOURNAL ARTICLE

Gated Region-Refine pose transformer for human pose estimation

Tianfeng Wang, Xiaoxu Zhang

Journal: Neurocomputing, Year: 2023, Vol: 530, Pages: 37-47

JOURNAL ARTICLE

Aggregation Transformer for Human Pose Estimation

Hao Dong, Guodong Wang, Xinyue Zhang

Journal: 2022 26th International Conference on Pattern Recognition (ICPR), Year: 2022, Pages: 3660-3667