JOURNAL ARTICLE

Deep Models for Rigid Objects Real-Time Pose Estimation

Zhao, Jianyu

Year: 2024 Journal:   CLOK (University of Central Lancashire)   Publisher: University of Central Lancashire

Abstract

Accurate and robust six degrees of freedom (6-DoF) pose estimation of rigid objects is one of the fundamental tasks in computer vision, with wide-ranging applications spanning industrial automation, augmented reality, and medical intervention. However, most existing methods rely on knowledge of objects’ 3D models and depth measurements, and often require time-consuming iterative refinement to improve accuracy, which limits their broader applicability. This PhD thesis is primarily motivated by the desire to overcome these limitations. It presents a comprehensive study of the 6-DoF pose estimation problem. Drawing inspiration from the latest deep learning pose estimation methods, a novel 6-DoF pose estimation framework named Auto-Pose is proposed, which combines latent space representations of deep neural networks with supervised learning algorithms. The proposed framework consists of three novel autoencoder-based methods: DALSR-Pose, CVML-Pose, and CVAM-Pose. These methods are specifically designed to address the limitations of existing approaches, enabling the estimation of rigid objects’ 6-DoF poses from a single colour image in real time, without access to explicit 3D models of the objects, depth data, or any post-refinement. The fundamental idea is to implicitly learn intermediate representations of objects in the latent space from colour images alone, and then estimate the 6-DoF poses from the latent representations using regression-based algorithms such as the multilayer perceptron (MLP), k-nearest neighbours (KNN), and random forest (RF). The proposed methods operate in real time and are applicable in complex scenarios, including textured and texture-less objects represented in low-resolution images with heavy occlusion and clutter.
Extensive experiments and evaluation results across multiple publicly available benchmark datasets demonstrate the superiority of the proposed framework over existing methods that similarly use latent space representations, improving pose estimation accuracy by 30%. It also achieves results comparable to other state-of-the-art methods that use 3D models. The thesis makes significant contributions to the field of 6-DoF pose estimation, facilitating the development of model-free estimation algorithms. The novelty of the work rests in the proposed autoencoder-based methods, which achieve competitive performance compared to the state of the art using only data from a monocular camera, without the object’s 3D model, depth measurements, or the iterative refinement often essential to existing methods.
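The pipeline described above — encode a colour image into a latent code, then regress the 6-DoF pose from that code — can be sketched minimally. The snippet below is a hypothetical illustration, not the thesis's actual architecture: a fixed random linear projection stands in for a trained convolutional encoder, and a hand-rolled k-nearest-neighbours regressor (one of the three regression options the abstract names) predicts a 6-D pose vector from the latent space. All dimensions, names, and data here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained autoencoder's encoder: a fixed random
# linear projection mapping a flattened 32x32 RGB image to a 128-D latent code.
W = rng.standard_normal((32 * 32 * 3, 128)) / np.sqrt(32 * 32 * 3)

def encode(images):
    """Map a batch of images of shape (N, 32, 32, 3) to latent codes (N, 128)."""
    return images.reshape(len(images), -1) @ W

def knn_pose(query_latent, train_latents, train_poses, k=5):
    """Estimate a pose as the mean pose of the k nearest training latents."""
    dists = np.linalg.norm(train_latents - query_latent, axis=1)
    nearest = np.argsort(dists)[:k]
    return train_poses[nearest].mean(axis=0)

# Toy training set: 100 images with known 6-DoF poses
# (3 translation + 3 rotation parameters per object).
train_images = rng.random((100, 32, 32, 3))
train_poses = rng.random((100, 6))
train_latents = encode(train_images)

# Inference: encode a query image, then regress its pose from the latent space.
query_image = rng.random((1, 32, 32, 3))
pose = knn_pose(encode(query_image)[0], train_latents, train_poses)
print(pose.shape)  # (6,)
```

In the thesis's framework the encoder is learned, and the KNN step could equally be replaced by an MLP or random forest regressor fitted on the same latent codes; the key design point is that no 3D model or depth input enters the pipeline at any stage.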

Keywords:
Pose; 3D pose estimation; Articulated body pose estimation; Deep learning; Perception; Artificial neural network; Estimation

Metrics

Cited By: 0
FWCI (Field-Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.45

Topics

Robot Manipulation and Learning
Physical Sciences →  Engineering →  Control and Systems Engineering
Robotics and Sensor-Based Localization
Physical Sciences →  Engineering →  Aerospace Engineering
Robotic Mechanisms and Dynamics
Physical Sciences →  Engineering →  Control and Systems Engineering

Related Documents

JOURNAL ARTICLE

Robust Monocular Pose Estimation of Rigid 3D Objects in Real-Time

Tjaden, Henning

Journal:   Universitätsbibliothek Johannes Gutenberg Universität Mainz Year: 2019
JOURNAL ARTICLE

Real-time pose estimation of rigid objects in heavily cluttered environments

Blaž Bratanič, Franjo Pernuš, Boštjan Likar, Dejan Tomaževič

Journal:   Computer Vision and Image Understanding Year: 2015 Vol: 141 Pages: 38-51
BOOK-CHAPTER

Model-Based Pose Estimation for Rigid Objects

Manolis Lourakis, Xenophon Zabulis

Journal:   Lecture Notes in Computer Science Year: 2013 Pages: 83-92