JOURNAL ARTICLE

MOLTR: Multiple Object Localization, Tracking and Reconstruction From Monocular RGB Videos

Kejie Li, Hamid Rezatofighi, Ian Reid

Year: 2021   Journal: IEEE Robotics and Automation Letters   Vol: 6 (2)   Pages: 3341-3348   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Semantic-aware reconstruction is more advantageous than geometry-only reconstruction for future robotic and AR/VR applications because it represents not only where things are, but also what they are. Object-centric mapping is the task of building an object-level reconstruction in which objects are separate, meaningful entities that convey both geometric and semantic information. In this letter, we present MOLTR, a solution to object-centric mapping that uses only monocular image sequences and camera poses. It localizes, tracks and reconstructs multiple rigid objects online as an RGB camera captures a video of the surroundings. Given a new RGB frame, MOLTR first applies a monocular 3D detector to localize objects of interest and extract their shape codes, which represent object shape in a learnt embedding space. After data association, detections are merged into existing objects in the map. The motion state (i.e., kinematics and motion status) of each object is tracked by a multiple-model Bayesian filter, and object shape is progressively refined by fusing multiple shape codes. We evaluate localization, tracking and reconstruction on benchmark datasets for indoor and outdoor scenes, and show superior performance over previous approaches.
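The per-frame map update described in the abstract (associate detections with existing objects, then fuse their shape codes) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Detection`, `MapObject`, `fuse_mean` and `update_map`, the greedy nearest-neighbour association, the 1.0 m gate, and the running-mean fusion are all assumptions chosen for illustration. In particular, the running mean of the object centre is only a stand-in for the multiple-model Bayesian filter, which is not implemented here.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    center: tuple      # hypothetical 3D object centre (x, y, z) in the world frame
    shape_code: list   # hypothetical latent shape vector from the detector


@dataclass
class MapObject:
    center: tuple
    shape_code: list
    n_obs: int = 1     # number of detections fused into this object so far


def fuse_mean(old, new, n_obs):
    """Running mean of two equal-length vectors after n_obs prior observations."""
    return [(o * n_obs + v) / (n_obs + 1) for o, v in zip(old, new)]


def update_map(objects, detections, gate=1.0):
    """Merge detections into the map in place, or spawn new objects.

    Association is greedy nearest-neighbour within `gate` (an assumed
    distance threshold); each existing object is matched at most once
    per frame, and objects spawned in this frame are not match candidates.
    """
    unmatched = set(range(len(objects)))
    for det in detections:
        best = min(
            unmatched,
            key=lambda i: math.dist(objects[i].center, det.center),
            default=None,
        )
        if best is not None and math.dist(objects[best].center, det.center) <= gate:
            obj = objects[best]
            # Stand-in for the motion filter: average the centre estimates.
            obj.center = tuple(fuse_mean(obj.center, det.center, obj.n_obs))
            # Progressive shape refinement: average the shape codes.
            obj.shape_code = fuse_mean(obj.shape_code, det.shape_code, obj.n_obs)
            obj.n_obs += 1
            unmatched.discard(best)
        else:
            objects.append(MapObject(det.center, list(det.shape_code)))
    return objects


# Usage: one existing object, one nearby detection and one far detection.
scene = [MapObject((0.0, 0.0, 0.0), [1.0, 0.0])]
update_map(scene, [
    Detection((0.2, 0.0, 0.0), [0.0, 1.0]),
    Detection((5.0, 0.0, 0.0), [0.5, 0.5]),
])
```

In this toy run the nearby detection is merged into the existing object and the far one starts a new object. A real system would likely use a globally optimal assignment and a proper filter rather than greedy matching and averaging.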

Keywords:
Computer vision, Artificial intelligence, Computer science, Object (grammar), Monocular, Video tracking, RGB color model, Tracking (education)

Metrics

Cited By: 20
FWCI (Field Weighted Citation Impact): 5.84
Refs: 58
Citation Normalized Percentile: 0.96

Topics

Robotics and Sensor-Based Localization
Physical Sciences →  Engineering →  Aerospace Engineering
3D Surveying and Cultural Heritage
Physical Sciences →  Earth and Planetary Sciences →  Geology
Advanced Vision and Imaging
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Monocular Advantage for Multiple Object Tracking

Guiping Zheng, Rong Jiang, Zhou Ke, Shuai Chang, Xinping Yu, Liqin Zhou, Ming Meng

Journal: Journal of Vision   Year: 2025   Vol: 25 (9)   Pages: 1904-1904
JOURNAL ARTICLE

Monocular advantage for multiple object tracking

Guiping Zheng, Rong Jiang, Zhou Ke, Shuai Chang, Xinping Yu, Liqin Zhou, Ming Meng

Journal: Psychonomic Bulletin & Review   Year: 2025   Vol: 33 (1)   Pages: 9-9
JOURNAL ARTICLE

Multiple Pedestrian Tracking From Monocular Videos in an Interacting Multiple Model Framework

Zhengqiang Jiang, Du Q. Huynh

Journal: IEEE Transactions on Image Processing   Year: 2017   Vol: 27 (3)   Pages: 1361-1375
JOURNAL ARTICLE

MIRROR: Multiple Indoor Rooms Reconstruction with Optimized Refinement from Monocular Videos

Ziyue Wang, Yanchao Liu, Xina Cheng, Takeshi Ikenaga

Journal: IEICE Transactions on Information and Systems   Year: 2025   Vol: E109.D (1)   Pages: 61-69