For object-centric representation learning, several slot-based methods have been proposed that separate objects using masks and learn each object separately. While these methods have proven useful on various downstream tasks, they are known to require a significant amount of computation for training. We propose introducing attention mechanisms into a slot-based method to simplify and speed up the computation. We adopt ViMON as the base architecture and propose two methods, named AttnViMON and SFA. We evaluate them in terms of reconstruction error, computation time, and performance on a downstream task. The proposed methods achieve a significant speed-up while showing even better performance.