Caterina Caccavella, Vittorio Fra, Andreas Ziegler, Giulia D'Angelo, Yulia Sandamirskaya
Conventional frame-based cameras suffer from high data redundancy and limited temporal resolution, making them inefficient for real-time tasks in dynamic environments. To address these limitations, this work proposes a real-time, event-based object detection framework grounded in the fundamental assumption that objects are spatially continuous and compact entities. The use of event-based cameras, inspired by the human retina, minimizes latency, energy consumption, and data redundancy while supporting high dynamic range perception. Moreover, event-based input naturally extracts edges in the scene, a crucial feature for object identification. A biologically inspired selective attention mechanism further reduces data processing by dynamically selecting regions of interest (ROIs) in the sparse input signal that may contain objects. The proposed framework uses a modular architecture comprising a saliency-based attention model, a lightweight classifier, and two Dynamic Neural Fields (DNFs), used respectively for selecting the ROI in the scene and for implementing a scene memory module. The first DNF integrates input from the attention model, previously attended features, and input events to select the ROI through dynamic competition among multiple salience peaks. The lightweight classifier, designed with a minimal number of parameters for fast training and deployment, classifies the content within the ROI. The output is stored in the second DNF, which maintains a memory of recognized objects and their locations. A real-time demonstration illustrates the system's ability to recognize objects in an open-world scenario, emphasizing the benefits of combining learning-free, low-latency, low-power proto-object extraction with lightweight classifiers.
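The ROI-selection mechanism described above can be illustrated with a minimal one-dimensional Dynamic Neural Field of the Amari type, where local excitation and global inhibition let the strongest salience peak win the competition. The sketch below is an assumption-laden toy model: the field size, kernel shape, and all parameter values are illustrative choices, not the paper's actual configuration.

```python
import numpy as np

# Toy 1D Dynamic Neural Field (Amari-type) sketch of salience competition.
# All parameters below are illustrative assumptions, not taken from the paper.
n = 100
x = np.arange(n)

def bump(center, sigma, amp):
    """Gaussian activity profile over the field."""
    return amp * np.exp(-0.5 * ((x - center) / sigma) ** 2)

# Two candidate salience peaks; the one at position 25 is slightly stronger.
salience = bump(25, 4.0, amp=6.0) + bump(75, 4.0, amp=5.5)

u = np.full(n, -5.0)                 # field activation, starts at resting level
h, tau, dt = -5.0, 10.0, 1.0         # resting level, time constant, step size
w_exc = bump(n // 2, 4.0, amp=2.0)   # local excitatory interaction kernel
c_inh = 1.0                          # global inhibition strength

for _ in range(300):
    fu = (u > 0).astype(float)                  # thresholded field output
    exc = np.convolve(fu, w_exc, mode="same")   # local excitation
    inh = c_inh * fu.sum()                      # global inhibition
    u += (dt / tau) * (-u + h + salience + exc - inh)

# The stronger peak crosses threshold first; its self-excitation plus the
# global inhibition it generates suppresses the weaker peak entirely.
winner = int(np.argmax(u))   # selected ROI position, near 25
```

In a winner-take-all regime like this, only one supra-threshold peak survives: the suppressed location stays below threshold, which is what allows the field to commit to a single ROI per competition cycle.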