Reinforcement Learning (RL) has shown great performance in solving sequential decision-making and control problems in dynamic environments. Despite these achievements, training Deep Neural Network (DNN) based RL is expensive in terms of time and power because of the large number of episodes required to train agents on high-dimensional image representations. At deployment, the massive energy footprint of deep neural networks can also be a major drawback. Embedded devices, the main deployment platform, are intrinsically resource-constrained, and deploying DNNs on them is challenging. Consequently, reducing the number of actions the RL agent must take to learn the desired policy, along with developing efficient hardware architectures for RL, is crucial. In this paper, we propose a novel hardware architecture for RL agents based on learning hierarchical policies. We show that hierarchical learning with several levels of control improves the training efficiency of RL agents: the agent converges faster than a non-hierarchical model and therefore uses less power. This is especially true as the environment becomes more complex, with multiple objective sub-goals. Our method is important for efficient policy learning in RL agents, especially when the target platform is a resource-constrained embedded device. By performing a systematic neural network architecture search and hardware design space exploration, we implemented an energy-efficient, scalable hardware accelerator for hierarchical RL. Hardware figures of merit such as the latency, throughput, and energy consumption of the accelerator are evaluated for various processing element configurations and model parameters. The most energy-efficient configuration achieves a throughput of 139 fps with an energy consumption of 5.8 mJ per classification on a Xilinx Artix-7 FPGA. Compared to similar works, our design shows up to 3x better energy efficiency.
Aidin Shiri, Uttej Kallakuri, Hasib-Al Rashid, Bharat Prakash, Nicholas R. Waytowich, Tim Oates, Tinoosh Mohsenin