An Energy-Efficient Hardware Accelerator for Hierarchical Deep Reinforcement Learning

Aidin Shiri; Bharat Prakash; Arnab Neelim Mazumder; Nicholas R. Waytowich; Tim Oates; Tinoosh Mohsenin

doi:10.1109/aicas51828.2021.9458548

JOURNAL ARTICLE

Year: 2021 Pages: 1-4

Abstract

Reinforcement Learning (RL) has shown great performance in solving sequential decision-making and control problems in dynamic environments. Despite its achievements, training Deep Neural Network (DNN) based RL is expensive in terms of time and power because of the large number of episodes required to train agents with high-dimensional image representations. At deployment, too, the massive energy footprint of deep neural networks can be a major drawback. Embedded devices, the main deployment platform, are intrinsically resource-constrained, and deploying DNNs on them is challenging. Consequently, reducing the number of actions the RL agent takes to learn the desired policy, along with developing efficient hardware architectures for RL, is crucial. In this paper, we propose a novel hardware architecture for RL agents based on learning hierarchical policies. We show that hierarchical learning with several levels of control improves the training efficiency of RL agents: the agent converges faster than a non-hierarchical model and therefore uses less power. This is especially true as the environment becomes more complex, with multiple objective sub-goals. Our method is important for efficient learning of policies for RL agents, especially when the target platform is a resource-constrained embedded device. By performing a systematic neural network architecture search and hardware design space exploration, we implemented an energy-efficient, scalable hardware accelerator for hierarchical RL. Hardware figures of merit such as the latency, throughput, and energy consumption of the accelerator are evaluated across various processing-element configurations and model parameters. The most energy-efficient configuration achieves 139 fps throughput with 5.8 mJ energy consumption per classification on a Xilinx Artix-7 FPGA. Compared to similar works, our design shows up to 3x better energy efficiency.
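
The throughput and energy figures quoted in the abstract can be related with simple arithmetic. The Python sketch below is illustrative only and is not taken from the paper: it assumes the accelerator processes classifications back to back, so that average power equals energy per classification times classifications per second. The function names are hypothetical, and the paper may report a different measured power.

```python
# Figures quoted in the abstract for the most energy-efficient configuration.
THROUGHPUT_FPS = 139.0               # classifications (frames) per second
ENERGY_PER_CLASSIFICATION_MJ = 5.8   # millijoules per classification


def implied_average_power_w(throughput_fps: float, energy_mj: float) -> float:
    """Average power in watts, ASSUMING back-to-back operation, so that
    power = energy per classification * classifications per second."""
    return energy_mj * 1e-3 * throughput_fps


def frame_period_ms(throughput_fps: float) -> float:
    """Time between completed classifications in milliseconds (1 / throughput).
    This equals per-frame latency only if the design is not pipelined,
    which is an assumption and is not stated in the abstract."""
    return 1e3 / throughput_fps


if __name__ == "__main__":
    power = implied_average_power_w(THROUGHPUT_FPS, ENERGY_PER_CLASSIFICATION_MJ)
    period = frame_period_ms(THROUGHPUT_FPS)
    print(f"implied average power: {power:.3f} W")
    print(f"frame period: {period:.2f} ms")
```

Under these assumptions the quoted numbers imply an average power of roughly 0.8 W and a frame period of about 7.2 ms. Both are derived estimates, not values reported in the abstract.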

Keywords:

Reinforcement learning; Computer science; Efficient energy use; Scalability; Artificial neural network; Energy consumption; Deep learning; Software deployment; Hardware acceleration; Memory footprint; Distributed computing; Artificial intelligence; Embedded system; Computer architecture; Field-programmable gate array

Metrics

FWCI (Field Weighted Citation Impact): 0.28

Citation Normalized Percentile: 0.62

Topics

Reinforcement Learning in Robotics

Physical Sciences → Computer Science → Artificial Intelligence

Advanced Memory and Neural Computing

Physical Sciences → Engineering → Electrical and Electronic Engineering

Adversarial Robustness in Machine Learning

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

E2HRL: An Energy-efficient Hardware Accelerator for Hierarchical Deep Reinforcement Learning

Energy-Efficient Deep Reinforcement Learning Accelerator Designs for Mobile Autonomous Systems

Optimized Energy-Efficient IoT Healthcare Systems Using Hierarchical Deep Reinforcement Learning

Deep Learning Hardware Accelerator Unit

Hierarchical Multi-Agent Deep Reinforcement Learning for Energy-Efficient Hybrid Computation Offloading