UNPU: An Energy-Efficient Deep Neural Network Accelerator With Fully Variable Weight Bit Precision

Jinmook Lee; Changhyeon Kim; Sanghoon Kang; Dongjoo Shin; Sangyeob Kim; Hoi‐Jun Yoo

doi:10.1109/jssc.2018.2865489


JOURNAL ARTICLE


Year: 2018; Journal: IEEE Journal of Solid-State Circuits; Vol: 54 (1); Pages: 173-185; Publisher: Institute of Electrical and Electronics Engineers



Abstract

An energy-efficient deep neural network (DNN) accelerator, the unified neural processing unit (UNPU), is proposed for mobile deep learning applications. The UNPU supports both convolutional layers (CLs) and recurrent or fully connected layers (FCLs), so it can handle versatile workload combinations and accelerate various mobile deep learning applications. In addition, the UNPU is the first DNN accelerator ASIC that supports fully variable weight bit precision from 1 to 16 bits, which lets it operate at the accuracy-energy optimal point. Moreover, the lookup table (LUT)-based bit-serial processing element (LBPE) in the UNPU reduces energy consumption relative to a conventional fixed-point multiply-and-accumulate (MAC) array by 23.1%, 27.2%, 41%, and 53.6% for 16-, 8-, 4-, and 1-bit weight precision, respectively. Besides the energy efficiency improvement, the unified DNN core architecture of the UNPU improves the peak performance for CLs by 1.15$\times$ compared to the previous work. This allows the UNPU to run a given DNN at a lower voltage and frequency, increasing energy efficiency. The UNPU is implemented in 65-nm CMOS technology and occupies a $4 \times 4$ mm$^2$ die area. The UNPU operates from a 0.63- to 1.1-V supply voltage with a maximum frequency of 200 MHz. It has a peak performance of 345.6 GOPS for 16-bit weight precision and 7372 GOPS for 1-bit weight precision. Its wide operating range lets the UNPU achieve a power efficiency of 3.08 TOPS/W for 16-bit weight precision and 50.6 TOPS/W for 1-bit weight precision. The functionality of the UNPU is successfully demonstrated on a verification system using an ImageNet deep CNN (VGG-16).
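The abstract names LUT-based bit-serial processing but does not describe the LBPE's internals. The following is a minimal software sketch of the general idea only, not the paper's circuit: a small table holds the sum of every subset of a group of activations, and each step consumes one bit plane of the weights as a table index. The function names `build_lut` and `bit_serial_dot`, the group size, and the restriction to unsigned weights are all assumptions made for illustration.

```python
def build_lut(activations):
    """Precompute the sum of every subset of a small activation group.

    Bit i of the table index being set means activations[i] is included.
    (Hypothetical helper; the table layout is an assumption.)
    """
    n = len(activations)
    lut = [0] * (1 << n)
    for idx in range(1, 1 << n):
        low = idx & -idx                 # lowest set bit of idx
        i = low.bit_length() - 1         # position of that bit
        lut[idx] = lut[idx ^ low] + activations[i]
    return lut


def bit_serial_dot(activations, weights, n_bits):
    """Dot product using one weight bit plane per step.

    Assumes unsigned integer weights in [0, 2**n_bits); signed weights
    would need extra handling that is not sketched here.
    """
    lut = build_lut(activations)
    acc = 0
    for b in range(n_bits):              # one iteration per weight bit
        idx = 0
        for i, w in enumerate(weights):
            idx |= ((w >> b) & 1) << i   # gather bit b of every weight
        acc += lut[idx] << b             # table lookup, then shift-add
    return acc


acts = [3, -1, 4, 2]
print(bit_serial_dot(acts, [5, 7, 0, 9], n_bits=4))
print(bit_serial_dot(acts, [1, 0, 1, 1], n_bits=1))
```

In this sketch the outer loop runs once per weight bit, so lowering `n_bits` shortens the computation without changing the table; that is the property that makes a bit-serial scheme a natural fit for variable weight precision.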

Keywords:

Computer science; Lookup table; Convolutional neural network; CMOS; Efficient energy use; Artificial neural network; Energy (signal processing); Energy consumption; Computer hardware; Application-specific integrated circuit; Deep learning; Artificial intelligence; Computer engineering; Electronic engineering; Electrical engineering; Mathematics; Engineering

Metrics

Cited By: 318

FWCI (Field Weighted Citation Impact): 15.34

Citation Normalized Percentile: 0.99

Is in top 1%

Is in top 10%


Topics

Advanced Memory and Neural Computing

Physical Sciences → Engineering → Electrical and Electronic Engineering

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Ferroelectric and Negative Capacitance Devices

Physical Sciences → Engineering → Electrical and Electronic Engineering


Related Documents

UNPU: A 50.6TOPS/W unified deep neural network accelerator with 1b-to-16b fully-variable weight bit-precision

A Flexible Precision Scaling Deep Neural Network Accelerator with Efficient Weight Combination

A Precision-Scalable Energy-Efficient Convolutional Neural Network Accelerator

An Energy-Efficient Deep Neural Network Accelerator Design

DeepCAM: A Fully CAM-based Inference Accelerator with Variable Hash Lengths for Energy-efficient Deep Neural Networks