JOURNAL ARTICLE

Quantized deep neural networks for energy efficient hardware-based inference

Ruizhou Ding, Zeye Liu, R.D. Blanton, Diana Marculescu

Year: 2018   Venue: 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC)   Pages: 1-8

Abstract

Deep Neural Networks (DNNs) have been adopted in many systems because of their high classification accuracy, with custom hardware implementations being strong candidates for high-speed, accurate inference. While progress has been made in achieving large-scale, highly accurate DNNs, the massive memory accesses and computations they require demand significant energy and area. Such demands pose a challenge to any DNN implementation, yet they are more naturally addressed in a custom hardware platform. To alleviate the increased demand for storage and energy, quantized DNNs constrain their weights (and activations) from floating-point numbers to only a few discrete levels, reducing storage and thereby the number of memory accesses. In this paper, we provide an overview of the different types of quantized DNNs, as well as the approaches for training them. Among the various quantized DNNs, our LightNN (Light Neural Network) approach can reduce both memory accesses and computation energy by filling the gap between classic full-precision DNNs and binarized DNNs. We provide a detailed comparison between LightNNs, conventional DNNs, and Binarized Neural Networks (BNNs) on the MNIST and CIFAR-10 datasets. In contrast to other quantized DNNs that trade off significant amounts of accuracy for lower memory requirements, LightNNs can significantly reduce storage, energy, and area while still maintaining a test error similar to that of a large DNN configuration. Thus, LightNNs give hardware designers more options for trading off accuracy against energy.
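The abstract's core mechanism is constraining floating-point weights to a few discrete levels, with binarization (two levels) as the extreme case. The following is a minimal sketch of nearest-level weight quantization illustrating that idea; it is not the paper's LightNN training procedure, and the level sets chosen here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def quantize_weights(w, levels):
    """Map each float weight to the nearest of a small set of discrete levels."""
    levels = np.asarray(levels, dtype=float)
    # Distance from every weight to every level, then pick the closest level.
    idx = np.argmin(np.abs(w[..., None] - levels), axis=-1)
    return levels[idx]

rng = np.random.default_rng(0)
w = rng.normal(scale=0.5, size=(4, 4))  # toy full-precision weight matrix

# BNN-style binarization: only two levels survive.
binarized = quantize_weights(w, [-1.0, 1.0])

# A coarser-than-binary but still tiny level set (illustrative only):
# intermediate points between binarized and full-precision networks.
quantized = quantize_weights(w, [-1.0, -0.5, 0.0, 0.5, 1.0])
```

With fewer distinct levels, each weight needs fewer bits to store (1 bit for the binary case, 3 bits for five levels), which is the storage and memory-access reduction the abstract refers to.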

Keywords:
MNIST database, Computer science, Inference, Deep neural networks, Computation, Artificial neural network, Computer engineering, Efficient energy use, Computer hardware, Energy (signal processing), Parallel computing, Artificial intelligence, Algorithm

Metrics

Cited By: 40
FWCI (Field-Weighted Citation Impact): 3.19
Refs: 41
Citation Normalized Percentile: 0.92
Is in top 1%
Is in top 10%

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Adversarial Robustness in Machine Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
Machine Learning and Data Classification
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Agile and Efficient Inference of Quantized Neural Networks

Rutishauser, Georg

Journal: Repository for Publications and Research Data (ETH Zurich)   Year: 2024
BOOK-CHAPTER

A Hardware Accelerator Based on Quantized Weights for Deep Neural Networks

R.V. Sreehari, Deepu Vijayasenan, M. R. Arulalan

Lecture Notes in Electrical Engineering   Year: 2019   Pages: 1079-1091