The Residue Number System (RNS) has recently attracted interest for the hardware implementation of inference in machine-learning systems, as it offers promising trade-offs in the area-time-power design space. In this paper we introduce a technique that uses regularization during training to increase the percentage of residues that are zero when the parameters of an artificial neural network (ANN) are expressed in an RNS. The proposed technique can also be applied as a post-processing stage, allowing pre-trained models to be optimized for RNS implementation. By increasing the number of zero residues, i.e., the residue-level sparsity, the proposed technique enables new hardware architectures for RNS-based inference, offering new trade-offs and improving performance over prior art with practically no loss of accuracy. In certain cases, the introduced method increases residue sparsity by a factor of 4× to 6×.
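The abstract does not give the regularizer's exact form, but the idea of penalizing nonzero residues can be illustrated with a minimal sketch. The snippet below assumes a PyTorch setting, an illustrative moduli set {3, 5, 7}, a strength `LAMBDA`, and a periodic sin² surrogate that vanishes exactly when a weight is an integer multiple of a modulus (zero residue); all of these names and choices are assumptions for illustration, not the paper's actual method.

```python
# Hypothetical sketch of a residue-sparsity regularizer for RNS-oriented
# training. The moduli set, the sin^2 surrogate, and all names below are
# illustrative assumptions, not the formulation used in the paper.
import math
import torch
import torch.nn as nn

MODULI = (3, 5, 7)   # assumed RNS moduli set
LAMBDA = 1e-4        # assumed regularization strength

def residue_sparsity_penalty(model: nn.Module, moduli=MODULI) -> torch.Tensor:
    """Differentiable penalty: sin^2(pi * w / m) is zero exactly when a
    weight w is an integer multiple of modulus m, i.e., w mod m == 0."""
    penalty = torch.zeros((), dtype=torch.float32)
    for p in model.parameters():
        for m in moduli:
            penalty = penalty + torch.sin(math.pi * p / m).pow(2).sum()
    return penalty

def residue_zero_fraction(model: nn.Module, moduli=MODULI) -> float:
    """Fraction of zero residues after rounding weights to integers."""
    zeros, total = 0, 0
    for p in model.parameters():
        w = p.detach().round().long()
        for m in moduli:
            r = torch.remainder(w, m)
            zeros += (r == 0).sum().item()
            total += r.numel()
    return zeros / total

# Usage sketch inside a training step:
#   loss = task_loss + LAMBDA * residue_sparsity_penalty(model)
# The same penalty could be minimized on a frozen task loss to mimic the
# post-processing use of the technique on a pre-trained model.
```

A periodic surrogate is used here only because it is differentiable and attains zero precisely at zero-residue weights; the paper's regularizer may differ.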