JOURNAL ARTICLE

Learning Sparse Neural Networks Through Mixture-Distributed Regularization

Abstract

L0-norm regularization is one of the most effective approaches to learning a sparse neural network. Because the L0 norm is discrete, differentiable approximate regularizers based on the concrete distribution [31] or its variants have been proposed as alternatives; however, the concrete relaxation suffers from high-variance gradient estimates and is tied to the concrete distribution itself. To address these issues, in this paper we propose a more general framework for relaxing binary gates through mixture distributions: any pair of mixture components converging to δ(0) and δ(1) can be used to construct smoothed binary gates. We further introduce a reparameterization method for the smoothed binary gates drawn from mixture distributions, enabling efficient gradient-based optimization of the proposed algorithm. Extensive experiments show that the proposed approach outperforms other state-of-the-art sparsity-inducing methods in terms of pruned architectures, structured sparsity, and the number of floating-point operations (FLOPs) saved.
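As a concrete illustration of the mixture construction described above, the following is a minimal Python/NumPy sketch, assuming two Gaussian components with means 0 and 1 whose shared scale sigma plays the role of the smoothing parameter: as sigma shrinks, the components converge to δ(0) and δ(1) and the gate becomes exactly binary. The function names and the Gaussian choice are illustrative assumptions, not the paper's construction, and the sketch covers only the sampling side; the discrete component choice below would not pass pathwise gradients, which is exactly what the paper's reparameterization method addresses.

import numpy as np

def sample_mixture_gate(logit_pi, sigma=0.05, rng=None):
    # Illustrative sketch (not the paper's exact construction):
    # draw a smoothed binary gate z from a two-component mixture.
    # With probability pi the sample comes from a component centered
    # at 1, otherwise from one centered at 0; as sigma -> 0 the
    # components converge to delta(1) and delta(0).
    rng = np.random.default_rng() if rng is None else rng
    pi = 1.0 / (1.0 + np.exp(-logit_pi))       # mixing probability (gate "on")
    mean = 1.0 if rng.random() < pi else 0.0   # discrete component choice
    z = mean + sigma * rng.standard_normal()   # Gaussian smoothing around 0 or 1
    return float(np.clip(z, 0.0, 1.0))         # keep the gate inside [0, 1]

def expected_l0_surrogate(logit_pi):
    # In the sigma -> 0 limit, P(z != 0) equals the mixing probability
    # pi, so summing pi over all gates gives a differentiable surrogate
    # of the expected L0 norm of the gated parameters.
    return 1.0 / (1.0 + np.exp(-logit_pi))

In a pruning setup, each weight (or channel) would be multiplied by its gate z, and the sum of expected_l0_surrogate over all gates, scaled by a regularization coefficient, would be added to the training loss.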

Keywords:
Differentiable function, Binary number, Computer science, Regularization (mathematics), Gradient descent, Algorithm, FLOPs, Artificial neural network, Deep neural networks, Deep learning, Artificial intelligence, Mathematics, Parallel computing

Metrics

Cited by: 6
FWCI (Field-Weighted Citation Impact): 0.56
References: 86
Citation Normalized Percentile: 0.60

Topics

Sparse and Compressive Sensing Techniques (Physical Sciences → Engineering → Computational Mechanics)
Image and Signal Denoising Methods (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Image Enhancement Techniques (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)

Related Documents

JOURNAL ARTICLE

Sparse Learning for Neural Networks with A Generalized Sparse Regularization

Anda Tang, Tongsheng Yao, Lingfeng Niu, Yong Shi

Journal: Procedia Computer Science, Year: 2022, Vol: 214, Pages: 747-754
JOURNAL ARTICLE

Learning Sparse Neural Networks Using Non-Convex Regularization

Mohammad Khalid Pandit, Roohie Naaz, Mohammad Ahsan Chishti

Journal: IEEE Transactions on Emerging Topics in Computational Intelligence, Year: 2021, Vol: 6 (2), Pages: 287-299
JOURNAL ARTICLE

Learning Sparse Low-Precision Neural Networks With Learnable Regularization

Yoojin Choi, Mostafa El-Khamy, Jungwon Lee

Journal: IEEE Access, Year: 2020, Vol: 8, Pages: 96963-96974
JOURNAL ARTICLE

Nonconvex regularization for sparse neural networks

Konstantin Pieper, Armenak Petrosyan

Journal: Applied and Computational Harmonic Analysis, Year: 2022, Vol: 61, Pages: 25-56
JOURNAL ARTICLE

Group sparse regularization for deep neural networks

Simone Scardapane, Danilo Comminiello, Amir Hussain, Aurelio Uncini

Journal: Neurocomputing, Year: 2017, Vol: 241, Pages: 81-89