JOURNAL ARTICLE

Register Tiling for Unstructured Sparsity in Neural Network Inference

L.R. Wilkinson, Kazem Cheshmi, Maryam Mehri Dehnavi

Year: 2023
Journal: Proceedings of the ACM on Programming Languages
Vol: 7 (PLDI)
Pages: 1995-2020
Publisher: Association for Computing Machinery

Abstract

Unstructured sparse neural networks are an important class of machine learning (ML) models, as they compact model size and reduce floating point operations. The execution time of these models is frequently dominated by the sparse matrix multiplication (SpMM) kernel, C = A × B, where A is a sparse matrix, and B and C are dense matrices. The unstructured sparsity pattern of matrices in pruned machine learning models along with their sparsity ratio has rendered useless the large class of libraries and systems that optimize sparse matrix multiplications. Reusing registers is particularly difficult because accesses to memory locations should be known statically. This paper proposes Sparse Register Tiling, a new technique composed of an unroll-and-sparse-jam transformation followed by data compression that is specifically tailored to sparsity patterns in ML matrices. Unroll-and-sparse-jam uses sparsity information to jam the code while improving register reuse. Sparse register tiling is evaluated across 2396 weight matrices from transformer and convolutional models with a sparsity range of 60-95% and provides an average speedup of 1.72× and 2.65× over MKL SpMM and dense matrix multiplication, respectively, on a multicore CPU processor. It also provides an end-to-end speedup of 2.12× for MobileNetV1 with 70% sparsity on an ARM processor commonly used in edge devices.
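
For readers unfamiliar with the SpMM kernel named in the abstract, the following is a minimal sketch in plain Python, assuming A is stored in the common CSR layout (row pointers, column indices, values) and B is a dense list of rows. It is not the paper's implementation, and the function names spmm_csr and spmm_row_pair are hypothetical. The second function only illustrates the general idea of reuse: when two rows of A have a nonzero in the same column k, row k of B can be fetched once and used for both output rows. The paper's unroll-and-sparse-jam transformation and its data compression are not reproduced here.

```python
def spmm_csr(indptr, indices, data, B):
    """Baseline C = A x B with A in CSR form and B a dense list of rows."""
    n_rows = len(indptr) - 1
    n_cols = len(B[0]) if B else 0
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Visit only the stored nonzeros of row i of A.
        for p in range(indptr[i], indptr[i + 1]):
            k, a = indices[p], data[p]
            b_row = B[k]
            for j in range(n_cols):
                C[i][j] += a * b_row[j]
    return C


def spmm_row_pair(indptr, indices, data, B):
    """Same product, two rows of A at a time (illustrative assumption).

    Returns (C, b_loads), where b_loads counts how many rows of B were
    fetched. The baseline above fetches one row of B per nonzero of A.
    """
    n_rows = len(indptr) - 1
    n_cols = len(B[0]) if B else 0
    C = [[0.0] * n_cols for _ in range(n_rows)]
    b_loads = 0
    for i in range(0, n_rows, 2):
        # The last group holds a single row when n_rows is odd.
        rows = [i] + ([i + 1] if i + 1 < n_rows else [])
        # Map column index -> value for each row in the group.
        nz = [
            {indices[p]: data[p] for p in range(indptr[r], indptr[r + 1])}
            for r in rows
        ]
        # Walk the union of columns so each needed row of B is fetched once.
        for k in sorted(set().union(*nz)):
            b_row = B[k]
            b_loads += 1
            for r, row_nz in zip(rows, nz):
                a = row_nz.get(k)
                if a is not None:
                    for j in range(n_cols):
                        C[r][j] += a * b_row[j]
    return C, b_loads
```

As a usage example, take A = [[1, 0, 2], [0, 0, 3], [4, 5, 0]], which in CSR form is indptr = [0, 2, 3, 5], indices = [0, 2, 2, 0, 1], data = [1, 2, 3, 4, 5]. Rows 0 and 1 share column 2, so the paired version fetches four rows of B where the baseline fetches five, one per nonzero. Plain Python has no control over registers, so the count of fetched rows is only a stand-in for the register reuse the abstract refers to.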

Keywords:
Speedup, Computer science, Sparse matrix, Parallel computing, Matrix multiplication, Kernel (algebra), Multi-core processor, Algorithm, Mathematics

Metrics

Cited By: 14
FWCI (Field Weighted Citation Impact): 6.75
Refs: 46
Citation Normalized Percentile: 0.96

Topics

Parallel Computing and Optimization Techniques
Physical Sciences →  Computer Science →  Hardware and Architecture
Tensor decomposition and applications
Physical Sciences →  Mathematics →  Computational Mathematics
Neural Networks and Applications
Physical Sciences →  Computer Science →  Artificial Intelligence