JOURNAL ARTICLE

DCSolver: Accelerating Sparse Iterative Solvers via Divide-and-Conquer on GPUs

Haozhong Qiu, Chuanfu Xu, Jianbin Fang, Jian Zhang, Liang Deng, Z. G. Dai, Yue Ding, Yue Wang, Zhiping Han, Yonggang Che, Jie Liu

Year: 2025   Journal: ACM Transactions on Architecture and Code Optimization   Vol: 22 (3)   Pages: 1-25   Publisher: Association for Computing Machinery

Abstract

Sparse iterative solvers are commonly used in various fields. However, certain essential kernels of these solvers, such as sparse triangular solves (SpTRSV), are difficult to parallelize efficiently because of data dependencies. Previous methods, like level-scheduling or multi-coloring, typically create a Task Dependency Graph (TDG) to represent data dependencies and identify independent sets from the TDG for parallel execution. However, these approaches often result in limited parallelism with substantial synchronization overheads, or they degrade the solver convergence rate. This article introduces DCSolver, a Divide-and-Conquer (DC) framework designed to efficiently parallelize sparse solvers with data dependencies on GPUs. To achieve this, we break down the solver TDG into independent subgraphs, allowing us to exploit both coarse-grained and fine-grained parallelism. To efficiently allocate GPU threads for subgraphs with varying degrees of parallelism, we have developed an adaptive in-warp scheduling strategy. Additionally, we propose a hybrid parallelization scheme in DCSolver, which employs different parallel approaches for different DC recursions to achieve a better balance between parallelism and convergence for solvers. To evaluate the effectiveness of DCSolver, we apply it to two preconditioned Krylov subspace solvers and an unstructured mesh Computational Fluid Dynamics (CFD) solver. Our results show that, compared with state-of-the-art methods, DCSolver achieves average time-to-solution speedups of up to 26.19X.
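
To make the baseline concrete, the following is a minimal sketch of classic level-scheduling for a sparse lower-triangular solve, the kind of prior method the abstract contrasts with. It is not DCSolver's algorithm and is not taken from the article. The function names `level_schedule` and `sptrsv_by_levels`, the CSR input layout, and the small example matrix are assumptions made for illustration, and the sketch runs sequentially on the CPU purely to show the dependency structure.

```python
def level_schedule(indptr, indices):
    """Group the rows of a lower-triangular CSR matrix into dependency levels.

    Row i depends on row j whenever it stores an entry in column j < i.
    Rows that share a level have no dependencies on each other.
    """
    n = len(indptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j < i:  # x[i] needs x[j], so row i sits at least one level deeper
                level[i] = max(level[i], level[j] + 1)
    levels = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lvl in enumerate(level):
        levels[lvl].append(i)
    return levels


def sptrsv_by_levels(indptr, indices, data, b):
    """Solve L x = b one level at a time (hypothetical helper, CPU only)."""
    n = len(indptr) - 1
    x = [0.0] * n
    for rows in level_schedule(indptr, indices):
        # On a GPU, the rows of one level could be processed concurrently,
        # with a synchronization before the next level starts.
        for i in rows:
            acc = b[i]
            diag = None
            for k in range(indptr[i], indptr[i + 1]):
                j = indices[k]
                if j == i:
                    diag = data[k]
                else:
                    acc -= data[k] * x[j]
            x[i] = acc / diag
    return x


# Assumed 4x4 example matrix in CSR form:
# [2 0 0 0]
# [0 4 0 0]
# [1 0 1 0]
# [0 2 3 5]
indptr = [0, 1, 2, 4, 7]
indices = [0, 1, 0, 2, 1, 2, 3]
data = [2.0, 4.0, 1.0, 1.0, 2.0, 3.0, 5.0]
b = [2.0, 8.0, 4.0, 33.0]

levels = level_schedule(indptr, indices)
x = sptrsv_by_levels(indptr, indices, data, b)
```

In this example only rows 0 and 1 share a level, so three sequential steps are needed for four rows. Long dependency chains of this kind are what cause the limited parallelism and synchronization overhead the abstract mentions.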

Keywords:
Divide and conquer algorithms; Computer science; Parallel computing; Computational science; Supercomputer; Algorithm

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 36
Citation Normalized Percentile: 0.21

Topics

Parallel Computing and Optimization Techniques
Physical Sciences →  Computer Science →  Hardware and Architecture
Numerical Methods and Algorithms
Physical Sciences →  Computer Science →  Computational Theory and Mathematics
Matrix Theory and Algorithms
Physical Sciences →  Computer Science →  Computational Theory and Mathematics