B. Raghu Kumar, Kalluri Eswar, P. Sadayappan, Chien‐Hung Huang
A judiciously chosen symmetric permutation can significantly reduce the amount of storage and computation for the Cholesky factorization of sparse matrices. On distributed memory machines, the issue of mapping data and computation onto processors is also important. Previous research on ordering for parallelism has focussed on idealized measures like execution time on an unbounded number of processors, with zero communication costs. In this paper, we propose an ordering and mapping algorithm that attempts to minimize communication and performs load balancing of work among the processors. Performance results on an Intel iPSC/860 hypercube are presented to demonstrate its effectiveness.
Burkhard Monien, Jürgen P. Schulze