Laurie A. Hulbert, Earl Zmijewski
A new parallel algorithm for computing the Cholesky factorization of a large sparse positive-definite matrix on a message-passing multiprocessor is developed. The algorithm attempts to reduce the communication overhead by redistributing the computational load and by repeatedly combining the effect of many messages into a single message. It is demonstrated experimentally that, for problems ordered and partitioned among the processors using nested dissection, the new algorithm communicates significantly fewer messages than a more straightforward approach. Because of this reduction in communication, for the test problems on an Intel iPSC/2 hypercube, the new algorithm is typically at least 20 percent faster. Theoretically, it is shown that in factoring a $k \times k$ grid on $p$ processors, the new algorithm sends $\Theta(pk\log_2 p)$ messages, compared to $\Theta(pk\log_2 k)$ messages for the straightforward algorithm.
Burkhard Monien, Jürgen P. Schulze
Edward Rothberg, Robert Schreiber
John R. Gilbert, Robert Schreiber
Thomas Rauber, Gudula Rünger, Carsten Scholtes