HP MLIB User's Guide Vol. 2 7th Ed.
Chapter 11 Introduction to Distributed SuperLU
pdgssvx (pzgssvx) and pdgssvx_ABglobal (pzgssvx_ABglobal) differ in how
the input matrices A and B are stored. For pdgssvx_ABglobal
(pzgssvx_ABglobal), A and B are globally available (replicated) on all
processes, whereas for pdgssvx (pzgssvx), A and B are distributed among
all processes.
If there is sufficient memory, then pdgssvx_ABglobal (pzgssvx_ABglobal)
should be used to solve sparse linear systems, since pdgssvx_ABglobal
(pzgssvx_ABglobal) is faster than pdgssvx (pzgssvx) due to algorithmic
differences.
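The storage difference between the two input layouts can be illustrated
with a minimal sketch. It assumes a contiguous block-row split for the
distributed case, which this guide does not specify. The type row_block
and the functions rows_replicated and rows_distributed are hypothetical
names used only for illustration. They are not part of MLIB or SuperLU.

```c
#include <assert.h>

/* Hypothetical descriptor, not an MLIB or SuperLU type. */
typedef struct {
    int first_row;   /* global index of the first row held locally */
    int num_rows;    /* number of consecutive rows held locally    */
} row_block;

/* Replicated input (the _ABglobal case): every process holds all n rows. */
row_block rows_replicated(int n)
{
    assert(n >= 0);
    row_block blk = { 0, n };
    return blk;
}

/* Distributed input, ASSUMING a contiguous block-row split in which the
 * first (n % nprocs) processes each receive one extra row. */
row_block rows_distributed(int n, int nprocs, int rank)
{
    assert(n >= 0 && nprocs > 0 && rank >= 0 && rank < nprocs);
    int base  = n / nprocs;
    int extra = n % nprocs;
    row_block blk;
    blk.num_rows  = base + (rank < extra ? 1 : 0);
    blk.first_row = rank * base + (rank < extra ? rank : extra);
    return blk;
}
```

Under this assumed split, a matrix of order 10 on 4 processes gives
process 2 the two rows starting at row 6, while in the replicated layout
each process stores all 10 rows. This is why the replicated routines
require more memory per process.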
pdgssvx, pdgssvx_ABglobal, pzgssvx and pzgssvx_ABglobal perform the
following functions:
• Equilibrate the system (scale A’s rows and columns to have unit norm) if A
is poorly scaled.
• Find a row permutation that makes the diagonal of A large relative to the
off-diagonal entries.
• Find a column permutation that preserves the sparsity of the L and U
factors.
• Solve the system AX=B for X by factoring A and then performing forward and
back substitutions.
• Refine the solution X.
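The steps above can be sketched on a small dense system. The code below
is a serial illustration only and is not how the distributed routines are
implemented. It makes several simplifying assumptions: only rows are
equilibrated, the row permutation is chosen by ordinary partial pivoting
during factorization, the sparsity-preserving column permutation is
omitted because a dense matrix has no fill to reduce, and a single
refinement step is taken. All function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <string.h>

#define N 3  /* fixed small order for this sketch */

/* Equilibrate: scale each row of A (and of b) so that its largest
 * entry has magnitude 1. */
void equilibrate_rows(double A[N][N], double b[N])
{
    for (int i = 0; i < N; i++) {
        double big = 0.0;
        for (int j = 0; j < N; j++)
            if (fabs(A[i][j]) > big) big = fabs(A[i][j]);
        assert(big > 0.0);              /* a zero row means A is singular */
        for (int j = 0; j < N; j++) A[i][j] /= big;
        b[i] /= big;
    }
}

/* Factor in place with row exchanges that bring a large entry onto the
 * diagonal. On return, perm[i] is the original index of row i. */
void lu_factor(double LU[N][N], int perm[N])
{
    for (int i = 0; i < N; i++) perm[i] = i;
    for (int k = 0; k < N; k++) {
        int p = k;
        for (int i = k + 1; i < N; i++)
            if (fabs(LU[i][k]) > fabs(LU[p][k])) p = i;
        assert(LU[p][k] != 0.0);
        if (p != k) {
            double tmp[N];
            memcpy(tmp, LU[k], sizeof tmp);
            memcpy(LU[k], LU[p], sizeof tmp);
            memcpy(LU[p], tmp, sizeof tmp);
            int t = perm[k]; perm[k] = perm[p]; perm[p] = t;
        }
        for (int i = k + 1; i < N; i++) {
            LU[i][k] /= LU[k][k];
            for (int j = k + 1; j < N; j++)
                LU[i][j] -= LU[i][k] * LU[k][j];
        }
    }
}

/* Forward substitution (L y = P b) followed by back substitution (U x = y). */
void lu_solve(double LU[N][N], const int perm[N], const double b[N], double x[N])
{
    for (int i = 0; i < N; i++) {
        x[i] = b[perm[i]];
        for (int j = 0; j < i; j++) x[i] -= LU[i][j] * x[j];
    }
    for (int i = N - 1; i >= 0; i--) {
        for (int j = i + 1; j < N; j++) x[i] -= LU[i][j] * x[j];
        x[i] /= LU[i][i];
    }
}

/* One refinement step: r = b - A x, solve A d = r, then x = x + d. */
void refine_once(double A[N][N], double LU[N][N], const int perm[N],
                 const double b[N], double x[N])
{
    double r[N], d[N];
    for (int i = 0; i < N; i++) {
        r[i] = b[i];
        for (int j = 0; j < N; j++) r[i] -= A[i][j] * x[j];
    }
    lu_solve(LU, perm, r, d);
    for (int i = 0; i < N; i++) x[i] += d[i];
}

/* Run the steps in order. A and b are overwritten by their scaled forms. */
void solve_system(double A[N][N], double b[N], double x[N])
{
    double LU[N][N];
    int perm[N];
    equilibrate_rows(A, b);
    memcpy(LU, A, sizeof LU);
    lu_factor(LU, perm);
    lu_solve(LU, perm, b, x);
    refine_once(A, LU, perm, b, x);
}
```

Because only rows are scaled here, the unknown X is unchanged by the
equilibration. Column scaling, as the driver routines also perform,
would additionally require rescaling the computed solution.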
Distributed SuperLU computational routines
The following computational routines can be invoked to directly control the
behavior of SuperLU.
• pdgstrf, pzgstrf: Factorize in parallel.
These routines factorize the input matrix A (or the scaled and permuted A).
They assume that the distributed data structures for L and U factors are
already set up, and the initial values of A are loaded into the data
structures. They can factor non-square matrices.
Currently, A must be globally available on all processes.
• pdgstrs, pdgstrs_Bglobal, pzgstrs, pzgstrs_Bglobal: Triangular solve in
parallel.