HP MLIB User's Guide Vol. 1 7th Ed.
Chapter 3 Basic Matrix Operations 227
Name DGEMMS/ZGEMMS
Strassen matrix-matrix multiply
Purpose These subprograms use Strassen’s method to compute the matrix-matrix
product AB, where A is an m-by-k matrix and B is a k-by-n matrix. Strassen’s
method is an algorithm for matrix multiplication that, under certain
circumstances, uses fewer than mnk multiplications and additions. These
subprograms have argument lists identical to the standard Level 3 BLAS
subprograms DGEMM and ZGEMM in VECLIB and CGEMM and SGEMM in
VECLIB8. To convert a program to call a Strassen subprogram instead of a
standard matrix multiply, you therefore need only change the subprogram name.
With consistent upper or lower case coding, a simple preprocessor directive can
select standard or Strassen matrix multiply calls. Work area management is
done by the subprograms.
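
The preprocessor selection mentioned above might look like the following sketch. It assumes the source file is run through a C-style preprocessor (many compilers do this for files with an upper-case suffix such as .F90) and that the program is linked against a library supplying both DGEMM and DGEMMS. The macro name USE_STRASSEN is hypothetical and is not part of MLIB. Because preprocessor macros are case-sensitive while Fortran itself is not, the substitution only reaches call sites spelled exactly like the macro, which is why consistent case matters.

```fortran
! Sketch only: this file must be preprocessed and linked against a
! library that provides both DGEMM and DGEMMS.
! USE_STRASSEN is a hypothetical macro name chosen for this example.
#ifdef USE_STRASSEN
#define DGEMM DGEMMS
#endif
program select_gemm
  implicit none
  integer, parameter :: m = 2, n = 2, k = 2
  real(kind(1.0d0)) :: a(m, k), b(k, n), c(m, n)

  ! Column-major initialization of two small test matrices.
  a = reshape([1.0d0, 3.0d0, 2.0d0, 4.0d0], [m, k])
  b = reshape([5.0d0, 7.0d0, 6.0d0, 8.0d0], [k, n])
  c = 0.0d0

  ! The call site is written once, in upper case to match the macro.
  ! With USE_STRASSEN defined it becomes a call to DGEMMS; otherwise
  ! it remains a call to the standard DGEMM.
  call DGEMM('N', 'N', m, n, k, 1.0d0, a, m, b, k, 0.0d0, c, m)
  print *, c
end program select_gemm
```

Defining or omitting the macro on the compiler command line then switches every upper-case DGEMM call in the file at once, without touching the argument lists.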
By using Strassen’s method, these subprograms can be considerably faster than
their VECLIB and VECLIB8 counterparts. Refer to “Notes” for details. In
addition to computing the matrix-matrix product AB, A can be replaced by $A^T$
or $A^*$, where A is a k-by-m matrix, and B can be replaced by $B^T$ or $B^*$,
where B is an n-by-k matrix. Here, $A^T$ and $B^T$ are the transposes and
$A^*$ and $B^*$ are the
conjugate-transposes of A and B, respectively. The product can be stored in the
result matrix (which is always of size m-by-n) or optionally can be added to or
subtracted from it. This is handled in a convenient, but general, way by two
scalar arguments, α and β, which are used as multipliers of the matrix product
and the result matrix. Specifically, these subprograms compute matrix products
of the forms:

$$
\begin{aligned}
C &\leftarrow \alpha AB + \beta C, & C &\leftarrow \alpha A^T B + \beta C, & C &\leftarrow \alpha A^* B + \beta C, \\
C &\leftarrow \alpha AB^T + \beta C, & C &\leftarrow \alpha A^T B^T + \beta C, & C &\leftarrow \alpha A^* B^T + \beta C, \\
C &\leftarrow \alpha AB^* + \beta C, & C &\leftarrow \alpha A^T B^* + \beta C, & C &\leftarrow \alpha A^* B^* + \beta C.
\end{aligned}
$$

Refer to “F_SGEMM/F_DGEMM/F_CGEMM/F_ZGEMM” on page 362 for a
description of the equivalent BLAS Standard subprograms.
Usage VECLIB:
CHARACTER*1 transa, transb
INTEGER*4 m, n, k, lda, ldb, ldc
REAL*8 alpha, beta, a(lda, *), b(ldb, *), c(ldc, n)
CALL DGEMMS(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
CHARACTER*1 transa, transb
INTEGER*4 m, n, k, lda, ldb, ldc
COMPLEX*16 alpha, beta, a(lda, *), b(ldb, *), c(ldc, n)
CALL ZGEMMS(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
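
To make the shape conventions concrete, the following sketch evaluates one of the forms listed in this section, the transposed-A update of C with α = 2 and β = 1, using only Fortran intrinsics. It is a reference illustration of what is computed, not of how DGEMMS computes it, and it does not call MLIB. The program name, the test data, and the DGEMMS call shown in a comment are assumptions made for this example. Note that A is declared k-by-m because it is used transposed, while C is always m-by-n.

```fortran
program gemm_forms
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 2, n = 2, k = 3
  real(dp), parameter :: alpha = 2.0_dp, beta = 1.0_dp
  ! Hand-computed result, stored column by column.
  real(dp), parameter :: expected(m, n) = &
      reshape([9.0_dp, 21.0_dp, 5.0_dp, 11.0_dp], [m, n])
  real(dp) :: a(k, m), b(k, n), c(m, n)

  ! A is stored k-by-m because the transposed form is requested.
  a = reshape([1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp, 5.0_dp, 6.0_dp], [k, m])
  b = reshape([1.0_dp, 0.0_dp, 1.0_dp, 0.0_dp, 1.0_dp, 0.0_dp], [k, n])
  c = 1.0_dp

  ! Reference evaluation of C <- alpha * transpose(A) * B + beta * C.
  ! Assuming the standard GEMM argument conventions, the corresponding
  ! library call (not executed here) would be:
  !   call dgemms('T', 'N', m, n, k, alpha, a, k, b, k, beta, c, m)
  c = alpha * matmul(transpose(a), b) + beta * c

  ! Stop with an error if the result differs from the hand computation.
  if (any(abs(c - expected) > 1.0e-12_dp)) error stop "unexpected result"

  ! Print C one row per line.
  print '(2f8.1)', transpose(c)
end program gemm_forms
```

For the complex forms, the conjugate-transpose can be written the same way as transpose(conjg(a)). Choosing α = -1 and β = 1 instead subtracts the product from the existing contents of C.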