HP-MPI User's Guide (11th Edition)
Example applications
multi_par.f
Appendix A 253
The second outer loop (the column-wise summations) is done in the same
manner. For example, at the beginning of the second step of the
column-wise summations, the rank 2 process receives data from the
rank 1 process, which computed the [3,0] block. The rank 2 process also
sends the last column of the [2,0] block to the rank 3 process. Note that
each process keeps the same blocks for both outer-loop computations.
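The ownership pattern behind this exchange can be made concrete with a
small sketch. It assumes four ranks and assumes that block [rb,cb]
belongs to rank modulo(cb - rb, nprocs). That formula is not taken from
the sample program; it is simply one mapping that agrees with the blocks
named above ([3,0] on rank 1, [2,0] on rank 2). The module, function, and
program names are hypothetical.

```fortran
module twisted_layout
  implicit none
contains
  ! Hypothetical helper: rank that owns block [rb,cb] under the
  ! ASSUMED mapping. modulo() keeps the result in 0..nprocs-1 even
  ! when cb - rb is negative.
  pure integer function block_owner(rb, cb, nprocs)
    integer, intent(in) :: rb, cb, nprocs
    block_owner = modulo(cb - rb, nprocs)
  end function block_owner
end module twisted_layout

program show_owners
  use twisted_layout
  implicit none
  integer, parameter :: nprocs = 4   ! four ranks, as in the text
  integer :: rb, cb

  ! One output line per row block, one entry per column block.
  do rb = 0, nprocs - 1
    print '(4i3)', (block_owner(rb, cb, nprocs), cb = 0, nprocs - 1)
  end do
end program show_owners
```

Under this assumed mapping every rank appears exactly once in each row
block and once in each column block, which is consistent with each
process keeping the same blocks for both outer loops while still having
one block to work on at every step.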
This approach works well on distributed-memory architectures, where
repartitioning would require massive, expensive data communication.
On shared-memory architectures, however, partitioning the compute
region does not imply data distribution. There, the row- and
column-block partitioning method requires just one synchronization at
the end of each outer loop.
For distributed shared-memory architectures, a mix of the two
methods can be effective. The sample program implements the
twisted-data layout method with MPI and the row- and column-block
partitioning method with OpenMP thread directives. In the first case,
the data dependency is easily satisfied because each thread computes
down a different set of columns. In the second case, we still want to
compute down the columns for cache reasons, but to satisfy the data
dependency, each thread computes a different portion of the same column
and the threads work left to right across the rows together.
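The two threading cases can be sketched as follows. This is a minimal
illustration on a small array, not the sample program itself: the module
and subroutine names, the array size, and the initial values are all
assumptions. It also assumes that in the first case each element depends
on the one above it, and in the second case on the one to its left.
Compiled without OpenMP, the directives are treated as comments and the
code runs serially with the same result.

```fortran
module sweeps
  implicit none
contains

  ! First case: a(i,j) depends on a(i-1,j), the element above it in
  ! the same column. Columns are independent, so each thread takes a
  ! different set of columns and runs down them.
  subroutine sum_down_columns(a)
    double precision, intent(inout) :: a(:,:)
    integer :: i, j
    !$omp parallel do private(i, j)
    do j = 1, size(a, 2)
      do i = 2, size(a, 1)
        a(i, j) = a(i-1, j) + a(i, j)
      end do
    end do
    !$omp end parallel do
  end subroutine sum_down_columns

  ! Second case: a(i,j) depends on a(i,j-1), the element to its left.
  ! The column loop is not divided among threads; within one column
  ! the rows are split among them. The implicit barrier at "end do"
  ! keeps the threads in step as they move from column j to j+1.
  subroutine sum_across_rows(a)
    double precision, intent(inout) :: a(:,:)
    integer :: i, j
    !$omp parallel private(i, j)
    do j = 2, size(a, 2)
      !$omp do
      do i = 1, size(a, 1)
        a(i, j) = a(i, j-1) + a(i, j)
      end do
      !$omp end do
    end do
    !$omp end parallel
  end subroutine sum_across_rows

end module sweeps

program demo
  use sweeps
  implicit none
  double precision :: a(6, 5)   ! small illustrative size (assumption)

  a = 1.0d0
  call sum_down_columns(a)      ! now a(i,j) = i
  call sum_across_rows(a)       ! now a(i,j) = i*j
  print *, a(6, 5)              ! value is 30
end program demo
```

In both routines the inner loop runs over the first subscript, so memory
is traversed in Fortran's column-major order. Only the placement of the
worksharing directive changes to match the direction of the dependency.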
implicit none
include 'mpif.h'
integer nrow ! # of rows
integer ncol ! # of columns
parameter(nrow=1000,ncol=1000)
double precision array(nrow,ncol) ! compute region
integer blk ! block iteration counter
integer rb ! row block number
integer cb ! column block number
integer nrb ! next row block number
integer ncb ! next column block number
integer rbs(:) ! row block start subscripts
integer rbe(:) ! row block end subscripts
integer cbs(:) ! column block start subscripts
integer cbe(:) ! column block end subscripts
integer rdtype(:) ! row block communication datatypes
integer cdtype(:) ! column block communication datatypes
integer twdtype(:) ! twisted distribution