Message latency and bandwidth
Latency is the time between the initiation of the data transfer in the
sending process and the arrival of the first byte in the receiving process.
Latency often depends on the length of the messages being sent. An
application's messaging behavior can vary greatly depending on whether
it sends a large number of small messages or a few large ones.
Message bandwidth is the reciprocal of the time needed to transfer a
byte. Bandwidth is normally expressed in megabytes per second.
Bandwidth becomes important when message sizes are large.
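A simple way to observe both quantities on a given system is a two-rank
ping-pong timing loop. The program below is a minimal sketch, not part of
HP-MPI; its message size and iteration count are arbitrary choices, and it
must be run with at least two ranks. With a small message the one-way time
approximates latency; with a large one, the derived rate approximates
bandwidth.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define NBYTES (1 << 20)  /* arbitrary message size: 1 MB */
    #define NITERS 100        /* arbitrary repetition count */

    int main(int argc, char *argv[])
    {
        int rank, i;
        char *buf = malloc(NBYTES);
        double start, oneway;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        start = MPI_Wtime();
        for (i = 0; i < NITERS; i++) {
            if (rank == 0) {         /* send first, then wait for echo */
                MPI_Send(buf, NBYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, NBYTES, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {  /* echo each message back */
                MPI_Recv(buf, NBYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, NBYTES, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            }
        }
        oneway = (MPI_Wtime() - start) / (2.0 * NITERS);

        if (rank == 0)
            printf("one-way time %g s, rate %g MB/s\n",
                   oneway, NBYTES / oneway / 1.0e6);

        free(buf);
        MPI_Finalize();
        return 0;
    }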
To improve latency, bandwidth, or both:
• Reduce the number of process communications by designing
applications that have coarse-grained parallelism.
• Use derived, contiguous data types for dense data structures to
eliminate unnecessary byte-copy operations in certain cases. Use
derived data types instead of MPI_Pack and MPI_Unpack if possible.
HP-MPI optimizes noncontiguous transfers of derived data types.
(A column-send sketch follows the example at the end of this list.)
• Use collective operations whenever possible. This eliminates the
overhead of calling MPI_Send and MPI_Recv each time one process
communicates with others. Also, use the HP-MPI collectives rather
than writing your own. (A broadcast sketch follows the example at
the end of this list.)
• Specify the source process rank whenever possible when calling
MPI routines. Using MPI_ANY_SOURCE may increase latency.
• Double-word align data buffers if possible. This improves byte-copy
performance between sending and receiving processes because of
double-word loads and stores. (An aligned-allocation sketch follows
the example at the end of this list.)
• Use MPI_Recv_init and MPI_Startall instead of a loop of
MPI_Irecv calls in cases where requests may not complete
immediately. (A persistent-request rewrite follows the example
below.)
For example, suppose you write an application with the following
code section:
    /* post one receive from every other rank, then wait for all */
    j = 0;
    for (i = 0; i < size; i++) {
        if (i == rank) continue;
        MPI_Irecv(buf[i], count, dtype, i, 0, comm, &requests[j++]);
    }
    MPI_Waitall(size - 1, requests, statuses);
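If some of these receives cannot complete immediately, each later
MPI_Irecv call may have to progress the requests already pending. The
sketch below applies the persistent-request recommendation to the same
example: it posts every receive once with MPI_Recv_init and then starts
them all together. The variable names match the example above; the
MPI_Request_free loop is part of the sketch, not the original example.

    j = 0;
    for (i = 0; i < size; i++) {
        if (i == rank) continue;
        MPI_Recv_init(buf[i], count, dtype, i, 0, comm, &requests[j++]);
    }
    MPI_Startall(size - 1, requests);
    MPI_Waitall(size - 1, requests, statuses);
    /* Persistent requests stay allocated after MPI_Waitall;
       free them when the pattern is no longer needed. */
    for (i = 0; i < size - 1; i++)
        MPI_Request_free(&requests[i]);

When the same receive pattern repeats, only the MPI_Startall and
MPI_Waitall calls repeat; the setup loop runs once, which is where the
savings come from.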
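To illustrate the derived data type recommendation, the sketch below
sends one column of a row-major matrix with MPI_Type_vector instead of
copying it into a scratch buffer with MPI_Pack. The function and its
parameters are illustrative, not part of HP-MPI.

    #include <mpi.h>

    /* Send column `col` of a row-major nrows-by-ncols matrix of
       doubles without an intermediate packing copy. */
    void send_column(double *a, int nrows, int ncols, int col,
                     int dest, MPI_Comm comm)
    {
        MPI_Datatype column;

        /* nrows blocks of one double, spaced ncols elements apart */
        MPI_Type_vector(nrows, 1, ncols, MPI_DOUBLE, &column);
        MPI_Type_commit(&column);
        MPI_Send(&a[col], 1, column, dest, 0, comm);
        MPI_Type_free(&column);
    }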
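To illustrate the collective-operations recommendation, the sketch below
shows the two alternatives for a root rank that distributes the same
buffer to every other rank; the variable names are illustrative.

    /* Point-to-point version: a send loop at the root and a
       matching receive on every other rank. */
    if (rank == root) {
        for (i = 0; i < size; i++)
            if (i != root)
                MPI_Send(buf, count, dtype, i, 0, comm);
    } else {
        MPI_Recv(buf, count, dtype, root, 0, comm, MPI_STATUS_IGNORE);
    }

    /* Collective version: the same data movement in one call,
       leaving the choice of algorithm to HP-MPI. */
    MPI_Bcast(buf, count, dtype, root, comm);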
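To illustrate the alignment recommendation: malloc on most systems
already returns storage aligned for any type, so misalignment usually
arises when a transfer buffer sits at an odd offset inside a larger
allocation. On systems that provide the POSIX posix_memalign routine,
you can request a double-word boundary explicitly; a minimal sketch,
with nbytes as an illustrative size:

    #include <stdlib.h>

    void *buf = NULL;

    /* Request nbytes of storage on an 8-byte boundary.
       posix_memalign returns nonzero on failure. */
    if (posix_memalign(&buf, 8, nbytes) != 0) {
        /* handle the allocation failure */
    }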