Troubleshooting guide

and on host2 type:
gm_allsize --both-ways --bandwidth \
--remote-host=host1 --size=15 --geometric
where the length of the messages sent is 2**(size - 8) bytes. This test streams
packets in both directions (both nodes are always sending), so the bandwidth that
GM reports is the sum of the send and receive bandwidths.
The output from this command consists of two columns of data: the first
column lists the message size (in bytes) and the second column lists the
bandwidth (in MB/s).
4. Run a sample benchmark (e.g., HPL) as a single-node run on each of the nodes
in the cluster to ensure that all nodes report consistent performance. If they do
not, there could be an issue with a particular CPU on one of the hosts.
5. Run a sample benchmark (e.g., HPL) on equally-sized subsets of nodes. Make
sure that performance is consistent across all subsets of nodes. If one subset
is slower than the others, use a divide-and-conquer approach on that subset
(split it in half, rerun the benchmark on each half, and repeat on the slower
half) to isolate the slower node.
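
The consistency check in step 4 and the divide-and-conquer search in step 5 can
be sketched in Python as shown here. This is only an illustration and is not
part of the GM distribution. The names find_outliers, isolate_slow_node, and
run_benchmark are hypothetical, the 5% tolerance is an arbitrary assumption, and
the search assumes that exactly one node is slow and that a benchmark score
divided by the node count is comparable between two halves.

```python
from statistics import median
from typing import Callable, Dict, List, Sequence


def find_outliers(results: Dict[str, float], tolerance: float = 0.05) -> List[str]:
    """Step 4: return hosts whose single-node score falls more than
    `tolerance` (a fraction, assumed 5% here) below the median score."""
    if not results:
        return []
    typical = median(results.values())
    threshold = typical * (1.0 - tolerance)
    return sorted(host for host, score in results.items() if score < threshold)


def isolate_slow_node(
    nodes: Sequence[str],
    run_benchmark: Callable[[Sequence[str]], float],
) -> str:
    """Step 5: repeatedly halve a slow subset and keep the slower half.

    `run_benchmark` is a hypothetical callable that runs the benchmark on
    the given nodes and returns a score where higher is better.
    """
    candidates = list(nodes)
    if not candidates:
        raise ValueError("need at least one node")
    while len(candidates) > 1:
        mid = len(candidates) // 2
        left, right = candidates[:mid], candidates[mid:]
        # Normalize by node count so halves of unequal size can be compared.
        left_score = run_benchmark(left) / len(left)
        right_score = run_benchmark(right) / len(right)
        candidates = left if left_score < right_score else right
    return candidates[0]


if __name__ == "__main__":
    # Made-up scores for demonstration only; node5 is the slow one.
    fake_scores = {f"node{i}": 10.0 for i in range(8)}
    fake_scores["node5"] = 6.0

    def fake_benchmark(subset: Sequence[str]) -> float:
        return sum(fake_scores[n] for n in subset)

    print(find_outliers(fake_scores))
    print(isolate_slow_node(sorted(fake_scores), fake_benchmark))
```

In practice run_benchmark would launch the benchmark on the listed hosts with
whatever job launcher the site uses and parse the reported result. That part is
site-specific and is deliberately left out of the sketch.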
© 2007 Myricom, Inc. DRAFT