White Papers

Abaqus Performance
Dell EMC Ready Solution for HPC Digital Manufacturing: Dassault Systèmes’ Simulia Abaqus Performance
These results are consistent with the Standard results in Figure 4, where the newer Cascade Lake processors
with the most cores performed the best.
While the explicit solver in Abaqus is a straightforward MPI parallel implementation, the standard
solver typically employs a hybrid parallel algorithm using both shared-memory parallel threads and MPI
domain parallelism. The default run mode for the standard solver is to use a single MPI domain per server,
with one parallel thread for each available core on the server. However, the parallel efficiency of the
thread parallelism tends to drop off after 5-10 threads, depending on the model size and features. Abaqus
lets the user place more than one MPI domain on a server, which reduces the number of shared-memory
parallel threads per domain and can increase overall program efficiency. This is activated with the
command line argument “mp_host_split=xx”. There is no general method to determine the optimal number of
MPI domains per server a priori.
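Because the best setting has to be found by trial, it can help to list the settings worth benchmarking first. The following Python sketch is illustrative only and is not part of Abaqus: the function name candidate_splits, the 48-core server size, and the 10-thread cutoff are assumptions chosen to match the 5-10 thread range mentioned above. It enumerates the MPI domain counts that divide the cores evenly and keeps those whose thread count per domain stays at or below the cutoff.

```python
def candidate_splits(cores_per_server, max_threads=10):
    """List (mpi_domains, threads_per_domain) pairs worth benchmarking.

    Hypothetical helper, not an Abaqus API. Only domain counts that divide
    the core count evenly are kept, so every domain gets the same number
    of threads. max_threads is an assumed cutoff, not a documented limit.
    """
    candidates = []
    for domains in range(1, cores_per_server + 1):
        if cores_per_server % domains != 0:
            continue  # uneven split would leave cores idle or unbalanced
        threads = cores_per_server // domains
        if threads <= max_threads:
            candidates.append((domains, threads))
    return candidates


if __name__ == "__main__":
    # Assumed example: a server exposing 48 cores in total.
    for domains, threads in candidate_splits(48):
        print(f"mp_host_split={domains}: {threads} threads per MPI domain")
```

Each printed value is only a candidate for a timing run; the measured solver elapsed time for the actual model decides which one to use.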
Figure 6 shows the effect of modifying the number of MPI domains for the standard benchmarks on a single
server with the 24-core Intel Gold 6252 processor.
Figure 5: Abaqus Explicit Performance. Solver elapsed time (sec) for benchmarks E1 to E6 on the E5-2697Av4, 6142, 6242, 6248, and 6252 processors.