White Papers
Altair AcuSolve Performance
Dell EMC Ready Solution for HPC Digital Manufacturing: Altair Performance
Again, the Riser (R) model shows better overall parallel scaling than the larger Windmill (W) and Nozzle (N)
models, primarily because of cache effects. All three models behave similarly as the number of shared-memory
threads is varied. Multiple threads offer little benefit on fewer than four nodes. At eight nodes the benefit
is noticeable, and more threads generally perform better, up to a point. A reasonable rule of thumb for thread
parallelism is therefore to set the number of threads equal to the number of nodes used in the run (e.g., 4
threads for a 4-node run). This may not be optimal in every situation, but it should give reasonable
performance.
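The rule of thumb can be sketched as a small helper that splits a run into MPI ranks and shared-memory threads per rank. This is only an illustration: the function name `hybrid_layout` is hypothetical, the value of 48 cores per node is read from the axis labels of Figure 6, and the fallback to a divisor of the per-node core count is an added assumption, not something the study prescribes. It does not show any AcuSolve command-line options.

```python
def hybrid_layout(nodes: int, cores_per_node: int) -> dict:
    """Return a hypothetical MPI-rank / thread split for a hybrid run.

    Rule of thumb from the text: threads per rank == number of nodes.
    Assumption (not from the source): if that count does not divide the
    cores on a node evenly, fall back to the largest divisor not
    exceeding it, so every rank's threads stay on a single node.
    """
    if nodes < 1 or cores_per_node < 1:
        raise ValueError("nodes and cores_per_node must be positive")
    # Threads per rank can never exceed the cores available on one node.
    limit = min(nodes, cores_per_node)
    threads = max(t for t in range(1, limit + 1) if cores_per_node % t == 0)
    total_cores = nodes * cores_per_node
    return {
        "threads_per_rank": threads,
        "mpi_ranks": total_cores // threads,
        "total_cores": total_cores,
    }


if __name__ == "__main__":
    # Node counts used in the study; 48 cores per node is an assumption
    # taken from the figure's axis labels.
    for n in (1, 2, 4, 8):
        print(n, hybrid_layout(n, cores_per_node=48))
```

Under these assumptions, the total MPI rank count stays at 48 for 1, 2, 4, and 8 nodes, and only the thread count per rank grows with the node count.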
Figure 6: AcuSolve Hybrid Parallel Scaling on EBB
[Plot not reproduced. Y-axis: performance relative to 32 cores (1.00 to 8.00). X-axis: number of cores (number of nodes): 48(1), 96(2), 192(4), 384(8). Series: R-1, R-2, R-4, R-8; W-1, W-2, W-4, W-8; N-1, N-2, N-4, N-8.]