Altair Feko Performance
Dell EMC Ready Solution for HPC Digital Manufacturing Altair Performance
All the models show good scaling when running a single job on up to 8 nodes. Note that Feko can
efficiently distribute the problem data across nodes for larger direct full-wave problems. The larger
Sedan examples solved at 1.6 GHz could not fit within a single 192 GB node; at that frequency, four nodes
were required to carry out the job. The performance benefit of processors with more cores, such as the 6254,
diminishes as the problem is run across more nodes, but the relative speedup is better maintained
for the higher-frequency models.
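The memory limit described above can be illustrated with a rough estimate. A direct Method of Moments solve stores a dense N x N matrix, so memory grows with the square of the number of unknowns, and for a surface mesh the number of unknowns itself grows roughly with the square of the frequency. The sketch below is only an approximation under stated assumptions: 16 bytes per matrix entry (double-precision complex), a hypothetical 90% of node memory usable for the matrix, and hypothetical unknown counts. None of these values come from the benchmarks, and the function names are illustrative rather than part of Feko.

```python
import math

# Assumption: each matrix entry is a double-precision complex number (2 x 8 bytes).
BYTES_PER_ENTRY = 16


def dense_matrix_gb(num_unknowns: int, bytes_per_entry: int = BYTES_PER_ENTRY) -> float:
    """Approximate storage for a dense N x N matrix, in GB (1 GB = 1e9 bytes)."""
    return num_unknowns ** 2 * bytes_per_entry / 1e9


def min_nodes(num_unknowns: int,
              node_memory_gb: float = 192.0,
              usable_fraction: float = 0.9) -> int:
    """Smallest node count whose combined usable memory can hold the matrix.

    usable_fraction is a hypothetical allowance for the operating system and
    solver workspace; it is not a measured value.
    """
    required_gb = dense_matrix_gb(num_unknowns)
    return math.ceil(required_gb / (node_memory_gb * usable_fraction))


if __name__ == "__main__":
    # Hypothetical unknown counts, chosen only to show the trend.
    for n in (100_000, 200_000):
        print(f"N={n}: {dense_matrix_gb(n):.0f} GB, at least {min_nodes(n)} node(s)")
```

Doubling the number of unknowns quadruples the matrix size, which is why a model that fits on one node at a lower frequency can need several nodes at a higher one.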
Figure 15 shows the performance of the benchmark model “F5” at both low- and medium-range frequencies.
This model is much larger than the Sedan model described above, and the benchmarks were carried out using
the iterative Multilevel Fast Multipole Method (MLFMM). The MLFMM is very efficient at solving very large
full-wave problems and requires a very fast interconnect. The problem size only warranted presenting results
computed on a single node. These benchmarks were carried out on a 6142-based server equipped with
768 GB of memory, as neither simulation would fit in 192 GB of memory.
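The figures report performance relative to a baseline run (one 6142 node for the Sedan model, 16 cores for F5). As a minimal sketch of how such values can be derived from wall-clock times, the helper functions below compute relative performance and the corresponding parallel efficiency. The function names and the timings in the usage example are hypothetical placeholders, not measured results.

```python
def relative_performance(baseline_seconds: float, run_seconds: float) -> float:
    """Performance relative to the baseline: values above 1.0 mean faster."""
    return baseline_seconds / run_seconds


def parallel_efficiency(baseline_seconds: float, run_seconds: float,
                        baseline_units: int, run_units: int) -> float:
    """Fraction of ideal scaling achieved.

    'Units' can be cores or nodes; ideal scaling from baseline_units to
    run_units would give a relative performance of run_units / baseline_units.
    """
    ideal = run_units / baseline_units
    return relative_performance(baseline_seconds, run_seconds) / ideal


if __name__ == "__main__":
    # Hypothetical timings: 1000 s on 16 cores, 625 s on 32 cores.
    perf = relative_performance(1000.0, 625.0)
    eff = parallel_efficiency(1000.0, 625.0, baseline_units=16, run_units=32)
    print(f"relative performance: {perf:.2f}, efficiency: {eff:.0%}")
```

Reporting efficiency alongside relative performance makes it easier to compare scaling across runs that use different baselines.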
Figure 14: Feko Sedan (MoM) Parallel Scaling
[Plot. Y-axis: performance relative to 1-node 6142. X-axis: 1-node, 2-node, 4-node, 8-node. Series: 700_6142, 1100_6142, 1600_6142, 700_6252, 1100_6252, 1600_6252.]
Figure 15: Feko (MLFMM) Parallel Scaling on a Single Node
[Plot. Y-axis: performance relative to 16 cores. X-axis: 16-core, 20-core, 24-core, 32-core. Series: F5_low, F5_medium.]