
Altair Feko Performance

Dell EMC Ready Solution for HPC Digital Manufacturing—Altair Performance

All the models show good scaling when running a single job on up to 8 nodes. Note that Feko can efficiently distribute the problem dataset across nodes for larger direct full-wave problems. The larger sedan example, solved at 1.6 GHz, could not fit within a single 192 GB node; four nodes were required to carry out the job. The performance benefit of processors with higher core counts, such as the 6252, diminishes as the problem is run across more nodes, but the relative speedup is better maintained for the higher-frequency models.
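A rough memory estimate illustrates why a direct MoM problem can outgrow a single node. The sketch below assumes the system matrix is stored as a dense N x N array of double-precision complex values (16 bytes per entry) and ignores all other solver storage. The unknown counts and the `usable_fraction` parameter are hypothetical and are not taken from the sedan benchmark.

```python
import math

BYTES_PER_COMPLEX128 = 16  # one double-precision complex matrix entry


def mom_matrix_gib(num_unknowns: int) -> float:
    """Memory in GiB for a dense N x N complex double matrix (assumed storage)."""
    return num_unknowns ** 2 * BYTES_PER_COMPLEX128 / 2 ** 30


def min_nodes(num_unknowns: int, node_mem_gib: float,
              usable_fraction: float = 0.9) -> int:
    """Smallest node count whose combined usable memory holds the matrix.

    usable_fraction is a hypothetical allowance for the OS and other data.
    """
    usable_gib = node_mem_gib * usable_fraction
    return max(1, math.ceil(mom_matrix_gib(num_unknowns) / usable_gib))


if __name__ == "__main__":
    # Hypothetical unknown counts, chosen only to show the N^2 growth.
    for n in (50_000, 100_000, 200_000):
        print(f"N={n:>7}: {mom_matrix_gib(n):7.1f} GiB, "
              f"nodes needed at 192 GB each: {min_nodes(n, 192)}")
```

Because the matrix grows with the square of the number of unknowns, doubling the unknown count quadruples the memory requirement, which is why a higher-frequency mesh of the same geometry can move from one node to several.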

Figure 15 shows the performance of the benchmark model “F5” at both low- and medium-range frequencies. This model is much larger than the Sedan model described above, and the benchmarks were carried out using the iterative Multilevel Fast Multipole Method (MLFMM). The MLFMM is very efficient at solving very large full-wave problems and requires a very fast interconnect. The problem size warranted presenting only results computed on a single node. These benchmarks were carried out on a 6142-based server equipped with 768 GB of memory, as neither simulation would fit in 192 GB of memory.

Figure 14: Feko Sedan (MoM) Parallel Scaling

[Chart: performance relative to 1-node 6142 versus node count (1, 2, 4, and 8 nodes). Series: 700_6142, 1100_6142, 1600_6142, 700_6252, 1100_6252, 1600_6252.]
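The scaling results for the sedan model are expressed as performance relative to a single-node 6142 run. A minimal sketch of that normalisation follows, assuming elapsed wall-clock times are available for each node count. The function names and the timings are placeholders for illustration, not measured results from this study.

```python
def relative_performance(baseline_seconds: float, run_seconds: float) -> float:
    """Speedup of a run relative to the baseline run (higher is better)."""
    return baseline_seconds / run_seconds


def parallel_efficiency(speedup: float, nodes: int, baseline_nodes: int = 1) -> float:
    """Fraction of ideal linear scaling achieved at the given node count."""
    return speedup / (nodes / baseline_nodes)


def scaling_table(timings: dict, baseline_nodes: int = 1) -> list:
    """Return (nodes, speedup, efficiency) rows for a {nodes: seconds} mapping."""
    base = timings[baseline_nodes]
    rows = []
    for nodes in sorted(timings):
        speedup = relative_performance(base, timings[nodes])
        rows.append((nodes, speedup,
                     parallel_efficiency(speedup, nodes, baseline_nodes)))
    return rows


if __name__ == "__main__":
    # Placeholder timings in seconds; NOT taken from the benchmark.
    example_timings = {1: 1000.0, 2: 540.0, 4: 300.0, 8: 180.0}
    for nodes, speedup, eff in scaling_table(example_timings):
        print(f"{nodes}-node: speedup {speedup:.2f}, efficiency {eff:.0%}")
```

The `baseline_nodes` parameter is included because a case that does not fit on one node has no single-node timing, so its efficiency can only be computed against the smallest node count that actually ran.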

Figure 15: Feko (MLFMM) Parallel Scaling on a Single Node

[Chart: performance relative to 16 cores versus core count (16, 20, 24, and 32 cores). Series: F5_low, F5_medium.]