4 Performance evaluation
Our performance studies of the solution use Mellanox HDR100 data networks. The performance testing
objectives were to quantify the capabilities of the solution, identify performance peaks, and determine the
most appropriate methods for scaling. We ran multiple performance studies, stressed the configuration with
different types of workloads to determine the limits of performance, and defined the sustainability of that
performance.
We generally maintained a standard and consistent testing environment and methodology. In some areas,
we purposely optimized the server or storage configuration and took measures to limit caching effects.
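
As an illustration of the kind of cache-limiting measure referred to above (the exact steps are not listed in
this section, so treat this as an assumption rather than the procedure used for these results), the Linux page
cache can be dropped on each client between benchmark iterations:

    # Flush dirty pages, then drop the page cache, dentries, and inodes.
    # Run as root on every client node before each benchmark iteration.
    sync
    echo 3 > /proc/sys/vm/drop_caches

Using aggregate file sizes larger than client memory is another common way to keep cached data from
inflating the measured results.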
4.1 Large base configuration
We performed the tests on the solution configuration described in Table 1. The following table details the
client test bed that we used to provide the I/O workload:
Client configuration

Component                     Specification
Operating system              Red Hat Enterprise Linux Server release 7.6 (Maipo)
Kernel version                3.10.0-957.el7.x86_64
Servers                       8x Dell EMC PowerEdge R840
BIOS version                  2.4.7
Mellanox OFED version         4.7-3.2.9.0
BeeGFS file system version    7.2 (beta2)
Number of physical nodes      8
Processors                    4 x Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40 GHz, 24 cores
Memory                        24 x 16 GB DDR4 2933 MT/s DIMMs (384 GB)
Our performance analysis focused on these key performance characteristics:
- Throughput: data sequentially transferred, in GB/s
- I/O operations per second (IOPS)
- Metadata operations per second (OP/s)
The goal was a broad but accurate overview of the capabilities of the solution using the Mellanox InfiniBand
HDR100. We used the IOzone, IOR, and MDtest benchmarks. IOzone uses an N-to-N file-access method:
every thread of the benchmark (N processes spread across the clients) writes to a different file (N files) on
the storage system. For examples of the commands that we used to run these benchmarks, see Appendix B,
Benchmark command reference.
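
The sketch below shows what a distributed N-to-N IOzone sequential-write run of this kind can look like. It is
illustrative only; the thread count, file and record sizes, and the client-list path are assumptions, and the exact
commands used for the published results are listed in Appendix B.

    # Distributed N-to-N sequential write with IOzone (illustrative values).
    # /root/clientlist is a placeholder cluster-mode file: each line names a
    # client host, a working directory on the BeeGFS mount (for example
    # /mnt/beegfs), and the path to the iozone binary on that client.
    iozone -i 0 -c -e -w -r 1m -s 8g -t 16 -+n -+m /root/clientlist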
We ran each set of tests over a range of client counts to test the scalability of the solution. The number of
simultaneous physical clients in each test ranged from one to eight. For thread counts up to eight, each
thread ran on a separate physical compute node (one thread per node). Thread counts above eight were
reached by increasing the number of threads per client evenly across all clients. For instance, at 128 threads,
each of the eight clients ran 16 threads.
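
As a concrete illustration of this scaling scheme (the hostnames, hostfile path, and mount point are
assumptions, and the real commands are in Appendix B), a 128-thread MPI-based benchmark such as IOR
can be launched as 16 processes on each of the eight clients:

    # 128 MPI processes spread evenly across eight clients (16 per client).
    # /root/hosts is a placeholder hostfile listing each client, for example:
    #   node01 slots=16
    #   ...
    #   node08 slots=16
    mpirun -np 128 --hostfile /root/hosts \
        ior -a POSIX -b 8g -t 1m -F -w -r -i 1 -o /mnt/beegfs/ior.test

The -F option gives the file-per-process (N-to-N) access pattern described above, with each of the 128
processes writing to and reading from its own file on the BeeGFS mount.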