A 4KB request size is used because it aligns with Lustre's 4KB file system block size and is representative of small-block accesses in a random workload. Performance is measured in I/O operations per second (IOPS).
Figure 10 shows that random writes peak at a little over 10K IOPS with 240 threads, while random reads peak at 65K IOPS with 192 threads. Random read IOPS increase rapidly from 120 to 192 threads and then decline slightly before leveling off. Because writes require a file lock per OST accessed, this saturation is expected. Reads take advantage of Lustre's ability to grant overlapping read extent locks for part or all of a file.
Figure 10: N-to-N Random reads and writes
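For reference, a hedged sketch of this kind of N-to-N random test is shown below, assuming IOzone as the benchmark driver; the thread count, per-thread file size, and client list file are placeholder assumptions rather than the exact parameters of this study.

    # Random read/write test (-i 2) with 4KB records, reporting ops/sec (-O).
    # Assumes test files already exist from a prior write pass (-w keeps them).
    # Thread count, file size, and clients.txt are placeholder values.
    iozone -i 2 -c -e -w -O -I -r 4k -s 4g -t 192 -+m ./clients.txt

Here -I requests direct I/O to reduce client-side caching effects, and -c and -e include close and flush times in the measurement.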
4.3 IOR N-to-1 Reads and Writes
Performance analysis of the Dell Storage for HPC with Intel EE for Lustre solution with reads and writes to a single file was performed with the IOR benchmarking tool. IOR accommodates MPI communications for parallel operations and has support for manipulating Lustre striping. IOR supports several different I/O interfaces for working with files. For the purposes of our tests, we used the POSIX interface to exclude the advanced features and associated overhead of the other available I/O interfaces. This gives us an opportunity to review the file system and hardware performance independently of those additional enhancements.
IOR benchmark version 3.0.1 was used in this study. The MPI stack used was Intel MPI version 5.0
Update 1.
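As a minimal, hedged illustration of selecting the POSIX interface (the process count and file path below are placeholder assumptions, not the study's parameters), an IOR run under Intel MPI might be launched as:

    # Launch IOR with the POSIX I/O interface (-a POSIX).
    # -w and -r run the write and read phases; -e issues an fsync after
    # writes so the reported rate reflects data reaching the servers.
    mpirun -np 64 ior -a POSIX -w -r -e -o /mnt/lustre/testdir/testfile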
The configuration for the write test used a directory with striping characteristics designed to stripe files across all 24 OSTs with a stripe size of 4MB. Therefore, all threads write to a single file that is striped across all 24 OSTs. In this test, the request size for Lustre was set to 1MB; however, a transfer size of 4MB was used to match the stripe size of the target file.
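A hedged sketch of this setup follows; the mount point, file name, per-task block size, and process count are assumptions, while the stripe count (24), stripe size (4MB), and transfer size (4MB) match the values described above.

    # Stripe the target directory across all 24 OSTs with a 4MB stripe size.
    lfs setstripe -c 24 -S 4m /mnt/lustre/iortest
    # N-to-1 write: without -F, all tasks share one file; the 4MB transfer
    # size (-t) matches the 4MB stripe size of the target file.
    mpirun -np 192 ior -a POSIX -w -e -t 4m -b 8g -o /mnt/lustre/iortest/sharedfile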