A 4KB request size is used because it aligns with Lustre's 4KB file system block size and is representative of small-block accesses in a random workload. Performance is measured in I/O operations per second (IOPS).
Figure 10 shows that random writes peak at a little over 10K IOPS with 240 threads, while random reads peak at 65K IOPS with 192 threads. Random read IOPS increase rapidly from 120 to 192 threads and then decline slightly before leveling off. Because writes require a file lock per OST accessed, this saturation is expected. Reads take advantage of Lustre's ability to grant overlapping read extent locks for part or all of a file.
Figure 10: N-to-N Random reads and writes
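For reference, a hedged sketch of this kind of N-to-N random test is shown below, assuming IOzone as the benchmark driver; the thread count, per-thread file size, and client list file are placeholder assumptions rather than the exact parameters of this study.

    # Random read/write test (-i 2) with 4KB records, reporting ops/sec (-O).
    # Assumes test files already exist from a prior write pass (-w keeps them).
    # Thread count, file size, and clients.txt are placeholder values.
    iozone -i 2 -c -e -w -O -I -r 4k -s 4g -t 192 -+m ./clients.txt

Here -I requests direct I/O to reduce client-side caching effects, and -c and -e include close and flush times in the measurement.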
4.3 IOR N-to-1 Reads and Writes
Performance analysis of the Dell Storage for HPC with Intel EE for Lustre solution with reads and writes to a single file was performed with the IOR benchmarking tool. IOR accommodates MPI communications for parallel operations and has support for manipulating Lustre striping. IOR supports several different I/O interfaces for working with files. For the purposes of our tests, we used the POSIX interface to exclude the advanced features and associated overhead of the other available I/O interfaces. This gives us an opportunity to review the file system and hardware performance independently of those additional enhancements.
IOR benchmark version 3.0.1 was used in this study. The MPI stack used was Intel MPI version 5.0
Update 1.
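As a minimal, hedged illustration of selecting the POSIX interface (the process count and file path below are placeholder assumptions, not the study's parameters), an IOR run under Intel MPI might be launched as:

    # Launch IOR with the POSIX I/O interface (-a POSIX).
    # -w and -r run the write and read phases; -e issues an fsync after
    # writes so the reported rate reflects data reaching the servers.
    mpirun -np 64 ior -a POSIX -w -r -e -o /mnt/lustre/testdir/testfile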
The configuration for the write test used a directory with striping characteristics designed to stripe files across all 24 OSTs with a stripe size of 4MB. Therefore, all threads write to a single file that is striped across all 24 OSTs. In this test, the request size for Lustre was set to 1MB; however, a transfer size of 4MB was used to match the stripe size of the target file.
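A hedged sketch of this setup follows; the mount point, file name, per-task block size, and process count are assumptions, while the stripe count (24), stripe size (4MB), and transfer size (4MB) match the values described above.

    # Stripe the target directory across all 24 OSTs with a 4MB stripe size.
    lfs setstripe -c 24 -S 4m /mnt/lustre/iortest
    # N-to-1 write: without -F, all tasks share one file; the 4MB transfer
    # size (-t) matches the 4MB stripe size of the target file.
    mpirun -np 192 ior -a POSIX -w -e -t 4m -b 8g -o /mnt/lustre/iortest/sharedfile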