User's Guide

Managing Lustre I/O with the Snapshot Library [6]

6.1 About the Snapshot Library

The Cray XMT snapshot library provides a high-speed bulk data transfer facility that
moves data between memory regions within an MTK application and files hosted on
the XMT Linux service partition. The primary use of the snapshot library is to load
and save large data sets stored on a Lustre file system. For example, an application
might use the snapshot library to load a large data set at the beginning of a run,
process the data, and then save the processed data to a file at the end of the run.
An application might also use the snapshot library to save intermediate copies of
the processed data during the course of a run.
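The load, process, checkpoint, and save pattern described above can be sketched in
portable C. This is only an illustration of the workflow, not the snapshot library
itself: load_buffer, save_buffer, process, and run are hypothetical names, standard
C file I/O stands in for the bulk transfer facility, and the doubling step is a
placeholder for real processing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bulk load: read exactly count bytes from the
 * file at path into buf. Returns 0 on success, -1 on failure. */
int load_buffer(const char *path, void *buf, size_t count)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(buf, 1, count, f);
    fclose(f);
    return got == count ? 0 : -1;
}

/* Hypothetical stand-in for a bulk save: write exactly count bytes from buf
 * to the file at path. Returns 0 on success, -1 on failure. */
int save_buffer(const char *path, const void *buf, size_t count)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t put = fwrite(buf, 1, count, f);
    int rc = fclose(f);
    return (put == count && rc == 0) ? 0 : -1;
}

/* Placeholder processing step: double every element in place. */
void process(int64_t *data, size_t n)
{
    assert(data != NULL);
    for (size_t i = 0; i < n; i++)
        data[i] *= 2;
}

/* Load n elements, process them, save an intermediate copy, process again,
 * then save the final result. Returns 0 on success, -1 on any failure. */
int run(const char *in_path, const char *ckpt_path, const char *out_path,
        size_t n)
{
    size_t bytes = n * sizeof(int64_t);
    int64_t *data = malloc(bytes);
    if (data == NULL)
        return -1;

    int rc = load_buffer(in_path, data, bytes);
    if (rc == 0) {
        process(data, n);
        rc = save_buffer(ckpt_path, data, bytes);   /* intermediate copy */
    }
    if (rc == 0) {
        process(data, n);
        rc = save_buffer(out_path, data, bytes);    /* final result */
    }
    free(data);
    return rc;
}
```

A caller would invoke run with an input file, a checkpoint file, and an output
file. In an actual XMT application, the two stand-in functions would be replaced
by the corresponding snapshot library calls.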

The snapshot library uses the Fast IO (FIO) mechanism on the compute partition to
transfer data, in parallel, to and from files on the service partition. Instances of
a helper program called fsworker provide the file system access on login nodes;
multiple instances of fsworker can be used in parallel to provide higher throughput.
Figure 1 shows the most common data communication paths between an application
using the snapshot library and a file on the service partition. The data moves, in
four distinct stages, between a global memory buffer in the application and a file
on a Lustre file system hosted by the service partition.

Figure 1. Snapshot Library Data Paths

[Figure: diagram with the labels Threadstorm Compute Nodes, Compute Node, Global
Memory, Application Data Buffer, Snapshot Client, FIO, Portals, Linux Service
Partition, FSW, OSS, FC, and Lustre File System.]
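To make the idea of a parallel transfer through several fsworker instances more
concrete, the following sketch shows one way a buffer could be divided into
near-equal contiguous byte ranges, one per worker. The text above does not
describe how the snapshot library actually divides a transfer, so the scheme,
the chunk_t type, and the chunk_for_worker function are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the byte range handled by one worker. */
typedef struct {
    size_t offset;  /* starting byte within the buffer */
    size_t length;  /* number of bytes in this range   */
} chunk_t;

/* Return the range that worker w (0-based) would handle when a buffer of
 * total bytes is split across nworkers workers. When total is not evenly
 * divisible, the first (total % nworkers) workers each take one extra byte,
 * so the ranges are contiguous and cover the buffer exactly once. */
chunk_t chunk_for_worker(size_t total, size_t nworkers, size_t w)
{
    assert(nworkers > 0 && w < nworkers);

    size_t base = total / nworkers;
    size_t extra = total % nworkers;

    chunk_t c;
    c.offset = w * base + (w < extra ? w : extra);
    c.length = base + (w < extra ? 1 : 0);
    return c;
}
```

With a 10-byte buffer and 4 workers, for example, this scheme assigns ranges of
3, 3, 2, and 2 bytes starting at offsets 0, 3, 6, and 8.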

S–2479–20 67