White Papers

Dell HPC NFS Storage Solution – High Availability (NSS7.0-HA) Configuration
2 Overview of NSS-HA solutions
Along with the current version, four versions of the NSS-HA solution have been released since 2011. This
section briefly describes the NSS-HA solution and lists the currently available Dell NSS-HA
offerings.
2.1 A brief introduction to NSS-HA solutions
The design of the NSS-HA solution for each version is similar. In general, the core of the solution is a high
availability (HA) cluster (6), which provides a highly reliable and available storage service to HPC compute
clusters by using a high-performance network connection such as Intel Omni-Path (OPA), InfiniBand (IB),
or 10 Gigabit Ethernet (10GbE).
The HA cluster consists of a pair of Dell PowerEdge servers and a network switch. The two PowerEdge
servers have shared access to disk-based Dell PowerVault storage in a variety of capacities, and both are
directly connected to the HPC cluster by using OPA, IB or 10GbE. The two servers are equipped with two
fence devices: an iDRAC8 Enterprise and an APC Power Distribution Unit (PDU). If a failure such as a storage
disconnection, a network disconnection, or the system ceasing to function occurs on one server,
the HA cluster fails over the storage service to the healthy server with the assistance of the two fence
devices, and also ensures that the failed server does not return to life without the administrator’s
knowledge or control.
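The fence-then-fail-over behavior described above can be illustrated with a minimal Python sketch. It is a conceptual model only, not the cluster software used by the solution. The names `Node`, `fence`, and `fail_over` are hypothetical. Trying the fence devices in a fixed order, and refusing to fail over when no device succeeds, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical model of one server in the HA pair."""
    name: str
    healthy: bool = True
    fenced: bool = False


# A fence device is modeled as (label, callable that returns True on success).
FenceDevice = Tuple[str, Callable[[Node], bool]]


def fence(node: Node, devices: List[FenceDevice]) -> Optional[str]:
    """Try each fence device in order; return the label of the first that succeeds."""
    for label, power_off in devices:
        if power_off(node):
            node.fenced = True
            return label
    return None


def fail_over(active: Node, standby: Node, devices: List[FenceDevice]) -> Node:
    """Return the node that should run the storage service."""
    if active.healthy:
        return active
    # Assumption: the service moves only after the failed node is fenced,
    # so that it cannot come back and touch the shared storage unnoticed.
    if fence(active, devices) is None:
        raise RuntimeError("fencing failed; refusing to fail over")
    return standby
```

In this sketch, a list such as `[("idrac", try_idrac), ("pdu", try_pdu)]` would stand in for the two fence devices, with `try_idrac` and `try_pdu` being placeholder callables supplied by the caller.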
The disk-based storage array is formatted as a Red Hat Scalable File System (XFS) and exported to the HPC
cluster by using the NFS service of the HA cluster. Note that large-capacity file systems (greater than 100
TB) have been supported since the second version of the NSS-HA solution (2).
Figure 1 shows the general infrastructure of the NSS-HA solution. For more information about the NSS-HA
solution, refer to the previous NSS-HA white papers (1) (2) (3).