White Papers
Dell HPC NFS Storage Solution - High Availability (NSS5.5-HA) Configuration with Dell PowerVault
MD3460 and MD3060e Storage Arrays
1. Introduction
This solution guide provides information on the latest Dell NFS Storage Solution - High Availability
configurations (NSS-HA) with Dell PowerVault MD3460 and MD3060e storage arrays. The solution uses
Dell PowerEdge servers and PowerVault storage arrays along with the Red Hat High Availability software
stack to provide an easy-to-manage, reliable, and cost-effective storage solution for HPC clusters. It
leverages the latest Dell PowerVault storage arrays (MD3460 and MD3060e) to offer higher-performance
storage solutions than previous NSS-HA solutions. This version of the solution is NSS5.5-HA.
The design principle for this release remains the same as previous Dell NSS-HA solutions. The major
changes between the current version and the previous version of the NSS-HA solution (NSS5.0-HA) are the
change from the Dell PowerVault MD3260 storage array to the latest PowerVault MD3460 storage array, the
change from the RHEL 6.4 operating system to RHEL 6.5, and the change from 6Gbps SAS connections to
12Gbps SAS connections. For complete details of the NSS-HA family, review this document along with the
previous NSS-HA white papers and blogs (1) (2) (3) (4) (5).
The following sections describe the technical details, evaluation method, and the expected
performance of the solution.
2. Overview of NSS-HA solutions
Along with the current version, four versions of NSS-HA solutions have been released since 2011. This
section provides a brief description of the NSS-HA solution, and lists the available Dell NSS-HA
offerings.
2.1. A brief introduction to NSS-HA solutions
The design of the NSS-HA solution for each version is similar. In general, the core of the solution is a
high availability (HA) cluster (4), which provides a highly reliable and available storage service to HPC
compute clusters via a high performance network connection such as InfiniBand (IB) or 10 Gigabit
Ethernet (10GbE).
The HA cluster consists of a pair of Dell PowerEdge servers and a network switch. The two PowerEdge
servers have shared access to disk-based Dell PowerVault storage in a variety of capacities, and both
are directly connected to the HPC cluster via IB or 10GbE. The two servers are equipped with two
fence devices: iDRAC7 Enterprise and an APC Power Distribution Unit (PDU). If a failure such as a storage
disconnection, network disconnection, or system crash occurs on one server, the HA cluster fails over
the storage service to the healthy server with the assistance of the two fence devices, and also
ensures that the failed server does not return to life without the administrator’s knowledge or control.
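The failover behavior described above can be illustrated with a small toy model. The following Python sketch is not the Red Hat High Availability implementation and does not call any cluster or fence-agent API. The class names (Node, FenceDevice, HACluster), the order in which the fence devices are tried, and the rule that the service moves only after fencing succeeds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """One of the two servers in the HA pair (hypothetical model)."""
    name: str
    powered_on: bool = True


@dataclass
class FenceDevice:
    """A fence device such as an iDRAC or a PDU (hypothetical model).

    power_off is a stand-in callable that returns True if the device
    managed to power the node off.
    """
    name: str
    power_off: Callable[[Node], bool]


@dataclass
class HACluster:
    nodes: List[Node]
    fence_devices: List[FenceDevice]
    service_owner: Optional[Node] = None

    def fence(self, node: Node) -> bool:
        # Assumption: try each fence device in order and stop at the
        # first one that succeeds.
        for device in self.fence_devices:
            if device.power_off(node):
                # The node stays off until an administrator powers it on.
                node.powered_on = False
                return True
        return False

    def handle_failure(self, failed: Node) -> Optional[Node]:
        # Assumption: the service is relocated only after the failed node
        # has been fenced, so two servers never serve the storage at once.
        if not self.fence(failed):
            return None  # fencing failed; the service owner is left unchanged
        if self.service_owner is failed:
            self.service_owner = next(
                (n for n in self.nodes if n is not failed and n.powered_on),
                None,
            )
        return self.service_owner


# Usage: the first fence device fails, the second one succeeds.
server_a, server_b = Node("server-a"), Node("server-b")
cluster = HACluster(
    nodes=[server_a, server_b],
    fence_devices=[
        FenceDevice("idrac", lambda node: False),
        FenceDevice("pdu", lambda node: True),
    ],
    service_owner=server_a,
)
new_owner = cluster.handle_failure(server_a)
print(new_owner.name if new_owner else "service blocked")
```

In this model the second fence device acts as a fallback for the first, which is one plausible reason for equipping each server with two fence devices.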
The disk-based storage array is formatted as a Red Hat Scalable File System (XFS) and exported to the
HPC cluster via the NFS service of the HA cluster. Large capacity file systems (greater than 100TB) have
been supported since the second version of the NSS-HA solution (2).
Figure 1 depicts the general infrastructure of the NSS-HA solution. For detailed information, refer to
the previous NSS-HA white papers (1) (2) (3).