Solution overview
In addition to the compute, network, and storage options, several other components perform
different functions in the Dell EMC Ready Solution for HPC Life Sciences. These include a CIFS
gateway, a fat node, an accelerator node, and other management components. Each of these
components is described in detail in the subsequent sections.
The Intel OPA and IB EDR versions of the solution are nearly identical except for a few changes in
the switching infrastructure and network adapters. The solution ships in a deep and wide 48U rack
enclosure, which makes PDU mounting and cable management easier. Section 2.5 lists the components
of two fully loaded racks using 56x Dell EMC PowerEdge C6420 rack servers as the compute subsystem,
a Dell EMC PowerEdge R940 as a fat node, a Dell EMC PowerEdge C4140 as an accelerator node, a Dell
EMC Ready Solution for HPC NFS Storage, a Dell EMC Ready Solution for HPC Lustre Storage, and Intel
OPA as the cluster's high-speed interconnect.
2.2 Compute and management components
There are several considerations when selecting the servers for the master, login, compute, fat,
and accelerator nodes. For the master node, the 1U Dell EMC PowerEdge R440 is recommended. The
master node is responsible for managing the compute nodes and optimizing overall compute capacity.
The login node (a Dell EMC PowerEdge R640 is recommended) is used for user access, compilation,
and job submission. Usually, the master and login nodes are the only nodes that communicate with
the outside world, acting as an intermediary between the cluster and the external network. For this
reason, high availability (HA) can optionally be provided for the master and login nodes.
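
This section does not name a workload manager, so the following is a minimal sketch of a job
submission from the login node, assuming a Slurm-managed cluster; the job name, partition name,
and resource requests are hypothetical and not taken from this guide.

#!/usr/bin/env python3
"""Minimal sketch: submit a batch job from the login node.

Assumes Slurm, which this section does not specify; the partition
and resource values below are illustrative only.
"""
import subprocess

# A trivial batch script. The partition name "compute" is hypothetical.
JOB_SCRIPT = """\
#!/bin/bash
#SBATCH --job-name=ngs-align     # hypothetical NGS alignment step
#SBATCH --partition=compute      # hypothetical C6420 partition
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40     # 2x 20-core Xeon Gold 6248
#SBATCH --time=04:00:00
srun hostname
"""

def submit(script):
    """Pipe the script to sbatch on stdin and return sbatch's reply."""
    result = subprocess.run(
        ["sbatch"], input=script, text=True,
        capture_output=True, check=True,
    )
    return result.stdout.strip()   # e.g. "Submitted batch job 12345"

if __name__ == "__main__":
    print(submit(JOB_SCRIPT))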
Ideally, the compute nodes in a cluster should be as identical as possible, since the performance
of a parallel computation is bounded by the slowest component in the cluster. Heterogeneous
clusters do work, but careful execution is required to achieve the best performance. For Life
Sciences applications, however, heterogeneous clusters work well because they handle completely
independent workloads such as DNA-Seq, de novo assembly, or molecular dynamics simulations. Because
these workloads require quite different hardware components, we recommend the Dell EMC PowerEdge
C6420 as a compute node for NGS data processing due to its density, wide choice of CPUs, and high
maximum memory capacity. The Dell EMC PowerEdge R940 is an optional fat node with up to 6 TB of
RDIMM/LRDIMM memory and is recommended for customers who need to run applications requiring large
memory, such as de novo assembly. Accelerators are used to speed up computationally intensive
applications such as molecular dynamics simulations; the Dell EMC PowerEdge C4140 serves as the
accelerator node, and we tested C4140 configurations K and M for this solution.
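
To illustrate how these independent workloads map onto the different node types, the sketch below
pairs each workload class with scheduler resources. It assumes the C6420, R940, and C4140 nodes
are exposed as separate Slurm partitions, which this guide does not state; all partition names and
resource values are hypothetical.

#!/usr/bin/env python3
"""Sketch: route heterogeneous Life Sciences workloads to node types.

Assumes one Slurm partition per node class; names and values are
hypothetical and not taken from this guide.
"""

# Workload class -> illustrative sbatch options per node type.
WORKLOAD_RESOURCES = {
    # NGS data processing suits the dense two-socket C6420 nodes.
    "dna-seq":  ["--partition=compute", "--ntasks-per-node=40"],
    # De novo assembly targets the large-memory R940 fat node.
    "assembly": ["--partition=fat", "--mem=3000G"],
    # Molecular dynamics targets the GPU-equipped C4140 node.
    "md":       ["--partition=gpu", "--gres=gpu:4"],
}

def sbatch_command(workload, script="job.sh"):
    """Build the sbatch command line for a given workload class."""
    return ["sbatch"] + WORKLOAD_RESOURCES[workload] + [script]

if __name__ == "__main__":
    for name in WORKLOAD_RESOURCES:
        print(" ".join(sbatch_command(name)))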
The compute and management infrastructure consists of the following components.
Compute
- Dell EMC PowerEdge C6400 enclosure with 4x C6420 servers
> CPU: 2x Intel Xeon Gold 6248 (20 cores at 2.5 GHz)
> Memory: 12x 32 GB at 2933 MT/s
> Disk: 6x 480 GB 12 Gbps SAS mixed-use SSDs
> Interconnect: Intel Omni-Path or IB EDR, 10/40 GbE, or both
> BIOS system profile: Performance optimized (see the configuration sketch after this list)
> Logical processor: Disabled
> Virtualization technology: Disabled
> Node interleaving: Enabled
> Operating system: RHEL 7.6
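
The BIOS settings above can be applied out of band through each node's iDRAC. The following is a
sketch assuming the Dell racadm utility; the attribute names follow Dell's 14G BIOS attribute
registry but should be verified against the installed firmware, and the iDRAC address and
credentials are placeholders.

#!/usr/bin/env python3
"""Sketch: apply the compute-node BIOS settings via iDRAC/racadm.

Assumes racadm is installed and the iDRAC is reachable; attribute
names and the job-queue step should be verified for your firmware.
"""
import subprocess

IDRAC = "192.0.2.10"                     # placeholder iDRAC address
CREDS = ["-u", "root", "-p", "calvin"]   # placeholder credentials

# BIOS settings from the C6420 specification above.
BIOS_SETTINGS = {
    "BIOS.SysProfileSettings.SysProfile": "PerfOptimized",  # Performance optimized
    "BIOS.ProcSettings.LogicalProc": "Disabled",            # logical processor off
    "BIOS.ProcSettings.ProcVirtualization": "Disabled",     # virtualization off
    "BIOS.MemSettings.NodeInterleave": "Enabled",           # node interleaving on
}

def racadm(*args):
    """Run one racadm command against the target iDRAC."""
    subprocess.run(["racadm", "-r", IDRAC] + CREDS + list(args), check=True)

if __name__ == "__main__":
    for attribute, value in BIOS_SETTINGS.items():
        racadm("set", attribute, value)
    # Stage the pending settings and power-cycle the node to apply them.
    racadm("jobqueue", "create", "BIOS.Setup.1-1", "-r", "pwrcycle", "-s", "TIME_NOW")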