number of samples increases. A subtle pitfall is the storage cache effect: because all of the simultaneous runs read and write at roughly
the same time, the measured run time would be shorter than in real cases. Despite these built-in inaccuracies, this variant analysis
performance test can provide valuable insight into how many resources are required for an identical or similar analysis pipeline with a
defined workload.
The throughput of Dell HPC Solution for Life Sciences
Total run time is the elapsed wall time from the earliest start of Phase 1, Step 1 to the latest completion of Phase 3, Step 2. Time
measurement for each step is from the latest completion time of the previous step to the latest completion time of the current step as
illustrated in Figure 11.
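The measurement rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function name `step_times`, and the step labels are hypothetical and do not come from the test harness used in this paper. It also assumes that the first step, which has no previous step, is measured from the earliest start.

```python
def step_times(records, step_order):
    """Compute total and per-step wall time for concurrently run samples.

    records: {sample: {step: (start, end)}}, times in seconds (hypothetical layout).
    step_order: step labels in pipeline order (hypothetical labels).
    """
    # Total run time starts at the earliest start of the first step.
    earliest_start = min(r[step_order[0]][0] for r in records.values())
    boundary = earliest_start  # assumption: first step is measured from here
    per_step = {}
    for step in step_order:
        # Each step ends at the latest completion across all samples.
        latest_end = max(r[step][1] for r in records.values())
        per_step[step] = latest_end - boundary
        boundary = latest_end
    total = boundary - earliest_start
    return total, per_step


# Illustrative timestamps only, not measured data.
records = {
    "sample1": {"phase1_step1": (0, 10), "phase3_step2": (10, 25)},
    "sample2": {"phase1_step1": (2, 12), "phase3_step2": (12, 30)},
}
total, per_step = step_times(records, ["phase1_step1", "phase3_step2"])
print(total, per_step)
```

With this definition the per-step times always sum to the total run time, because each step begins where the previous one ended.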
Feeding multiple samples into an analytical pipeline is the simplest way to increase parallelism, and this practice improves the
throughput of a system if the system is well designed to accommodate the sample load. Figure 12 summarizes the throughput, in total
number of genomes per day, for all tests with various numbers of 30x whole genome sequencing samples. The tests performed here
are designed to demonstrate performance at the server level, not to compare individual components. At the same time, the
tests were also designed to provide sizing information for Dell EMC Isilon F800/H600 and Dell EMC Lustre Storage. The data
points in Figure 12 are calculated based on the total number of samples (X axis in the figure) that were processed concurrently. The
number of genomes per day metric is obtained from the total running time taken to process the total number of samples in a test. The
smoothed curves are generated using a cubic polynomial spline (piecewise polynomial degree of 3) fitted on a B-spline basis
matrix. Details of the BWA-GATK pipeline can be obtained from the Broad Institute web site (10).
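As a rough illustration of how a genomes per day data point can be derived from a test, the sketch below scales the number of concurrently processed samples by the total running time. The function name `genomes_per_day` and the example numbers are hypothetical and are not taken from the measurements in Figure 12.

```python
def genomes_per_day(num_samples, total_run_hours):
    """Throughput for one test: samples processed per 24 hours of wall time."""
    if total_run_hours <= 0:
        raise ValueError("total_run_hours must be positive")
    return num_samples * 24.0 / total_run_hours


# Illustrative values only: 40 samples processed concurrently in 20 hours.
print(genomes_per_day(40, 20.0))
```

Because every sample in a test runs concurrently, the total running time is the same wall-clock window for all of them, so the metric rewards configurations that absorb more samples without a proportional increase in running time.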
Figure 11 Running time measurement method
Figure 12 Performances in 13/14 generation servers with Isilon and Lustre