By default, the transports (TLS) used are: MXM_TLS=self,shm,ud
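For example, to set the transport list explicitly, export the variable to all ranks at launch time. The command below is illustrative only and assumes an Open MPI-style launcher (the -x option exports an environment variable); the set of supported transports varies by MXM release:

    % mpirun -x MXM_TLS=self,shm,ud ./my_app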
5.4.7 Configuring Service Level Support
Service Level Support is currently at alpha level.
Please be aware that the content below is subject to change.
MXM v3.0 added support for Service Level to enable Quality of Service (QoS). When enabled, every InfiniBand endpoint in MXM generates a random Service Level (SL) within the configured range and uses it for outbound communication.
The range is set via the following environment parameter:
MXM_IB_NUM_SLS
Valid values are 1-16; the default is 1.
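For example, the following hypothetical command line allows MXM to randomize across the first 8 service levels (again assuming an Open MPI-style launcher that exports environment variables with -x):

    % mpirun -x MXM_IB_NUM_SLS=8 ./my_app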
5.5 Fabric Collective Accelerator (FCA)
To meet the needs of scientific research and engineering simulations, supercomputers are growing at an unrelenting rate. As supercomputers increase in size from mere thousands to hundreds of thousands of processor cores, new performance and scalability challenges have emerged. In the past, performance tuning of parallel applications could be accomplished fairly easily by separately optimizing their algorithms, communication, and computational aspects. However, as systems continue to scale to larger machines, these issues become co-mingled and must be addressed comprehensively.
Collective communications execute global communication operations to couple all processes/nodes in the system and therefore must be executed as quickly and as efficiently as possible. Indeed, the scalability of most scientific and engineering applications is bound by the scalability and performance of the collective routines employed. Most current implementations of collective operations suffer from the effects of system noise at extreme scale (system noise increases the latency of collective operations by amplifying the effect of small, randomly occurring OS interrupts during collective progression). Furthermore, collective operations consume a significant fraction of CPU cycles, cycles that could be better spent doing meaningful computation.
Mellanox Technologies has addressed these two issues, lost CPU cycles and performance lost to the effects of system noise, by offloading the communications to the host channel adapters (HCAs) and switches. The technology, named CORE-Direct® (Collectives Offload Resource Engine), provides the most advanced solution available for handling collective operations, thereby ensuring maximal scalability and minimal CPU overhead, and providing the capability to overlap communication operations with computation, allowing applications to maximize asynchronous communication.
Users may benefit immediately from CORE-Direct® out-of-the-box by simply specifying the necessary BCOL/SBGP combinations, as shown in the examples below. To take maximum advantage of CORE-Direct®, users may modify their applications to use MPI 3.0 non-blocking routines while using CORE-Direct® to offload the collective "under the covers", thereby allowing maximum opportunity to overlap communication with computation.
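For illustration only, a hypothetical Open MPI command line that selects BCOL/SBGP combinations via MCA parameters; the component names shown are examples, and the supported combinations should be taken from the FCA release notes for your installation:

    % mpirun -np 128 -mca sbgp basesmuma,ibnet -mca bcol basesmuma,iboffload ./my_app

The sketch below shows the application-side pattern referred to above: a standard MPI 3.0 non-blocking collective is started, independent computation proceeds while the operation progresses (offloaded to the HCA when CORE-Direct® is active), and the result is consumed only after completion. This is plain MPI-3 code, not an FCA-specific API; do_independent_work() is a placeholder for application computation:

    #include <mpi.h>
    #include <stdio.h>

    /* Placeholder for computation that does not depend on the
     * result of the collective operation. */
    static void do_independent_work(void) { /* ... */ }

    int main(int argc, char **argv)
    {
        double local = 1.0, global = 0.0;
        MPI_Request req;

        MPI_Init(&argc, &argv);

        /* Start a non-blocking reduction; with CORE-Direct the
         * collective progression can be offloaded to the HCA. */
        MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                       MPI_COMM_WORLD, &req);

        /* Overlap computation with the in-flight collective. */
        do_independent_work();

        /* The result is valid only after completion. */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("global sum = %f\n", global);

        MPI_Finalize();
        return 0;
    }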
Additionally, FCA 3.0 contains support for building runtime-configurable hierarchical collectives. Socket- and UMA-level discovery is currently supported, with network topology awareness slated for future versions.
As with FCA 2.X we also provide the ability to accelerate collectives with hard-