User Manual
Features Overview and Configuration
Rev 2.3-1.0.1
Mellanox Technologies
• Two levels of QoS, assuming switches support 8 data VLs
• Ability to route around a single failed switch, and/or multiple failed links, without:
    • introducing credit loops
    • changing path SL values
• Very short run times, with good scaling properties as fabric size increases
3.2.2.5.6.1 Unicast Routing Cache
Torus-2QoS is a DOR-based (dimension-order routing) algorithm. It avoids the deadlocks that
would otherwise occur in a torus by using the concept of a dateline for each torus dimension. It
encodes into the path SL which datelines the path crosses, as follows:
    sl = 0;
    for (d = 0; d < torus_dimensions; d++)
        /* path_crosses_dateline(d) returns 0 or 1 */
        sl |= path_crosses_dateline(d) << d;
For a 3D torus, the dateline encoding uses three of the four SL bits, which leaves one SL bit free;
torus-2QoS uses that bit to implement two QoS levels.
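The SL encoding described above can be sketched as a small C function. This is an illustrative sketch, not OpenSM code: the name path_sl, the crosses[] array of per-dimension crossing flags, and the placement of the QoS level in SL bit 3 (the bit just above the three dateline bits) are assumptions made for the example.

```c
#include <assert.h>

#define TORUS_DIMENSIONS 3

/*
 * Hypothetical helper: build a path SL from per-dimension dateline
 * crossing flags (each 0 or 1) and a QoS level (0 or 1).
 * Assumption: the QoS level occupies SL bit 3, the bit left free by
 * the three dateline bits of a 3D torus.
 */
unsigned path_sl(const int crosses[TORUS_DIMENSIONS], unsigned qos_level)
{
    unsigned sl = 0;
    int d;

    assert(qos_level <= 1);
    for (d = 0; d < TORUS_DIMENSIONS; d++) {
        assert(crosses[d] == 0 || crosses[d] == 1);
        sl |= (unsigned)crosses[d] << d;   /* one bit per dimension */
    }
    return sl | (qos_level << TORUS_DIMENSIONS);
}
```

Under these assumptions, a path that crosses the datelines of dimensions 0 and 2 gets SL 5 at QoS level 0 and SL 13 at QoS level 1.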
Torus-2QoS also exploits the fact that switch SL2VL maps depend on the output port, which lets
it encode into one VL bit the information carried in three SL bits. It computes the torus coordinate
direction in which each inter-switch link "points", and writes the SL2VL maps for such ports as follows:
    for (sl = 0; sl < 16; sl++)
        /* cdir(port) reports which torus coordinate direction a switch port
         * "points" in, and returns 0, 1, or 2 */
        sl2vl(iport,oport,sl) = 0x1 & (sl >> cdir(oport));
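As a hedged illustration of the map above, the following C sketch fills a 16-entry SL-to-VL table for one inter-switch output port. The function name fill_sl2vl and the table layout are hypothetical, and the sketch applies only the expression given in the text; it does not model how the two QoS levels are separated onto different VLs.

```c
#include <assert.h>

#define NUM_SL 16

/*
 * Hypothetical helper: fill the SL-to-VL table of one inter-switch
 * output port. cdir is the torus coordinate direction (0, 1, or 2)
 * that the port "points" in. Only the dateline bit belonging to that
 * direction is copied into VL bit 0.
 */
void fill_sl2vl(unsigned cdir, unsigned char vl_for_sl[NUM_SL])
{
    unsigned sl;

    assert(cdir <= 2);
    for (sl = 0; sl < NUM_SL; sl++)
        vl_for_sl[sl] = (unsigned char)(0x1 & (sl >> cdir));
}
```

For SL 5 (dateline bits set for dimensions 0 and 2), this yields VL 1 on ports pointing in directions 0 and 2, and VL 0 on ports pointing in direction 1.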
Thus, on a pristine 3D torus, i.e., in the absence of failed fabric switches, torus-2QoS consumes
8 SL values (SL bits 0-2) and 2 VL values (VL bit 0) per QoS level to provide deadlock-free
routing.
Torus-2QoS routes around a link failure by "taking the long way around" any 1D ring
interrupted by that failure. For example, consider the 2D 6x5 torus below, where
switches are denoted by [+a-zA-Z]:
For a pristine fabric the path from S to D would be S-n-T-r-D. In the event that either link S-n or
n-T has failed, torus-2QoS would use the path S-m-p-o-T-r-D.
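The idea of "taking the long way around" can be sketched for a single 1D ring as follows. This is a simplified model under stated assumptions, not the OpenSM implementation: switches are numbered 0..n-1 (a hypothetical numbering that does not correspond to the switch labels in the example), link i joins switch i and switch (i + 1) % n, and at most one link has failed.

```c
#include <assert.h>

/*
 * Hypothetical ring model: link i joins switch i and switch (i + 1) % n.
 * Returns 1 if a walk of `hops` links starting at `from`, stepping by
 * `dir` (+1 or -1), uses link `failed`; returns 0 otherwise.
 */
int arc_uses_link(int n, int from, int hops, int dir, int failed)
{
    int pos = from, i;

    for (i = 0; i < hops; i++) {
        int next = (pos + dir + n) % n;
        int link = (dir > 0) ? pos : next;  /* index of the link crossed */

        if (link == failed)
            return 1;
        pos = next;
    }
    return 0;
}

/*
 * Choose a direction (+1 or -1) from src to dst: the shorter way around
 * when it is intact, otherwise the long way around. Pass failed = -1
 * for a pristine ring. Ties go to +1, an arbitrary choice in this sketch.
 */
int ring_direction(int n, int src, int dst, int failed)
{
    int fwd = (dst - src + n) % n;  /* hops when stepping +1 */
    int bwd = (src - dst + n) % n;  /* hops when stepping -1 */
    int dir = (fwd <= bwd) ? 1 : -1;
    int hops = (dir > 0) ? fwd : bwd;

    assert(n > 1 && src >= 0 && src < n && dst >= 0 && dst < n);
    if (arc_uses_link(n, src, hops, dir, failed))
        dir = -dir;                 /* take the long way around */
    return dir;
}
```

In a ring of 6 switches, a path from switch 0 to switch 2 steps in the +1 direction when the ring is pristine, and in the -1 direction when link 0 or link 1 has failed.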