User Manual
Rev 2.2-1.0.1
Mellanox Technologies
occurs if torus-2QoS is misconfigured, i.e., the radix of a torus dimension as configured does not
match the radix of that torus dimension as wired, and many switches/links in the fabric will not
be placed into the torus.
8.5.7.4 Quality Of Service Configuration
OpenSM will not program switches and channel adapters with SL2VL maps or VL arbitration
configuration unless it is invoked with -Q. Since torus-2QoS depends on such functionality for
correct operation, always invoke OpenSM with -Q when torus-2QoS is in the list of routing
engines. Any quality of service configuration method supported by OpenSM will work with
torus-2QoS, subject to the following limitations and considerations. For all routing engines
supported by OpenSM except torus-2QoS, there is a one-to-one correspondence between QoS level
and SL. Torus-2QoS can only support two quality of service levels, so only the high-order bit of
any SL value used for unicast QoS configuration will be honored by torus-2QoS. For multicast
QoS configuration, only SL values 0 and 8 should be used with torus-2QoS.
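The high-order-bit rule above can be illustrated with a minimal Python sketch. It assumes the standard 4-bit InfiniBand SL range (0-15). The function name torus_2qos_level and the 0/1 numbering of the two QoS levels are hypothetical, chosen for illustration only. They are not part of OpenSM.

```python
# Illustrative sketch only: torus_2qos_level is a hypothetical helper,
# not an OpenSM API. It assumes SL is a 4-bit value (0-15).

def torus_2qos_level(sl: int) -> int:
    """Return 0 or 1 according to the high-order bit of a 4-bit SL."""
    if not 0 <= sl <= 15:
        raise ValueError("SL must be in the range 0-15")
    # Only bit 3 (the high-order bit) distinguishes the two QoS levels.
    return (sl >> 3) & 1

# SL values recommended by the text for multicast QoS configuration.
MULTICAST_SLS = (0, 8)

# SLs 0-7 fall into one level and SLs 8-15 into the other.
levels = {sl: torus_2qos_level(sl) for sl in range(16)}
```

Under this reading, the two recommended multicast SL values, 0 and 8, are the lowest SL in each of the two levels.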
Since SL to VL map configuration must be under the complete control of torus-2QoS, any
configuration via qos_sl2vl, qos_swe_sl2vl, etc., must and will be ignored, and a warning will be
generated. Torus-2QoS uses VL values 0-3 to implement one of its supported QoS levels, and VL
values 4-7 to implement the other. Hard-to-diagnose application issues may arise if traffic is not
delivered fairly across each of these two VL ranges. Torus-2QoS will detect and warn if VL
arbitration is configured unfairly across VLs in the range 0-3, and also in the range 4-7. Note that the
default OpenSM VL arbitration configuration does not meet this constraint, so all torus-2QoS
users should configure VL arbitration via qos_vlarb_high, qos_vlarb_low, etc.
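As a rough aid for reviewing a VL arbitration setting before handing it to OpenSM, the following Python sketch checks one table for equal weighting within a VL range. It rests on two assumptions: that the table is written as a comma-separated list of VL:weight pairs, and that "fair" means every VL in the range receives the same total weight. The exact fairness test applied by torus-2QoS is not described here, so this is an approximation. The helper names and the sample strings are hypothetical.

```python
# Illustrative sketch only: these helpers are hypothetical and do not
# reproduce the check that torus-2QoS itself performs.
from collections import defaultdict

def vl_weights(vlarb: str) -> dict:
    """Sum the weights per VL from a 'VL:weight,VL:weight,...' string."""
    totals = defaultdict(int)
    for entry in vlarb.split(","):
        vl, weight = entry.strip().split(":")
        totals[int(vl)] += int(weight)
    return dict(totals)

def is_fair(vlarb: str, vls: range) -> bool:
    """Assumed fairness test: every VL in vls has the same total weight."""
    totals = vl_weights(vlarb)
    return len({totals.get(vl, 0) for vl in vls}) == 1

# Hypothetical example tables, not OpenSM defaults.
even_table = "0:64,1:64,2:64,3:64,4:64,5:64,6:64,7:64"
skewed_table = "0:4,1:0,2:0,3:0,4:64,5:64,6:64,7:64"

fair_low_range = is_fair(even_table, range(0, 4))     # VLs 0-3
fair_high_range = is_fair(even_table, range(4, 8))    # VLs 4-7
skewed_low_range = is_fair(skewed_table, range(0, 4))
```

A table that weights VL 0 differently from VLs 1-3, as in skewed_table, fails the check for the 0-3 range while still passing for the 4-7 range.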
8.5.7.5 Operational Considerations
Any routing algorithm for a torus IB fabric must employ path SL values to avoid credit loops. As
a result, all applications run over such fabrics must perform a path record query to obtain the
correct path SL for connection setup. Applications that use rdma_cm for connection setup will
automatically meet this requirement.
If a change in fabric topology causes changes in path SL values required to route without credit
loops, in general all applications would need to repath to avoid message deadlock. Since
torus-2QoS has the ability to reroute after a single switch failure without changing path SL values,
repathing by running applications is not required when the fabric is routed with torus-2QoS.
Torus-2QoS can provide unchanging path SL values in the presence of subnet manager failover
provided that all OpenSM instances agree on the dateline location. See
torus-2QoS.conf(5) for details. Torus-2QoS will detect configurations of failed switches and links that
prevent routing that is free of credit loops, and will log warnings and refuse to route. If
"no_fallback" was configured in the list of OpenSM routing engines, then no other routing
engine will attempt to route the fabric. In that case all paths that do not transit the failed
components will continue to work, and the subset of paths that are still operational will continue to
remain free of credit loops. OpenSM will continue to attempt to route the fabric after every
sweep interval, and after any change (such as a link up) in the fabric topology. When the fabric
components are repaired, full functionality will be restored. In the event OpenSM was
configured to allow some other engine to route the fabric if torus-2QoS fails, then credit loops and
message deadlock are likely if torus-2QoS had previously routed the fabric successfully. Even if the
other engine is capable of routing a torus without credit loops, applications that built connections
with path SL values granted under torus-2QoS will likely experience message deadlock under
routing generated by a different engine, unless they repath. To verify that a torus fabric is routed
free of credit loops, use ibdmchk to analyze data collected via ibdiagnet -vlr.