User Manual
Rev 2.3-1.0.1
Mellanox Technologies
7. “Unicast Routing Cache”
The unicast routing cache prevents routing recalculation (a heavy task in a large cluster)
when no topology change was detected during the heavy sweep, or when the topology change
does not require a new routing calculation (for example, when one or more CAs/RTRs/leaf
switches go down, or when one or more of these nodes come back after being down).
8. “Routing Chains”
Allows routing configuration of different parts of a single InfiniBand subnet by different
routing engines. In the current release, minhop/updn/ftree/dor/torus-2QoS can be combined.
OpenSM also supports a file method which can load routes from a table; see Modular Routing
Engine below.
The MINHOP/UPDN/DOR routing algorithms consist of two stages:
1. MinHop matrix calculation: how many hops are required to get from each port to each LID?
The algorithm that fills these tables differs between standard (Min Hop) and Up/Down routing. For
standard routing, a "relaxation" algorithm is used to propagate the min hop count from every
destination LID through neighbor switches. For Up/Down routing, a BFS from every target is used.
The BFS tracks link direction (up or down) and avoids any step that would go up after a down
step was used.
2. Once the MinHop matrices exist, each switch is visited and, for each target LID, a decision is
made as to which port should be used to reach that LID. This step is common to standard and
Up/Down routing. Each port has a counter of the number of target LIDs routed through
it. When multiple alternative ports have the same MinHop to a LID, the one with the fewest
previously assigned target LIDs is selected.
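The two stages above can be sketched in Python as follows. This is an illustrative sketch only, not the OpenSM implementation: the topology format, the function names `bfs_hops` and `assign_ports`, and the use of node names in place of LIDs are assumptions made for the example. A plain BFS stands in for the "relaxation" step, and the Up/Down direction restriction is not modeled.

```python
from collections import defaultdict, deque


def bfs_hops(adj, switches, dest):
    """Stage 1: hop count from every reachable node to `dest` (plain BFS)."""
    dist = {dest: 0}
    queue = deque([dest])
    while queue:
        node = queue.popleft()
        # End nodes (CAs) do not forward traffic, so only the destination
        # itself and switches are expanded.
        if node != dest and node not in switches:
            continue
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def assign_ports(switch_ports, dests):
    """Stage 2: per switch and destination, pick the least-loaded min-hop port.

    switch_ports: {switch: {port_number: neighbor_name}} (hypothetical format)
    dests: destination names, each standing in for a single LID
    """
    adj = defaultdict(set)
    for sw, ports in switch_ports.items():
        for nbr in ports.values():
            adj[sw].add(nbr)
            adj[nbr].add(sw)

    # One counter per port: number of destinations already routed through it.
    load = {sw: {p: 0 for p in ports} for sw, ports in switch_ports.items()}
    lft = {sw: {} for sw in switch_ports}

    for dest in dests:
        dist = bfs_hops(adj, switch_ports, dest)
        for sw, ports in switch_ports.items():
            # Hops to dest when leaving through each usable port.
            hops = {
                p: dist[n] + 1
                for p, n in ports.items()
                if n in dist and (n == dest or n in switch_ports)
            }
            if not hops:
                continue  # destination unreachable from this switch
            best = min(hops.values())
            candidates = [p for p, h in hops.items() if h == best]
            # Tie-break on the per-port counter, then on port number.
            port = min(candidates, key=lambda p: (load[sw][p], p))
            lft[sw][dest] = port
            load[sw][port] += 1
    return lft


# Two leaf switches, two spine switches, two hosts per leaf.
switch_ports = {
    "L1": {1: "h1", 2: "h2", 3: "S1", 4: "S2"},
    "L2": {1: "h3", 2: "h4", 3: "S1", 4: "S2"},
    "S1": {1: "L1", 2: "L2"},
    "S2": {1: "L1", 2: "L2"},
}
lft = assign_ports(switch_ports, ["h1", "h2", "h3", "h4"])
print(lft["L1"])
```

In this example both uplinks of L1 have the same hop count to h3 and h4, so the per-port counter spreads the two destinations across ports 3 and 4.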
If LMC > 0, more checks are added. Within each group of LIDs assigned to the same target
port:
a. Use only ports which have the same MinHop.
b. First prefer the ones that go to a different systemImageGuid (than the previous LID of the same LMC group).
c. If none, prefer those which go through another NodeGuid.
d. Fall back to the number-of-paths method (if all go to the same node).
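The preference order a to d can be illustrated with a short Python sketch. It is not the OpenSM code: the `Port` record, its field names, and the function `pick_port_lmc` are hypothetical, and the sketch compares against all earlier LIDs of the LMC group rather than only the previous one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    num: int               # local port number
    hops: int              # MinHop to the target through this port
    remote_sys_guid: str   # systemImageGuid of the node behind the port
    remote_node_guid: str  # NodeGuid of the node behind the port
    load: int              # paths already assigned to this port


def pick_port_lmc(ports, used_ports):
    """Pick a port for the next LID of an LMC group.

    used_ports: ports already chosen for earlier LIDs of the same group.
    """
    # (a) Only ports with the same (minimal) hop count are considered.
    best = min(p.hops for p in ports)
    same_hop = [p for p in ports if p.hops == best]

    used_sys = {p.remote_sys_guid for p in used_ports}
    used_node = {p.remote_node_guid for p in used_ports}

    # (b) Prefer ports leading to a system not used yet by this group.
    new_sys = [p for p in same_hop if p.remote_sys_guid not in used_sys]
    # (c) Otherwise prefer ports leading to a node not used yet.
    new_node = [p for p in same_hop if p.remote_node_guid not in used_node]

    # (d) Otherwise fall back to the number-of-paths (load) comparison.
    pool = new_sys or new_node or same_hop
    return min(pool, key=lambda p: (p.load, p.num))


ports = [
    Port(1, 2, "sysA", "nodeA1", 0),
    Port(2, 2, "sysA", "nodeA2", 0),
    Port(3, 2, "sysB", "nodeB1", 5),
    Port(4, 3, "sysC", "nodeC1", 0),  # longer path, never a candidate
]

chosen = []
for _ in range(4):  # four LIDs in the group, i.e. LMC = 2
    chosen.append(pick_port_lmc(ports, chosen))
picked = [p.num for p in chosen]
print(picked)
```

The second LID is sent through port 3 even though that port carries more paths, because it leads to a different system than the port chosen for the first LID. Only when every system and node has been used does the choice fall back to the load counter.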
3.2.2.5.1 Min Hop Algorithm
The Min Hop algorithm is invoked by default if no routing algorithm is specified. It can also be
invoked by specifying '-R minhop'.
The Min Hop algorithm is divided into two stages: computation of min-hop tables on every
switch and LFT output port assignment. Link subscription is also equalized with the ability to
override based on port GUID.
The latter is supplied by:
-i <equalize-ignore-guids-file>
-ignore-guids <equalize-ignore-guids-file>
This option provides the means to define a set of ports (by guids)
that will be ignored by the link load equalization algorithm.
LMC-aware routing is performed on a (remote) system or switch basis.
3.2.2.5.2 UPDN Algorithm
The UPDN algorithm is designed to prevent deadlocks from occurring in loops of the subnet. A
loop-deadlock is a situation in which it is no longer possible to send data between any two hosts