User Manual
Rev 2.1-1.0.6
Mellanox Technologies
8.5.3 UPDN Algorithm
The UPDN (Up/Down) algorithm is designed to prevent deadlocks from occurring in loops of the
subnet. A loop-deadlock is a situation in which it is no longer possible to send data between
any two hosts connected through the loop. As such, the UPDN routing algorithm should be used
if the subnet is not a pure Fat Tree and one of its loops may experience a deadlock (for
example, under heavy traffic load).
The UPDN algorithm is based on the following main stages:
1. Auto-detect root nodes - based on the CA hop length from any switch in the subnet, a
statistical histogram is built for each switch (hop count vs. number of occurrences). If the
histogram for a certain node shows one column that is clearly higher than the others, that
node is marked as a root node. Since the algorithm is statistical, it may not find any root
nodes. The list of root nodes found by this auto-detect stage is used by the ranking process
stage.
2. Ranking process - All root switch nodes (found in stage 1) are assigned a rank of 0. Using
the BFS algorithm, the rest of the switch nodes in the subnet are ranked incrementally. This
ranking aids in the process of enforcing rules that ensure loop-free paths.
3. Min Hop Table setting - after ranking is done, a BFS algorithm is run from each (CA or
switch) node in the subnet. During the BFS process, the FDB table of each switch node
traversed by BFS is updated, in reference to the starting node, based on the ranking rules and
guid values.
At the end of the process, the updated FDB tables ensure loop-free paths through the subnet.
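The ranking stage and the rule it enforces can be sketched in a few lines of Python. This is a minimal illustration, not OpenSM's implementation: the topology, the switch names, and the helper functions rank_switches, is_up, and is_legal_updn_path are hypothetical. The sketch assumes the usual Up/Down rule, namely that a path may not go up (toward a root) after it has gone down, and it assumes that ties in rank are broken by comparing node identifiers in place of guid values.

```python
from collections import deque


def rank_switches(links, roots):
    """Give every root rank 0, then rank the other switches by BFS distance."""
    rank = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor not in rank:
                rank[neighbor] = rank[node] + 1
                queue.append(neighbor)
    return rank


def is_up(rank, src, dst):
    """A hop is 'up' if it moves to a lower rank.

    Assumption: equal ranks are ordered by node identifier, standing in
    for the guid comparison mentioned in stage 3.
    """
    return (rank[dst], dst) < (rank[src], src)


def is_legal_updn_path(rank, path):
    """Return True if the path never goes up after it has gone down."""
    gone_down = False
    for src, dst in zip(path, path[1:]):
        if is_up(rank, src, dst):
            if gone_down:
                return False
        else:
            gone_down = True
    return True


# Hypothetical two-level fabric: two spine (root) switches, two leaf switches.
links = {
    "spine1": ["leaf1", "leaf2"],
    "spine2": ["leaf1", "leaf2"],
    "leaf1": ["spine1", "spine2"],
    "leaf2": ["spine1", "spine2"],
}
rank = rank_switches(links, roots=["spine1", "spine2"])

leaf_to_leaf = is_legal_updn_path(rank, ["leaf1", "spine1", "leaf2"])
spine_to_spine = is_legal_updn_path(rank, ["spine1", "leaf1", "spine2"])
```

In this sketch the leaf-to-leaf path goes up and then down, so it is accepted, while the spine-to-spine path goes down and then up, so it is rejected. Forbidding that second kind of turn is what removes the cyclic dependencies that can lead to a loop-deadlock.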
8.5.3.1 UPDN Algorithm Usage
Activation through OpenSM
• Use the '-R updn' option (instead of the old '-u' option) to activate the UPDN algorithm.
• Use '-a <root_guid_file>' to add a UPDN guid file that contains the root nodes for
ranking. If the '-a' option is not used, OpenSM uses its auto-detect root nodes
algorithm.
Notes on the guid list file:
• The user can override the node list manually.
• If the auto-detect stage cannot find any root nodes, and the user did not specify a guid
list file, OpenSM defaults back to the Min Hop routing algorithm.
• Up/Down routing does not allow LID routing communication between switches that
are located inside spine “switch systems”. The reason is that there is no way to allow a
LID route between them that does not break the Up/Down rule. One ramification
of this is that you cannot run SM on switches other than the leaf switches of the fabric.
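As an illustration of the '-R updn' and '-a' options described above, the following Python sketch prepares the contents of a root guid file and builds the matching OpenSM argument list. It rests on assumptions that are not stated in this section: the file is taken to hold one guid per line in 0x-prefixed hexadecimal form, the guid values and the file path are placeholders, and the helper names format_root_guid_file and opensm_updn_args are hypothetical. The sketch only builds strings and does not start OpenSM.

```python
import re

# Assumption: guids are written as 64-bit values, 0x followed by 16 hex digits.
GUID_RE = re.compile(r"^0x[0-9a-fA-F]{16}$")


def format_root_guid_file(guids):
    """Return file contents with one guid per line (assumed file format)."""
    for guid in guids:
        if not GUID_RE.match(guid):
            raise ValueError(f"not a 64-bit hex guid: {guid!r}")
    return "\n".join(guids) + "\n"


def opensm_updn_args(root_guid_file=None):
    """Build the argument list that selects UPDN routing.

    Without a root guid file, '-a' is omitted and OpenSM is left to
    auto-detect the root nodes.
    """
    args = ["opensm", "-R", "updn"]
    if root_guid_file is not None:
        args += ["-a", root_guid_file]
    return args


# Placeholder guids and path, for illustration only.
contents = format_root_guid_file(["0x0002c90000000001", "0x0002c90000000002"])
auto_detect_args = opensm_updn_args()
manual_roots_args = opensm_updn_args("/tmp/root_guids.txt")
```

Validating the guid list before writing it is a deliberate choice in this sketch: a malformed entry is then caught when the file is prepared instead of surfacing later as unexpected routing behavior.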