User Manual

Rev 2.3-1.0.1
Mellanox Technologies
ary-N-Trees, by handling non-constant K, cases where not all leaves (CAs) are present, and any
Constant Bisectional Bandwidth (CBB) ratio. As in UPDN, fat-tree also prevents credit-loop
deadlocks.
If the root guid file is not provided ('-a' or '--root_guid_file' options), the topology has to be
pure fat-tree that complies with the following rules:
• Tree rank should be between two and eight (inclusive).
• Switches of the same rank should have the same number of UP-going port groups (see
  footnote 1), unless they are root switches, in which case they shouldn't have UP-going
  ports at all.
• Switches of the same rank should have the same number of DOWN-going port groups,
  unless they are leaf switches.
• Switches of the same rank should have the same number of ports in each UP-going port
  group.
• Switches of the same rank should have the same number of ports in each DOWN-going
  port group.
• All the CAs have to be at the same tree level (rank).
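The rules above can be sketched as a small consistency check. This is only an illustration under stated assumptions, not the OpenSM implementation: the `Switch` record, the `check_pure_fat_tree` function, the convention that rank 0 is the root level, and the definition of tree rank as the number of switch levels are all hypothetical and chosen for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Hypothetical switch record for this sketch; not an OpenSM structure."""
    name: str
    rank: int  # assumption: 0 = root level, increasing toward the leaves
    up_groups: list = field(default_factory=list)    # ports in each UP-going port group
    down_groups: list = field(default_factory=list)  # ports in each DOWN-going port group
    is_leaf: bool = False


def check_pure_fat_tree(switches, ca_ranks):
    """Return a list of rule violations; an empty list means all checks passed."""
    problems = []

    # Assumption: tree rank is taken as the number of switch levels.
    tree_rank = max(s.rank for s in switches) + 1
    if not 2 <= tree_rank <= 8:
        problems.append(f"tree rank {tree_rank} is outside 2..8")

    by_rank = defaultdict(list)
    for s in switches:
        by_rank[s.rank].append(s)

    for rank, group in sorted(by_rank.items()):
        if rank == 0:
            # Root switches must not have UP-going ports at all.
            if any(s.up_groups for s in group):
                problems.append("root switch has UP-going ports")
        elif len({len(s.up_groups) for s in group}) > 1:
            problems.append(f"rank {rank}: number of UP-going port groups differs")

        # Leaf switches are exempt from the DOWN-going group count rule.
        non_leaf = [s for s in group if not s.is_leaf]
        if len({len(s.down_groups) for s in non_leaf}) > 1:
            problems.append(f"rank {rank}: number of DOWN-going port groups differs")

        if len({n for s in group for n in s.up_groups}) > 1:
            problems.append(f"rank {rank}: UP-going port groups differ in size")
        if len({n for s in group for n in s.down_groups}) > 1:
            problems.append(f"rank {rank}: DOWN-going port groups differ in size")

    if len(set(ca_ranks)) > 1:
        problems.append("CAs are attached at different tree levels")
    return problems
```

For a two-level tree with two root switches and two identical leaf switches, the function returns an empty list. Changing one leaf so that its UP-going port groups have different sizes produces a single violation for that rank.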
If the root guid file is provided, the topology does not have to be pure fat-tree, and it should only
comply with the following rules:
• Tree rank should be between two and eight (inclusive).
• All the Compute Nodes (see footnote 2) have to be at the same tree level (rank). Note that
  non-compute node CAs are allowed here to be at different tree ranks.
Topologies that do not comply cause a fallback to min hop routing. Note that this can also occur
on link failures which cause the topology to no longer be a “pure” fat-tree.
Note that although the fat-tree algorithm supports trees with a non-integer CBB ratio, the routing
will not be as balanced as with an integer CBB ratio. In addition, although the algorithm
allows leaf switches to have any number of CAs, the closer the tree is to being fully populated, the
more effective the "shift" communication pattern will be. In general, even if the root list is
provided, the closer the topology is to a pure and symmetrical fat-tree, the more optimal the routing
will be.
The algorithm also dumps a compute node ordering file (opensm-ftree-ca-order.dump) in
the same directory where the OpenSM log resides. This ordering file provides the CN order that
may be used to create an efficient communication pattern that matches the routing tables.
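As an illustration of how a CN order can drive the "shift" communication pattern, the sketch below pairs each node with the node a fixed number of positions further along the order, wrapping around at the end. The function names are hypothetical, and the example takes an already-parsed list of node names rather than reading the dump file, because the file format is not described here.

```python
def shift_pattern(cn_order, step):
    """Pair each node with the node `step` positions further along the order."""
    n = len(cn_order)
    return [(cn_order[i], cn_order[(i + step) % n]) for i in range(n)]


def all_shift_steps(cn_order):
    """Yield (step, pairs) for every non-trivial shift of the given order."""
    for step in range(1, len(cn_order)):
        yield step, shift_pattern(cn_order, step)
```

With an order of four nodes `["a", "b", "c", "d"]` and a step of 1, node `a` sends to `b`, `b` to `c`, `c` to `d`, and `d` wraps around to `a`.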
Routing between non-CN Nodes
The use of the io_guid_file option allows non-CN nodes to be located on different levels in the
fat tree. In such a case, it is not guaranteed that the fat-tree algorithm will route between two
non-CN nodes. In the scheme below, N1, N2 and N3 are non-CN nodes. Although all the CNs have
routes to and from them, there will not necessarily be a route between N1, N2 and N3. Such
routes would require using at least one of the switches the wrong way around.
   Spine1     Spine2     Spine3
    /  \      / |  \      /  \
   /    \    /  |   \    /    \
 N1     Switch  N2  Switch     N3
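The "wrong way around" problem can be illustrated with a simplified model in which a route may climb toward the spines zero or more hops and then only descend. This is a sketch under assumptions, not the routing engine itself: the topology is a hypothetical one in the spirit of the scheme (two leaf switches, each linked to all three spines, with one compute node below each switch), and the names `SwitchA`, `SwitchB`, `CN1`, `CN2` and `has_up_down_path` are invented for the example.

```python
from collections import defaultdict, deque

# Assumption: rank 0 = spines, 1 = level below the spines, 2 = compute nodes.
RANK = {
    "Spine1": 0, "Spine2": 0, "Spine3": 0,
    "SwitchA": 1, "SwitchB": 1, "N1": 1, "N2": 1, "N3": 1,
    "CN1": 2, "CN2": 2,
}

# Hypothetical links: every switch reaches every spine; each non-CN node
# hangs directly below one spine; each compute node hangs below one switch.
LINKS = [
    ("Spine1", "N1"), ("Spine2", "N2"), ("Spine3", "N3"),
    ("Spine1", "SwitchA"), ("Spine2", "SwitchA"), ("Spine3", "SwitchA"),
    ("Spine1", "SwitchB"), ("Spine2", "SwitchB"), ("Spine3", "SwitchB"),
    ("SwitchA", "CN1"), ("SwitchB", "CN2"),
]

ADJ = defaultdict(set)
for a, b in LINKS:
    ADJ[a].add(b)
    ADJ[b].add(a)


def has_up_down_path(src, dst):
    """True if dst is reachable by going up zero or more hops, then only down."""
    start = (src, True)  # (node, still allowed to go up)
    queue = deque([start])
    seen = {start}
    while queue:
        node, may_go_up = queue.popleft()
        if node == dst:
            return True
        for nxt in ADJ[node]:
            going_up = RANK[nxt] < RANK[node]
            if going_up and not may_go_up:
                continue  # would use a switch the wrong way around
            state = (nxt, going_up)  # after a down hop, up hops are forbidden
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False
```

In this model, `has_up_down_path("CN1", "N3")` and `has_up_down_path("N2", "CN2")` hold, while `has_up_down_path("N1", "N2")` does not: the only way from Spine1 to Spine2 is down to a switch and back up again.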
1. Ports that are connected to the same remote switch are referred to as a ‘port group’.
2. The list of compute nodes (CNs) can be specified with the ‘-u’ or ‘--cn_guid_file’ OpenSM options.