Mellanox WinOF VPI User Manual
Rev 4.60
Mellanox Technologies
Step 5. Choose the ‘Tx Throughput Port Arbiter’ option.
Step 6. Set one of the following values:
• Best Effort (Default) - No precedence is given to this port over the other port.
• Guaranteed - This port is given higher precedence than the other port.
• Not Present - No configuration exists; the defaults are used.
3.7 RDMA over Converged Ethernet (RoCE)
3.7.1 RoCE Overview
Remote Direct Memory Access (RDMA) is a remote memory management capability that allows
server-to-server data movement directly between application memory without any CPU
involvement. RDMA over Converged Ethernet (RoCE) is a mechanism that provides this efficient
data transfer with very low latencies on lossless Ethernet networks. With advances in data center
convergence over reliable Ethernet, ConnectX®-2/ConnectX®-3 EN/ConnectX®-3 Pro EN
with RoCE uses the proven and efficient RDMA transport to provide a platform for deploying
RDMA technology in mainstream data center applications at 10GigE and 40GigE link speeds.
ConnectX®-2/ConnectX®-3/ConnectX®-3 Pro EN, with its hardware offload support, takes
advantage of these efficient RDMA transport (InfiniBand) services over Ethernet to deliver
ultra-low latency for performance-critical and transaction-intensive applications such as
financial, database, storage, and content delivery networks.
RoCE encapsulates the InfiniBand (IB) transport headers and the Global Route Header (GRH)
in Ethernet packets bearing a dedicated EtherType. While the use of the GRH is optional
within InfiniBand subnets, it is mandatory when using RoCE. Applications written over IB verbs
should work seamlessly, but they must supply GRH information when creating
address vectors. The library and driver are modified to provide the mapping from GIDs to the
MAC addresses required by the hardware.
3.7.2 RoCE Configuration
To function reliably, RoCE requires a form of flow control. While it is possible to use
global flow control, this is normally undesirable for performance reasons.
The normal and optimal way to use RoCE is with Priority Flow Control (PFC). To use PFC, it
must be enabled on all endpoints and switches in the flow path.
The following section presents instructions for configuring PFC on Mellanox ConnectX™
cards. Multiple configuration steps are required, all of which may be performed via PowerShell.
Therefore, although each step is presented individually, you may ultimately choose to write
a PowerShell script that performs them all in one step. Note that administrator privileges are
required for these steps.
For further information, please refer to:
http://blogs.technet.com/b/josebda/archive/2012/07/31/deploying-windows-server-2012-with-smb-direct-smb-over-rdma-and-the-mellanox-connectx-3-using-10gbe-40gbe-roce-step-by-step.aspx
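As a rough illustration of the single-script approach mentioned above, the following PowerShell sketch enables PFC for one priority on one port. It is a minimal sketch under stated assumptions, not the definitive procedure: the priority value (3), the adapter name ("Ethernet 4"), the policy name ("SMB") and the use of NetworkDirect port 445 (SMB Direct) are illustrative choices that do not come from this section. The cmdlets are the standard Windows Server 2012 Data Center Bridging and NetQos cmdlets. The same priority must also be configured for PFC on every switch in the flow path.

```powershell
# Sketch only. Run from an elevated (administrator) PowerShell session.
# Assumed values for illustration: priority 3, adapter "Ethernet 4",
# and SMB Direct traffic on NetworkDirect port 445.
$Priority = 3
$Adapter  = "Ethernet 4"

# Install the Data Center Bridging feature, which provides the NetQos cmdlets.
Install-WindowsFeature Data-Center-Bridging

# Use the local QoS configuration rather than settings proposed by the switch (DCBX).
Set-NetQosDcbxSetting -Willing $false -Confirm:$false

# Tag NetworkDirect (RDMA) traffic on port 445 with the chosen 802.1p priority.
New-NetQosPolicy -Name "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action $Priority

# Enable PFC for the chosen priority only and disable it for the others.
# The list below assumes $Priority is 3; change it if you pick another priority.
Enable-NetQosFlowControl  -Priority $Priority
Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7

# Apply the QoS settings on the adapter port.
Enable-NetAdapterQos -Name $Adapter
```

After running the script, Get-NetQosFlowControl and Get-NetAdapterQos can be used to check that PFC is enabled only for the intended priority on the intended port.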
3.7.2.1 Prerequisites
The driver has the following prerequisites for configuring RoCE:
• ConnectX®-3 firmware version 2.30.3000 or higher