Front cover

Implementing Cisco InfiniBand on IBM BladeCenter

Add high-speed 4X InfiniBand networking to your BladeCenter

Plan and configure the solution for your environment

Learn about the latest InfiniBand DDR products

Khalid M Ansari
Robyn McGlotten
Matt Slavin
David Watts

ibm.com/redbooks
International Technical Support Organization Implementing Cisco InfiniBand on IBM BladeCenter October 2007 REDP-3949-01
Note: Before using this information and the product it supports, read the information in “Notices” on page vii.
Contents

Notices  vii
Trademarks  viii
Preface
The team that wrote this paper
...
3.5.2 Interprocessor communication  40
3.5.3 Storage area networks  40
Chapter 4. Configuring InfiniBand with QLogic bridges
4.1 Configuring QLogic InfiniBand Bridge Modules with Linux
4.1.1 Installing the drivers
...
7.4 Using a VLAN tagged bridge group design
7.4.1 Some comments on VLANs and the Ethernet gateway module
7.4.2 Summary of steps to create a tagged VLAN bridge group design
7.4.3 Detailed steps to implement a tagged VLAN bridge group design
7.4.4 CLI reference for this section
Notices This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used.
Trademarks

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: Redbooks (logo)®, eServer™, BladeCenter®, Chipkill™, DS4000™, IBM®, PowerExecutive™, PowerPC®, Predictive Failure Analysis®, Redbooks®, ServerProven®, System x™, System Storage™.

The following terms are trademarks of other companies: Oracle, JD Edwards, PeopleSoft, Siebel, and TopLink are registered trademarks of Oracle Corporation and/or its affiliates.
Preface InfiniBand® networking offers a high speed, low latency interconnect that is often a requirement for High Performance Computing (HPC) networks. Combined with various Ethernet and Fibre Channel gateways, this technology also offers a simplified multifabric I/O to permit InfiniBand clients to access more traditional interconnects, using InfiniBand as the single unifying infrastructure.
include infrastructure design and support, with a special focus on high density data center networking and security. David Watts is a Consulting IT Specialist at the IBM ITSO Center in Raleigh. He manages residencies and produces IBM Redbooks publications on hardware and software topics related to IBM System x and BladeCenter servers and associated client platforms. He has authored over 80 books, papers and technotes.
Become a published author Join us for a two- to six-week residency program! Help write a book dealing with specific products or solutions, while getting hands-on experience with leading-edge technologies. You will have the opportunity to team with IBM technical professionals, Business Partners, and Clients. Your efforts will help increase product acceptance and customer satisfaction. As a bonus, you will develop a network of contacts in IBM development labs, and increase your productivity and marketability.
1 Chapter 1. IBM BladeCenter products and technology Blade servers are thin servers that insert into a single rack-mounted chassis which supplies shared power, cooling, and networking infrastructure. Each server is an independent server with its own processors, memory, storage, network controllers, operating system, and applications. Blade servers came to market around 2000, initially to meet clients’ needs for greater ease of administration and increased server density in the data center environment.
1.1 BladeCenter chassis

There are four chassis in the BladeCenter family:
- IBM BladeCenter provides the greatest density and common fabric support and is the lowest entry cost option. To eliminate any confusion, this chassis is also known as IBM BladeCenter 8677 or IBM BladeCenter Enterprise.
- IBM BladeCenter H delivers high performance, extreme reliability, and ultimate flexibility for the most demanding IT environments.
Figure 1-1 and Figure 1-2 display the front view of an IBM BladeCenter H.

Figure 1-1 BladeCenter H front view

Figure 1-2 Diagram of BladeCenter H front view with the key features of the BladeCenter H chassis (callouts: power modules 1 and 2, power module bays 3 and 4, blade server control panel, media tray, front system LED panel, CD/DVD drive activity LED, CD/DVD drive eject button, system service cards, and USB connectors)
The key features on the front of the BladeCenter H include:
- A media tray at the front right, with a DVD drive, two USB v2.0 ports, and a system status LED panel.
- One pair of 2900-watt power modules. An additional power module option (containing two 2900 W power modules) is available.
- Two hot swap fan modules. (Two extra hot swap fan modules are included with the additional power module option.)
- 14 hot swap blade server bays supporting different blade server types.
Figure 1-4 Diagram of BladeCenter H rear view showing the key features of the BladeCenter H chassis (callouts: I/O module bays 1 through 10, power connectors 1 and 2, management module 1 and management module bay 2, blower modules 1 and 2 with their error LEDs, the rear system LED panel, and the serial connector)
The benefits of the blade approach will be obvious to anyone tasked with running down hundreds of cables strung through racks just to add and remove servers. With switches and power units shared, precious space is freed and blade servers enable higher density with far greater ease.
brownouts; rather than shut down completely or fail, the HS21 XM reduces the processor frequency automatically to maintain acceptable thermal and power levels. All HS21 XM models also include support for one SAS hard disk drive and one USB-based modular flash drive. The optional 30 mm Storage and I/O (SIO) Expansion Unit connects to a blade (model-dependent) to provide an additional three 2.5-inch SAS HDDs with hot-swap support, optional RAID-5 with battery-backed cache, and four additional communication ports.
Three-year, on-site limited warranty. Table 1-2 provides details on the features of the HS21 XM.
Figure 1-6 IBM BladeCenter LS21

Table 1-3 summarizes the features of the LS21.

Table 1-3 Features of the LS21
- Processor: AMD Opteron Rev F Model 2212, 2212HE, 2216HE, and 2218
- Number of processors (std/max): 1/2
- Cache: 1 MB L2 per processor core
- Memory: 8 VLP DIMM slots / DDR2 667 / 32 GB maximum
- Internal hard disk drives (standard/maximum): 0/1
- Maximum internal storage: On board, one 2.5-inch non-hot-swap SAS HDD; optional, the SIO blade offers support for three additional 2.5-inch SAS HDDs
- One slot for an I/O expansion adapter, either standard form factor (StFF) or small form factor (SFF) design.
- Light Path Diagnostics on the system board speeds recovery from individual blade server failures.
- Integrated BMC management processor.
- Integrated SAS controller with support for up to two fixed 2.5-inch SFF SAS hard disks and support for RAID-0 and RAID-1.

Table 1-4 provides details of the features of the JS21.
The IBM BladeCenter H chassis has a total of ten I/O bays. Each blade bay has a total of eight dedicated connection paths to the I/O modules. See Figure 1-7.
The I/O bays are connected to two separate and redundant midplanes and the blade servers and expansion cards in the blades have ports that connect to both midplanes. The midplanes that connect each of the bays are shown in Table 1-5.
For a list of supported combinations of high-speed modules and high-speed expansion cards for BladeCenter H, see Table 1-7.

Table 1-7 Supported combinations of high-speed I/O modules and I/O expansion cards - BladeCenter H
Columns (module type and the I/O module bays used): ESM, CPM, and FCSM (bays 7 and 9, or 8 and 10); OPM (bays 7, 8, 9, and 10); HSIBSM (bays 7 and 9); HSESM (bays 7 and 9)
- 39Y9271 NetXen 10 Gb Ethernet Expansion Card (CFFh): No / No / No / No / Yes
- 39R8624 QLogic Ethernet and 4 Gb FC Exp.
The 4X InfiniBand HCAs installed in the blade servers are PCI Express x4 cards with two output ports that are routed to bays 7 and 9. Two 4X InfiniBand high-speed switch modules are installed in these bays (two modules provide redundancy). Because the 4X InfiniBand HCA is a two-port card and not a four-port card, bays 8 and 10 are not connected. The bridge module bays used in this configuration are bays 3 and 5, as shown in Figure 1-8.
2 Chapter 2. InfiniBand products for BladeCenter This chapter discusses the InfiniBand products that are available for the BladeCenter H. From a hardware perspective, this discussion includes the Cisco 4X InfiniBand Switch Module and associated 4X HCA and cables, the QLogic InfiniBand to Ethernet and InfiniBand to Fibre Channel Bridge Modules, and some newly announced DDR InfiniBand products. From a software perspective, this discussion introduces the VFrame for InfiniBand virtualization product.
2.1 Cisco 4X InfiniBand Switch Module

The Cisco 4X InfiniBand Switch Module for IBM BladeCenter, part number 32R1756, adds InfiniBand switching capability to hosts in your IBM BladeCenter H chassis. When you add one or two switch modules to your BladeCenter H chassis and add an HCA expansion card to your blade servers, your servers can communicate with one another over InfiniBand within the chassis.
management modules) to facilitate setup and management. After you set up a switch module and bring it online, the on-board Cisco Subnet Manager brings distributed intelligence to the InfiniBand network. Within the BladeCenter chassis, Cisco InfiniBand switch modules manage traffic to and from HCA expansion cards on the BladeCenter hosts. Each HCA expansion card adds two InfiniBand ports to a BladeCenter host. Each HCA port connects through the unit backplane to a particular switch module slot.
You can manage your InfiniBand switch module with any of the following interfaces:
- Simple Network Management Protocol (SNMP) versions 1, 2, and 3 with Cisco's Management Information Bases (MIBs)
- TopspinOS command line interface (CLI)
- Chassis Manager Web-based GUI
- Element Manager Java™-based GUI
- APIs (through SNMP)

Note: To implement the VFrame solution using the Cisco InfiniBand switch module, you need to purchase Cisco VFrame Server Fabric Virtualization software separately from Cisco resellers.
The Cisco 4X InfiniBand HCA Expansion Card is a High Speed Form Factor (HSFF) card (see Figure 2-3). It requires that 4X InfiniBand Switch Modules be installed in bay 7 or bay 9 (or both, for redundancy).

Figure 2-3 Cisco 4X InfiniBand HCA Expansion Card

2.3 Cisco VFrame for InfiniBand virtualization software

Cisco VFrame for InfiniBand virtualization software, used in conjunction with Cisco InfiniBand switches and gateways, provides server and I/O virtualization capabilities for customers.
Note: If the bridge module is installed in bay 4 or 6, there will not be an internal InfiniBand connection because the 4X InfiniBand HCA does not have a third and fourth port routed to bays 8 and 10. See 1.4, “High-speed I/O paths in BladeCenter H” on page 13.

The Ethernet Bridge Module offers six 10/100/1000 Mbps external Ethernet RJ45 ports to connect to an upstream Ethernet network, and two 4X internal InfiniBand ports to connect to the 4X switch modules in slots 7 and 9. See Figure 2-4.
- Six external autosensing 10/100/1000 Mbps RJ-45 Ethernet (copper) ports
- 802.3ad link aggregation
- Support for jumbo frames
- Support of IEEE 802.1q VLAN tagging
- Support for up to 1150 Virtual NIC ports per module
- Automatic port and module failover
- TCP/UDP and IP header checksum offload and host checking
through Ethernet ports that connect from the bridge module to the BladeCenter H Management Module slots. As with the Ethernet Bridge Module, this product can help simplify server deployments by allowing the blade server to have only a single physical interconnect (InfiniBand) and still have access to other technologies, such as Fibre Channel through the QLogic InfiniBand to FC bridge module.
- Logical unit number (LUN) mapping and masking
- SCSI-SRP, SCSI-FCP, and FC-PH-3 compliant
- Switched internal I2C interface to the management modules
- Two internal 100 Mbps full-duplex Ethernet links to the management modules
- Power-on diagnostics and status reporting
- Support of Simple Network Management Protocol (SNMP) management information bases (MIBs) and traps through the Ethernet management ports

The bridge module supports the following management methods:
- CLI through Telnet or SSH
- Web-based GUI
2.6.1 InfiniBand 4X DDR Pass-thru Module With 14 InfiniBand 4X DDR ports towards the servers and 14 InfiniBand 4X DDR ports toward the upstream network, the 4X InfiniBand DDR Pass-thru Module, part number 43W4419, offers full non-blocking 4X DDR support to all 14 blade servers in a BladeCenter H chassis. The InfiniBand Pass-thru Module is a double-height module, and up to two can be installed in an IBM BladeCenter H, utilizing either switch bays 7 and 8, or switch bays 9 and 10 in the rear of the chassis.
Note: The QLogic Bridge modules are not compatible with the pass-thru module, because there is no connectivity from the module to the bridge slots. Also, unlike other InfiniBand switch modules for the BladeCenter, if InfiniBand communications are desired between blade servers in the same chassis, some sort of external InfiniBand connectivity must be provided. 2.6.2 InfiniBand DDR HCAs Three recently added 4X DDR HCAs are also available, resulting from a partnership between IBM and Mellanox.
HCAs. However, there are plans to support both the existing Cisco 4X InfiniBand HCA, and the Cisco 4X InfiniBand Switch Module. Check IBM ServerProven® for the latest support information: http://www.ibm.com/servers/eserver/serverproven/compat/us/ 2.7 Cisco SFS 3012R Multifabric Server Switch While technically not part of the IBM BladeCenter H systems, the SFS 3012R is an important component of any InfiniBand deployment, by offering Multifabric I/O (MFIO) to InfiniBand clients.
Figure 2-8 Left - 2 port Fibre Channel gateway module, right - 6 port Ethernet gateway module

The Cisco SFS 3012R is designed with enterprise-class redundancy in mind. By combining dynamic load balancing and port aggregation, the Cisco SFS 3012R minimizes failure points in switch blades, controllers, gateways, and ports. All removable components are also hot-swappable, including controllers, gateway modules, power, and cooling.
The following are some key features and benefits of the Ethernet gateway module:
- Load distribution: supports redundancy groups across multiple gateways and multiple chassis
- High-availability options: flexible deployment in active-active or active-passive modes to eliminate single points of failure
- DHCP relay support: allows DHCP to work across Ethernet and InfiniBand fabrics

The following are some key features and benefits of the Fibre Channel gateway module:
- Virtual I/O for Fibre Channel: allows a group of servers to share a pool of centralized Fibre Channel I/O resources.
Feature: Description
- Reliability and availability: redundant (active/standby) control processor modules; redundant, hot-swappable AC power supply modules and cooling; dual AC inputs; hot-plug expansion modules; fully passive backplane; deployable in redundant pairs
- Physical dimensions: rack-mountable in a standard 19-inch EIA rack; 7 inch (4RU) height; 24 inch depth; 30–95 lb
- Approvals and compliance: FCC CFR 47 Part 15, Subpart B, Class A; UL60950, 3rd ed.; ICES-003 Issue 2; CSA 22.2 No.
Feature: Description
- Link speed negotiation: automatic, 10/100/1000 Mbps
- IP protocols: transparent topology emulation; IP over InfiniBand (IPoIB)

Table 2-6 lists the SFS 3012R Fibre Channel gateway module specifications.
3 Chapter 3. InfiniBand technology InfiniBand is an industry-standard, switch-based serial I/O interconnect architecture that features high-speed, low-latency interconnects. InfiniBand has the ability to combine networks into a single unified fabric that maximizes bandwidth and is easily scalable. Similar to an Ethernet network, InfiniBand resources can be separated into function-specific subnets. InfiniBand provides Quality of Service (QoS) and reliability, availability, and serviceability (RAS).
3.1 InfiniBand Network Layer Model InfiniBand uses a multi-layer architecture (similar to the seven layer OSI model) to transfer data between nodes, as shown in Figure 3-1. Each layer is responsible for separate tasks in passing messages.
InfiniBand specifies multiple transport services for data reliability. Table 3-1 describes each of the supported services. For a given queue pair, one transport level is used.

Table 3-1 Transport services
- Reliable connection: acknowledged, connection oriented
- Reliable datagram: acknowledged, multiplexed
- Unreliable connection: unacknowledged, connection oriented
- Unreliable datagram: unacknowledged, connectionless
- Raw datagram: unacknowledged, connectionless
As a packet traverses the subnet, a service level (SL) is defined to ensure its QoS level. Each link along a path can have a different VL, and the SL provides each link a desired priority of communication. Each switch/router has an SL-to-VL mapping table that is set by the subnet manager to keep the proper priority with the number of VLs supported on each link. Therefore, the InfiniBand Architecture can ensure end-to-end QoS through switches, routers, and over the long haul.
Figure 3-2 InfiniBand physical link (1X, 4X, and 12X link widths)

3.2 InfiniBand protocols

InfiniBand supports a number of upper-layer protocols (ULPs) that enable InfiniBand to be exploited by different types of software with different requirements and objectives.

3.2.1 Internet Protocol over InfiniBand

Internet Protocol over InfiniBand (IPoIB) is the lowest-level existing network interface that is provided on InfiniBand.
enqueue the requested work, and return immediately with the request not completed (often, not even started) and the data buffers still in use. The use of asynchronous sockets is standard under the Windows® operating systems. Sockets-based communications for Windows programs is always asynchronous, and therefore requires no modification to operate over SDP. Historically, UNIX® and Linux® applications have used synchronous sockets and, therefore, do not map transparently into the SDP protocol.
underlying iSCSI-like mechanism. This does not preclude emulating a Fibre Channel connection, while enabling a much more functional mode, where applications can have visibility to mixed SAN fabric environments without having to understand and account for the differences between those fabrics.
3.3 Hardware

This section describes the hardware used in an InfiniBand configuration:

Host Channel Adapters
A Host Channel Adapter (HCA) provides a processor node, such as a server or workstation, with a connection port to other InfiniBand devices. An HCA can have one or multiple ports and can be an add-in card on a standard interconnect bus or integrated on the system main board. The adapter can be connected to an InfiniBand switch, a target device, or another HCA.
service and does not follow the same flow control restriction as other VLs on the links. Subnet management information is passed through the subnet ahead of all other traffic on a link.

Subnet Manager (SM)
The SM monitors the health and performance of the InfiniBand subnet, maintains topology information, and provides routing information to all of the switches in the network. These responsibilities include LID assignment, SL-to-VL mapping, link bringup and teardown, and link failover.
3.5.1 Application clustering The Internet today has evolved into a global infrastructure supporting applications such as streaming media, business-to-business solutions, e-commerce, and interactive portal sites. Each of these applications must support an ever-increasing volume of data and demand for reliability. Service providers are in turn experiencing tremendous pressure to support these applications.
4 Chapter 4. Configuring InfiniBand with QLogic bridges

In this chapter, we discuss the steps to connect and configure the hardware for the following basic tasks:
1. Install the QLogic InfiniBand Bridge Module drivers for Windows and Linux.
2. Configure the Virtual NIC driver for the InfiniBand Ethernet Bridge Module.
3. Configure IPoIB.
4. Configure SRP for the InfiniBand Fibre Channel Bridge Module.

For these tasks, we used a basic configuration as shown in Figure 4-1.
4.1 Configuring QLogic InfiniBand Bridge Modules with Linux

This section shows how to install the QLogic bridge module drivers for Linux operating systems, how to configure the InfiniBand Ethernet Bridge Module, and how to configure the InfiniBand Fibre Channel Bridge Module. The steps are:
- 4.1.1, “Installing the drivers”
- 4.1.2, “QLogic InfiniBand Ethernet Bridge Module: Configuring VNIC” on page 45
- 4.1.3, “QLogic InfiniServ Driver: Configuring IPoIB” on page 48
- 4.1.4
SilverStorm Technologies Inc. IBM Install (4.1.0.6.
Installing IB Network Stack... Adding module dependencies... Adding memory locking limits... Copying ibt.ko... Copying ics_dsc.ko... Copying 82808XA.ko... Copying mt23108vpd.ko... Copying mt25218vpd.ko... Creating IB Network Stack (iba) system startup files... Creating IB Port Monitor (iba_mon) system startup files... ------------------------------------------------------------------------------Installing IB Development...
7. The installation continues. As shown in Figure 4-5, when prompted to install the SilverStorm firmware onto the HCA, select n to select no. Generating module dependencies... Updating HCA Firmware ... Select HCAs to Update: 1) HCA 1 (25208 Rev a0 psid "" Node GUID: 0x0005ad0000050464) Selection (a for all, n for none) [a]: n ------------------------------------------------------------------------------Updating dynamic linker cache... Figure 4-5 Do not install SilverStorm firmware 8.
To create virtual network interfaces on a Linux blade server, complete the following steps. 1. Get the IO Controller (IOC) numbers for the Ethernet bridge ports by running the command shown in Figure 4-7.
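The VNIC driver exposes its state through procfs, in the same way that the SRP driver does in Figure 4-19. A hedged sketch of such a query, assuming the VNIC driver registers under /proc/driver/ics_inic (the exact file name under that directory is an assumption; it should list each Ethernet bridge IOC and its GUID):

[root@localhost ~]# cat /proc/driver/ics_inic/driver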
For this setup, we used the settings (config file) shown in Figure 4-10. DEVICE=eioc1 BOOTPROTO=static IPADDR=192.168.199.4 NETMASK=255.255.255.0 ONBOOT=yes TYPE=Ethernet Figure 4-10 The eioc config file 5. Restart VNIC using the command ics_inic restart so that the new VNIC configuration settings (ics_inic.cfg) are applied, as shown in Figure 4-11. [root@localhost ~]# /etc/init.
7. Verify that your connection works by pinging another device on the same subnet, as shown in Figure 4-13.

[root@localhost network-scripts]# ping 192.168.199.254
PING 192.168.199.254 (192.168.199.254) 56(84) bytes of data.
64 bytes from 192.168.199.254: icmp_seq=1 ttl=255 time=0.292 ms
64 bytes from 192.168.199.254: icmp_seq=2 ttl=255 time=0.299 ms
64 bytes from 192.168.199.254: icmp_seq=3 ttl=255 time=0.302 ms
64 bytes from 192.168.199.254: icmp_seq=4 ttl=255 time=0.329 ms

--- 192.168.199.254 ping statistics ---
Tip: If no information is returned on the port, verify that the system can see the HCA by running the lspci command. Also, check to verify that the HCA driver is loaded using the lsmod command. See 4.1.5, “Hints and tips” on page 60 for more details. 2. Open the IPoIB config file /etc/sysconfig/ipoib.cfg in a text editor. 3. Uncomment the lines in the ipoib.cfg file as shown in Figure 4-15 (at the bottom) and update the information based on your desired configuration.
6. Check the new IP settings with the ifconfig command as shown in Figure 4-18. [root@localhost network-scripts]# ifconfig eth0 Link encap:Ethernet HWaddr 00:14:5E:D6:16:A8 inet addr:9.42.166.100 Bcast:9.42.166.255 Mask:255.255.255.0 inet6 addr: fe80::214:5eff:fed6:16a8/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:603 errors:0 dropped:0 overruns:0 frame:0 TX packets:25 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:44940 (43.
To create virtual SRP interfaces on a Linux blade server, complete these steps: 1. Each bridge module has two IO Controllers (IOC). Find the SRP IOCs of your bridge module by displaying the SRP driver information using the cat command as shown in Figure 4-19. The driver file is located in the folder /proc/driver/ics_srp. [root@localhost ~]# cat /proc/driver/ics_srp/driver SilverStorm Technologies Inc. Virtual HBA (SRP) SCSI Driver, version 4.1.0.6.3 Built for Linux Kernel 2.6.9-42.
3. Uncomment the lines in the ics_srp.cfg file as shown in Figure 4-20 (at the bottom) and update the information based on your desired configuration. session begin card: 0 port: 1 targetIOCGuid: 0x00066a01e000018c initiatorExtension: 1 end adapter begin description: "ITSO HBA 1" end Figure 4-20 The ics_srp.cfg file This creates a virtual Fibre Channel HBA (SRP initiator) on your system that communicates through the specified SRP IO Controller on your Fibre Channel bridge module.
5. Open the Web interface of your bridge module and click SRP Initiator Discovery, then click Start on the upper, right side of the page as shown in Figure 4-21. The SRP initiators that you created should be discovered and displayed under Discovered Hosts. Figure 4-21 SRP Initiator Discovery screen The following components are used to identify your virtual SRP initiator: – GUID: InfiniBand HCA Port GUID that is being used for the virtual HBA.
8. Each Fibre Channel port on the bridge module has a port world wide name (WWN). On the storage device, the bridge port WWN needs to be mapped to the drives. A Fibre Channel connection has to be established between the bridge port and the storage device for the storage device to see the bridge port WWN. Figure 4-24 Bridge Port information Figure 4-25 Bridge Port WWN on IBM FAStT DS4300 9.
10.Click Configure to designate a name for the device as shown in Figure 4-27. This name will be used later when mapping to an SRP Initiator. Figure 4-27 Assign a name to the storage device 11.After you designate a name, click Submit. The remote drive moves from Discovered Devices to Configured Devices (Figure 4-28). Figure 4-28 Configured Storage Device 12.Close the FCP Device Discovery window and click SRP map config on the GUI main menu page.
Figure 4-29 Add explicit map

14.Select a storage target and identify the host LUN and target LUN. Then click Add Row. The host LUN is the LUN number that displays on the Linux host. The target LUN is the LUN number that was determined for the logical drive on the storage device. When you are done adding all logical drives that you want mapped to this virtual HBA, click Finish.
Figure 4-30 Explicit map configuration 15.Restart SRP on the host so that it picks up the new mapping. Check for the remote drives using the command cat /proc/scsi/scsi. Figure 4-31 shows sample output.
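The /proc/scsi/scsi listing uses the standard Linux format. An illustrative sketch of what a mapped remote drive looks like (the vendor, model, and address values here are placeholders, not values captured from this setup):

Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM      Model: 1722-600         Rev: 0520
  Type:   Direct-Access                    ANSI SCSI revision: 03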
Figure 4-32 Choose Direct Map 19.Select a storage target and click Finish (Figure 4-33).
20.Restart SRP on the host so that it picks up the new mapping. Check for the remote drives using the command cat /proc/scsi/scsi. Sample output is shown in Figure 4-34.
4.1.5 Hints and tips

The following information might help you resolve connectivity issues with your Linux installation:

Verify that the HCA can be seen by the operating system. If the HCA is not detected, use the Linux command lspci as shown in Figure 4-36 to ensure that the operating system can see the card. If the card is not detected, shut down the blade and ensure that the HCA is properly seated in the blade.
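To confirm that the driver modules are loaded, lsmod can be filtered for the module names that the installer copied in during 4.1.1 (ibt, ics_dsc, and the vpd HCA modules); the grep pattern is just one convenient filter:

[root@localhost ~]# lsmod | grep -E 'ibt|ics_dsc|vpd'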
Restart network services. If the virtual NIC does not come up, restart the Linux network service using the command service network restart, as shown in Figure 4-39, so that the eioc interface profile (ifcfg-eiocX) is re-initiated.
Check the InfiniBand switch Web interface to verify that you have an active connection to the bridge module.

4.2 Configuring QLogic InfiniBand Bridge Modules with Windows

This section shows how to install the QLogic bridge module drivers for Windows operating systems, how to configure the InfiniBand Ethernet Bridge Module, and how to configure the InfiniBand Fibre Channel Bridge Module. The steps are:
- 4.2.1, “Installing the HCA Drivers” on page 62
- 4.2.2
7. Click Next to continue the driver installation. The wizard finds and installs the InfiniHost Mellanox InfiniBand HCA for PCI Express. When the driver installation is complete, you should see an entry for the HCA in Device Manager under InfiniBand Host Channel Adapters, as shown in Figure 4-43.

Figure 4-43 InfiniBand HCA in Windows Device Manager
4. Click Next to continue driver installation. When the driver installation is complete, you see the bridge module under System Devices in Device Manager, as shown in Figure 4-44.

Figure 4-44 Bridge in System Devices

5. Windows now finds the SilverStorm Technologies VEx I/O Controller. Select Install from a specific location and click Next.
6. Click Browse under Include this location in the search. Select the following location and click OK. C:\Program Files\SilverStorm\SilverStorm HCA\net
7.
9. Make an Ethernet connection from the bridge to a test or other network. Inside Network Connections, you see six Local Area Connections, one for each virtual NIC, as shown in Figure 4-46. Figure 4-46 Virtual NICs in Windows Network Connections 10.Set an IP address as you would with any Ethernet adapter and save your settings. Ping another device on the test network to test your connection as shown in Figure 4-47. C:\>ping 192.168.70.111 Pinging 192.168.1.
needed. However, without the bridge module, you cannot ping a true Ethernet device—just another InfiniBand interface running IP over InfiniBand. Follow these steps:
1. Windows finds the Open IPoIB Adapter. Select Install from a specific location and click Next.
2. Click the Browse button under Include this location in the search. Select the following location and click OK: C:\Program Files\SilverStorm\SilverStorm HCA\net
3. Click Next to continue the driver installation.
6. Set an IP address as you would with any Ethernet adapter and save your settings. See Figure 4-50. Figure 4-50 Local Area Connection Properties 7. Ping another device on the test network to test your connection. If both blades are in the same chassis, you do not need any external cables as long as a functioning InfiniBand switch is in one of the supported I/O bays. If the devices are in two separate chassis, you need an InfiniBand cable to connect the InfiniBand switch modules in the respective chassis.
3. Click Next to continue driver installation. When the driver installation is complete, you see the bridge module under System Devices in Device Manager, as shown in Figure 4-51.

Figure 4-51 Fibre Channel Bridge in Device Manager

4. Windows now finds the SilverStorm Technologies VFx I/O Controller. Select Install from a specific location and click Next.
5. Click Browse under Include this location in the search. Select the following location and then click OK.
7. Repeat steps 4 through 6 for the other VFx Controller. There are a total of two controllers. When the driver installation is complete, you see two SilverStorm VFx I/O Controllers under SCSI and RAID Controllers in Device Manager, as shown in Figure 4-52.

Figure 4-52 VFx Controllers in Device Manager

The VFx Controllers will have exclamation points (!) on them until an SRP mapping is created. Each bridge module has two I/O Controllers (IOC), thus the two VFx controllers in Device Manager.
8.
9. Click Configure to designate a name for the initiator (Figure 4-54). This name will be used later when mapping to a storage target.

Figure 4-54 Assign name to SRP Initiator

10.After you designate a name, click Submit. The initiator moves from Discovered Hosts to Configured Initiators (Figure 4-55).

Figure 4-55 Configured SRP Initiator

11.Each Fibre Channel port on the bridge module has a port WWN. On the storage device, the bridge port WWN needs to be mapped to the logical drives.
12.After you establish the bridge port to remote drive mapping, you need to establish another mapping from SRP initiator to remote drive. On the bridge Web GUI main menu, click FCP Device Discovery, and then Start. The remote drives that are mapped to the bridge ports will show up in Discovered Devices (Figure 4-58). Figure 4-58 Discovered Fibre Channel Target 13.Click Configure to designate a name for the device (Figure 4-59). This name will be used later when mapping to an SRP Initiator.
Figure 4-61 Add Explicit Map

17.Select a storage target and identify the host LUN and target LUN, then click Add Row. The host LUN is the LUN number that displays on the host. The target LUN is the LUN number that was determined for the logical drive on the storage device. When you are done adding all logical drives that you want mapped to this virtual HBA, click Finish.
Figure 4-62 Add Remote Drive

Figure 4-63 Completed SRP Map

18.Back on the Windows blade, open Device Manager. Find the VFx (I/O) Controller that you used to create your SRP mapping. Right-click the controller to disable and then re-enable
it. The exclamation point (!) should go away, and your mapping is now active. I/O Controller 1 was used for the explicit mapping. Figure 4-64 VFx Controller in Device Manager 19.You should now see your remote drives in Disk Management, as shown in Figure 4-65.
20.On the bridge web GUI, the boxes beside the SRP initiator name and the IOC Map name indicate the active connections to the virtual HBA, as shown in Figure 4-66. It should go from 0 to 1. You might need to click Refresh on the SRP map page. Figure 4-66 Active SRP Map 21.Now you create a direct mapping. Click SRP map config on the GUI main menu page. Here you see the virtual HBAs listed: – If you want to map your target through IOC 1, click the Click To Add link under the IOC 1 heading.
22.Give the map a name and select Direct type, and then click Next.
23.Select a storage target and click Finish.

Figure 4-68 Choose Direct Map
24.Back on the Windows blade, open Device Manager. Find the VFx (I/O) Controller that you used to create your direct SRP mapping (Figure 4-69). Right-click the controller to disable then re-enable it. The exclamation point (!) should go away and your mapping is now active. I/O Controller 2 was used for the direct mapping.
25.You should now see your additional remote drives in Disk Management, as shown in Figure 4-70.

Figure 4-70 Remote Drives in Disk Management
26.On the bridge Web GUI, the boxes beside the SRP initiator name and the IOC Map name indicate the active connections to the virtual HBA, as shown in Figure 4-71. The count should go from 0 to 1. You might need to click Refresh on the SRP map page.
5 Chapter 5. Configuring IP over InfiniBand for use with Cisco switches This chapter covers the details of implementing IP over InfiniBand (IPoIB) on BladeCenter HS21 servers with the Cisco InfiniBand SDR dual port 4X Host Channel Adapter (HCA) installed. It includes procedures for both Red Hat Enterprise Linux 4 U4 and Windows Server® 2003 x64.
Figure 5-1 BCH high speed InfiniBand Switch Module GUI

The topics that we discuss in this chapter are:
- 5.1, “Configuring IPoIB on a Linux host” on page 82
- 5.2, “Configuring the IPoIB on a Windows host” on page 88

5.1 Configuring IPoIB on a Linux host

This section covers the step-by-step procedure of installing the InfiniBand host drivers and configuring IPoIB on the blade servers running Linux.

5.1.1
The choice of which driver to use is based on the customer's specific requirements, which are usually driven by what upper-layer applications will make use of the drivers. In this section, we focus on the Cisco OFED drivers. For more information about the Cisco InfiniBand host drivers, refer to the release notes and user guide available at the Cisco software download site. You can obtain the Cisco OFED and SRP host drivers for Linux from the download site at: http://www.cisco.
libmthca-1.0.2-1 dapl-devel-1.2.0-1 libibcommon-devel-1.0-1 libipathverbs-devel-1.0-1 opensm-devel-1.2.0-1 libibmad-devel-1.0-1 opensm-libs-1.2.0-1 dapl-1.2.0-1 libibcm-0.9.0-1 libibverbs-utils-1.0.3-1 mstflint-1.0-1 libipathverbs-1.0-1 libibverbs-1.0.3-1 libibumad-devel-1.0-1 kernel-ib-1.0-1 libmthca-devel-1.0.2-1 opensm-1.2.0-1 srptools-0.0.4-1 tvflash-0.9.0-1 Installing packages. Preparing... 1:kernel-ib Preparing... 1:kernel-ib-devel Preparing... 1:ib-bonding Preparing...
28:mpitests_mvapich2_gcc ########################################### 29:mpitests_mvapich2_intel########################################### 30:mpitests_mvapich2_pgi ########################################### 31:mpitests_mvapich_gcc ########################################### 32:mpitests_mvapich_intel ########################################### 33:mpitests_mvapich_pgi ########################################### 34:mpitests_openmpi_gcc ########################################### 35:mpitests_openmpi_intel ####
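With the packages installed, a quick sanity check from the shell confirms the stack version and that the IPoIB interface exists. The ofed_info script ships with standard OFED builds (its presence in the Cisco build is an assumption):

[root@localhost ~]# ofed_info | head -1
[root@localhost ~]# ifconfig ib0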
5.1.3 Configuring IP over InfiniBand The IPoIB protocol passes IP traffic over the InfiniBand network. Configuring IPoIB is very similar to configuring IP on the Ethernet interfaces. The procedure for configuring IPoIB using the Cisco SRP drivers is also the same, but not specifically shown in this chapter. To configure IPoIB, you assign an IP address and subnet mask to each InfiniBand port on the host. IPoIB automatically adds InfiniBand interface names to the IP network configuration.
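Because IPoIB interfaces are managed like any other network interface, the address can also be made persistent with a standard Red Hat ifcfg file, in the same style as the eioc file shown in Figure 4-10, assuming the OFED init scripts pick up standard ifcfg files. A minimal sketch of /etc/sysconfig/network-scripts/ifcfg-ib0, using the address that appears in the verification output below:

DEVICE=ib0
BOOTPROTO=static
IPADDR=172.16.240.10
NETMASK=255.255.255.0
ONBOOT=yes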
ib0     Link encap:InfiniBand  HWaddr 80:00:04:04:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
        inet addr:172.16.240.10  Bcast:172.16.240.255  Mask:255.255.255.0
        inet6 addr: fe80::205:ad00:5:391/64 Scope:Link
        UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
        RX packets:0 errors:0 dropped:0 overruns:0 frame:0
        TX packets:79 errors:0 dropped:0 overruns:0 carrier:0
        collisions:0 txqueuelen:128
        RX bytes:0 (0.0 b)  TX bytes:17380 (16.9 KiB)
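To verify connectivity at this point, ping the ib0 address of another host on the same InfiniBand subnet (172.16.240.11 is a hypothetical second host):

[root@localhost ~]# ping -c 4 172.16.240.11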
5.2 Configuring the IPoIB on a Windows host

The following sections cover the procedure to install the InfiniBand host drivers on an HS21 blade with Windows Server 2003 x64.

5.2.1 Installing Windows host drivers

Prior to beginning this process, you might need to download the Cisco InfiniBand host drivers for Windows, which are available at: http://www.cisco.com/pcgi-bin/tablebuild.pl?topic=280511613

To install drivers after you install the HCA, perform the following steps:
1.
2. Insert the InfiniBand host driver CD or download the Windows Server 2003 drivers on the blade and verify the CD contents as shown in Figure 5-3. Figure 5-3 Verify InfiniBand Host driver CD contents 3. Run the topspin-ib-W2k3-x86-es-2.0.3-214.exe file as shown in Figure 5-3. 4. Begin the installation. The Product Install window displays. Click Next to proceed with installation. 5. Read the license and select I Agree to the terms listed above. Then click Next to continue. 6.
5.2.2 Verifying the installation Follow these steps to verify the installation: 1. To verify the driver installation, log in to your host and check whether the Topspin InfiniBand SDK menu is listed under All Programs as shown in Figure 5-5. Figure 5-5 Verify InfiniBand Windows Host Driver menu under Program Files 2. Open Device Manager to verify that the InfiniBand HCA is listed as shown in Figure 5-6.
3. Verify that a Topspin IPoIB Virtual Channel Adapter for each port on the blade server is listed in the Network Connections menu, as shown in Figure 5-7.

Figure 5-7 Verify IP over InfiniBand virtual interfaces from Network Connections menu

5.2.3 Configuring IPoIB

Now that the required drivers are installed and configured, the next step is to configure IPoIB. The IPoIB driver is automatically initialized when the installation is successfully completed.
gateway assigned), using the Network connections menu. The first port on the HCA becomes interface ib0 and the second port on the same HCA is interface ib1 on the blade server. Complete the following steps: 1. From the Windows Network Connections menu, right-click one of the IPoIB interfaces and select Properties as shown in Figure 5-8. Figure 5-8 configure IPoIB Interface, select Properties 2. Select Internet Protocol (TCP/IP) and then click Properties as shown in Figure 5-9.
3. Enter an IP address, subnet mask, and optionally a default gateway. Then, click OK as shown in Figure 5-10. Figure 5-10 Internet Protocol TCP/IP Properties menu 4. Click OK to apply the TCP/IP configuration. 5. Upon successful binding of IP address and subnet mask information to the interface ib0, verify successful initialization of the ib0 interface from the command prompt using the command ipconfig, as shown in Figure 5-11.
This completes the IPoIB configuration process. 5.2.4 Verifying the IPoIB communication To verify IPoIB functionality on Linux hosts: 1. Login in to the hosts and ensure that the IPoIB interfaces on the source and destination nodes are up by using the ifconfig command specified in Example 5-6 on page 86. 2. Issue a ping command between a source node to a destination node to ensure successful IPoIB communication as shown in Figure 5-12. Pinging 172.16.240.
6 Chapter 6. Boot from InfiniBand using the Cisco 3012 InfiniBand to FC gateway This chapter describes how to configure booting from external storage attached through a high-speed InfiniBand fabric. Boot from InfiniBand (BoIB) allows diskless servers to mount the boot device located on an external RAID-capable storage subsystem attached to Fibre Channel SAN. When a blade server with 4x InfiniBand HCA is initialized, it executes the BoIB firmware.
6.1 Benefits of booting from high-speed InfiniBand

The BoIB feature on the IBM BladeCenter blade servers provides high availability, reduces cabling complexity, minimizes the downtime windows, and helps consolidate the IT infrastructure. The BoIB feature allows for deployment of diskless servers and exploits the 10 Gbps bandwidth available with the BC-H solution.
6.2.1 Pre-configuration checklist Before proceeding to implement the function, ensure that the following tasks are completed. Installing hardware To set up the physical environment for your SAN boot, perform the following high-level steps: 1. Install the HCA expansion card in the blade server. 2. Install the high-speed InfiniBand switch into bay 7 of the BladeCenter H chassis. 3. Connect the high-speed InfiniBand switch in bay 7 to the 4X InfiniBand port on the SFS3012 server switch. 4.
Look for the .Boot extension in the description as shown in Figure 6-2 to confirm that Boot over InfiniBand is enabled. [root@localhost sbin]# ./tvflash -i HCA #0: MT25208 Tavor Compat, BC2 HSDC, revision A0 Primary image is v4.7.600 build 3.2.0.106, with label 'HCA.HSDC.A0.Boot' Secondary image is v4.7.600 build 3.2.0.106, with label 'HCA.HSDC.A0.
2. From the Element Manager UI, select Fibre Channel → Storage Manager. Select the SRP Hosts folder in the left-hand pane as shown in Figure 6-4. Figure 6-4 Define a New SRP Host from the TS3012 Storage Manager menu 3. Click Define New. The define New SRP Host Windows displays as shown in Figure 6-5. Figure 6-5 Select the host GUID from the drop-down list 4. From the drop-down menu, select the GUID of the blade that you recorded in step 1 on page 98 as shown in Figure 6-5. 5.
6. Click Next. The WWNN is assigned to the SRP host shown in Figure 6-6.
7. Select Finish to complete the SRP host creation process as shown in Figure 6-7. Figure 6-7 Define New SRP Host menu Chapter 6.
The new SRP host Blade 10 (Linux BoIB) that you just created is now listed under the SRP Hosts folder as shown in Figure 6-8.

Figure 6-8 Define New SRP Host menu

6.2.4 Configuring Fibre Channel SAN

This section discusses the Fibre Channel SAN configuration process. The SFS3012 Gateway and the IBM System Storage DS4000™ storage subsystem are connected to a Cisco MDS9124 FC switch. For a detailed procedure on implementing zoning, refer to the MDS9000 Switch User Guide.
To configure the Fibre Channel SAN, follow these steps:
1. Configuring the Fibre Channel SAN for end-to-end connectivity implies that the WWPNs of the blade, the storage, and the Fibre Channel gateway are listed in the Name Server, as shown in Figure 6-9. A zoning sketch for the MDS switch follows below.
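Zoning on the MDS9124 uses the standard MDS CLI. A minimal single-zone sketch (the zone and zoneset names, the VSAN number, and the WWPNs are illustrative placeholders for the gateway and storage ports):

mds9124# configure terminal
mds9124(config)# zone name blade10_boib vsan 1
mds9124(config-zone)# member pwwn 20:01:00:05:ad:20:00:01
mds9124(config-zone)# member pwwn 20:04:00:a0:b8:17:44:32
mds9124(config-zone)# exit
mds9124(config)# zoneset name boib_set vsan 1
mds9124(config-zoneset)# member blade10_boib
mds9124(config-zoneset)# exit
mds9124(config)# zoneset activate name boib_set vsan 1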
6.2.5 Configuring storage

Tip: The storage configuration procedure is different for different vendor products, but the concept of host definition and LUN masking is generally the same. For detailed instructions about storage configuration, refer to the corresponding storage subsystem user guide.

To configure storage on the DS4000 storage subsystems, perform the following steps:
1. Define a Host Group.
2. Define a Host.
3. Define a Host Port.
4. Create a Logical Drive.
5. Map the Logical Drive.
6.2.6 Discovering storage If the FC zoning, host definition and LUN masking are configured correctly, then the target ports and the boot LUN is discovered from the SFS3012 storage configuration menu. From the Storage configuration, select the new initiator under the SRP Hosts folder (Blade10 in our example) and then go to the Targets tab. Verify that the WWPN of the target onto which you want to install the image displays in the list as shown in Figure 6-12.
Figure 6-13 Default host access policies 6.2.8 Port masking This section covers port masking on the SFS3012 gateway. The port masking feature is used to permit or deny host access through the port. Follow these steps: 1. Select the SRP initiator listed in the SRP Hosts folder, then go to the Targets tab (Figure 6-14 on page 107). 2. Verify that the WWPN of the target controller or controllers are listed.
Figure 6-14 SRP Host: List target devices accessible 3. Double-click any of the target controller WWPN ports through which the BoIB image will be installed. The ITL Properties window opens as shown in Figure 6-15. Notice that the default value is to permit access through ports 6/1 and 6/2. Figure 6-15 Configure port masking for the SRP host Chapter 6.
4. Click the button shown in Figure 6-15 to open the Select Ports menu that gives the option to allow or to deny access to the SRP host through one or more ports. Add check marks to only one of the two ports as shown in Figure 6-16 and click OK to save the changes. Note: It is critical that a single and unique path is enabled from the host to the boot LUN for the initial operating system installation to complete successfully.
6.2.9 Discovering the boot LUN This section covers the procedure to discover the boot LUN from the Storage Manager menu of the SFS3012 FC Gateway. Follow these steps: 1. Select the SRP initiator listed in the SRP Hosts folder, then go to the Targets tab. 2. Click Discover LUNs (Figure 6-18). 3. Select the target port and the LUN on which the boot image will be installed and click Add. Then, click Apply to save the change. Figure 6-18 Accessible LUNs menu Chapter 6.
4. Verify that the Boot LUN ID is 0 by selecting the boot LUN that was added in the preceding step and selecting ITL properties. Figure 6-19 opens with an SRP LUN ID=0. Figure 6-19 Verify Boot LUN ID = 0 6.3 Installing the OS on the Fibre Channel storage This section covers some key configuration steps for installing Red Hat Enterprise Linux 4 Update 3 x64 on the HS21 blade booting over InfiniBand through the NFS Server.
The blade boot sequence can be configured from the Advanced Management Module Web interface. Follow these steps: 1. Configure the blade server to boot over the network using the BladeCenter Advanced Management Module by clicking Blade Tasks → Configuration as shown in Figure 6-21. Figure 6-21 Verify boot device order from the Advanced Management Module Web interface Chapter 6.
2. Change specific servers by clicking the server name. Figure 6-22 opens, and you can select the boot order. Verify that the first boot device is selected as Network PXE.
3. Upon power on or restart, the service initializes into the Boot over InfiniBand firmware menu shown in Figure 6-23. Ensure that option 2 “Well Known boot service” (the default value) is selected. Then select option s to save the configuration and select x to exit the InfiniBand firmware menu. InfiniBand boot driver v3.2.0 build 106 (c)Copyright Topspin Communications, Inc. 2003-2005 Type 'x' to configure boot options Node GUID: 0005ad0000050724 Waiting for SM to configure ports...........................
4. The blade power-on initialization process proceeds to look for a DHCP server using PXE. Upon successful discovery of DHCP server, it allows you to select the appropriate operating system to install. The installation menu in Figure 6-24 is a custom menu and can also be implemented differently in your environment. At the boot prompt, we selected RHEL 4 U3 x86_64. Welcome to SC Lab Network Installer! Enter number of the Operation System you wish to install: 0. Local Machine 20. 21. 22. 23. 24. 25. 26.
8. If there was a previous image installed on the boot disk, then it is highly recommended that you remove all the partitions as shown in Figure 6-26. Click Next. Figure 6-26 Remove all partitions on the system 9. Select Yes to confirm deleting all the partitions. Chapter 6.
10.Figure 6-27 shows the storage configuration and confirms that the Fibre Channel-attached boot LUN is accessible by the host for read and write operations. Figure 6-27 Disk Partitions on the boot LUN through automatic storage partitioning utility 11.Continue with the remainder of the installation process.
7 Chapter 7. Configuring the Cisco 3012 InfiniBand to Ethernet gateway

This chapter discusses the Cisco 3012 InfiniBand to Ethernet Gateway Module that is available for use in the Cisco SFS 3000 platforms. We discuss the basics of how the product works and then cover basic usage and more advanced topics, such as using VLANs, EtherChannel, and gateway redundancy. The topics that we discuss in this chapter are:
- 7.1, “Introduction to the Ethernet gateway module” on page 118
- 7.2
7.1 Introduction to the Ethernet gateway module

The optional InfiniBand to Ethernet gateway module provides a way for InfiniBand clients to talk to devices on an Ethernet LAN, and vice versa. This module is considered a transparent L2 device, in that the clients do not need to be aware they are using it. Physically it offers two inward facing 4X (10G) InfiniBand ports, and six outward facing 10/100/1000 Ethernet ports.
Features and characteristics of the Ethernet gateway module include:
- Two inward facing 4X (10G) InfiniBand ports
- Six outward facing RJ45 10/100/1000 Ethernet ports with MDI/MDI-X support
- Can be installed in any of the expansion slots of the SFS 3012
- Hot-swappable
- Jumbo frame support built in: 9 K, Ethernet to InfiniBand; 2 K, InfiniBand to Ethernet
- Supports up to 32 VLANs, both static VLANs and 802.1Q tagged VLANs
The 3012 (and thus the Ethernet gateway module) supports four different options for configuring:
- CLI
- GUI - http
- GUI - Element Manager
- SNMP

LED meanings at the gateway module level:
- Green - InfiniBand Module Status Indicator: On - module active; Off - module inactive
- Yellow - InfiniBand Attention Indicator: On - attention required (something is wrong); Off - all OK

Port level LED meanings:
- Dual color LED: Off - no signal; Solid Yellow - port disabled; Flashing Yellow - port fault
7.1.2 Some general items to consider

Some general topics that are important to those working with this product include:

Use of the GUI or CLI interface
Either the GUI or the CLI or both can be used to configure the gateway module. In general, the GUI (either the Element Manager or the built-in http support) is good for one-off configurations or for use by those that are not familiar with the CLI.
You can find more information about understanding and configuring the Ethernet gateway module in documentation that is available at the following links:
- Cisco SFS 3000 series documentation home: http://www.cisco.com/en/US/products/ps6422/index.html
- Cisco SFS InfiniBand Ethernet Gateway User Guide: http://www.cisco.com/en/US/products/ps6422/products_user_guide_list.html
- Cisco SFS InfiniBand Redundancy Configuration Guide: http://www.cisco.
After you have configured these two clients and they can ping each other, you can proceed to configuring the Ethernet gateway module, examples of which can be found in the following sections. 7.3 Implementing a simple bridge group connection to an upstream Ethernet network In this section, we demonstrate the ability to ping from an InfiniBand server to an IP address on the upstream Ethernet switch, using a single Ethernet connection out of the gateway module.
Cisco 4948 Ethernet Switch, the 4948 will put any untagged packet onto VLAN 88 inside the 4948). Assuming the client and other components are already configured, all we have to do is create a bridge group with the desired options in the gateway module to permit packets to flow. Note that a bridge group is a logical entity that allows the gateway to pass InfiniBand frames to the Ethernet network.
4. Commit the new Bridge Group information and review (step 4 on page 129). 5. Save the 3012 config to NVRAM (step 5 on page 130). 6. Configure the upstream 4948 to match the Ethernet Gateway configuration (step 6 on page 130). 7. Test the configurations by pinging from the InfiniBand client to the default gateway on the 4948 (step 7 on page 131). 7.3.4 Detailed steps to implement a simple bridge group design To implement a simple bridge group design, follow these detailed steps: 1.
Assuming a fresh installation, Figure 7-6 opens. Figure 7-6 Bridge group page with no bridge groups 2. Click Add to create your first bridge group with the following characteristics: – The Ethernet port that we use is 4/1. This port is untagged because we want to send untagged packets to our upstream switch. – The InfiniBand port is 4/1 in this example and uses the default p_key, ff:ff. Tip: If only using a single Ethernet Gateway, you should use InfiniBand port 2 for connection to the InfiniBand fabric.
• Delayed proxy ARP transaction: By default, the delayed proxy ARP transaction feature is always active. This feature is an extension of the self-cancelling ARP request and comes into play when a duplicate ARP request is delayed. – Setting Loop Protection to one allows for ARP packet painting, an extra loop protection mechanism.
3. After making changes to the Add Bridge Group Groups tab, you need to configure the Forwarding tab and the Subnet Tab as shown in Figure 7-8 and Figure 7-9 on page 129 Figure 7-8 Forwarding tab on Add Bridge Group window In the example in Figure 7-8, you set up a default gateway for this bridge group to use on the upstream Ethernet network. Some comments about the settings in the Forwarding tab: – Created by clicking the Add button in this tab (not the Add button at the bottom of the page).
Figure 7-9 Subnet tab on Add Bridge Group window

In the example in Figure 7-9, we define the subnet for this bridge group. Some comments on these settings:
- 172.16.225.0 is the subnet being used for this bridge group (based on our using a 24-bit mask of 255.255.255.0).
- Because we are using a 24-bit mask, this is the prefix length for this subnet.
- In our example, we only have a single IP subnet using this bridge group, and that subnet is what we entered into the Subnet tab.
5. The final step on the 3012 is to save this configuration to NVRAM. Note: As mentioned previously, all changes done through the GUI take effect immediately and are placed in the running config, but the changes in the running config are not explicitly saved to NVRAM until you perform this step. From the main page of the Element Manager, click Maintenance → Save Config to save the configuration to NVRAM. 6.
After logging into the 4948 with sufficient privileges, execute the commands as shown in Example 7-2 on the 4948. Items starting with an exclamation mark (!) are for reference or comment only and are not executed.
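A sketch of what the Example 7-2 commands look like for this design, assuming (as in the later examples in this chapter) that port gi1/16 faces the gateway and that the 4948 carries 172.16.225.250 on interface VLAN 88:

! Assumes starting from enable mode, enter config mode
conf t
! Put the port facing gateway port 4/1 into access VLAN 88 (untagged)
int gi1/16
description Connection to 3012 Ethernet Switch Module port 4/1
switchport mode access
switchport access vlan 88
exit
! Give the 4948 an IP presence on VLAN 88 to serve as the ping target
int vlan 88
ip address 172.16.225.250 255.255.255.0
no shut
end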
7.3.5 CLI reference for this section

This section shows what the relevant CLI looks like after the GUI steps to create this bridge group have been executed, and after the changes made to the 4948. It is a useful reference for those who want to understand the CLI better, or for those who simply want to use the CLI rather than the GUI to accomplish this task. Example 7-4 is the CLI of the 3012 after bridge group BG1 was created and is fully operational.
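Based on the tagged-VLAN output shown later in Example 7-8, the relevant portion of this listing can be sketched as follows. Because this bridge group sends untagged packets, we assume that no vlan-tag line appears under the Ethernet interface; treat this as an approximation rather than verbatim output:

3012-1# show config
...
bridge-group 1 subnet-prefix 172.16.225.0 24
bridge-group 1 ib-next-hop 172.16.225.250
bridge-group 1 name "BG1"
!
interface gateway 4/1
bridge-group 1 pkey ff:ff
!
interface Ethernet 4/1
bridge-group 1
!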
7.4.1 Some comments on VLANs and the Ethernet gateway module

When discussing VLANs and the Ethernet gateway module, remember the following points:
– Each Ethernet gateway module supports up to 32 VLANs.
– Ethernet bridge module ports can be tagged or untagged.
– Standard 802.1Q VLAN tagging is used.
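At the CLI level, tagging is expressed with the vlan-tag keyword on the Ethernet interface of the bridge group (this line appears in Example 7-8 later in this section). As an illustration of the up-to-32-VLANs point, two bridge groups tagged on different VLANs could share one physical port; the second bridge-group line below is our assumption based on the same syntax, not output taken from a device:

interface Ethernet 4/1
bridge-group 1 vlan-tag 88
bridge-group 2 vlan-tag 89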
7.4.2 Summary of steps to create a tagged VLAN bridge group design

Here is a summary of the steps, described in the remainder of this section, to implement a tagged VLAN bridge group design:
1. In the 3012 Element Manager, bring up the Bridging window and fill in the Groups, Forwarding, and Subnet tabs (step 1 on page 134).
2. Commit the new bridge group information and review it (step 2 on page 135).
3. Save the 3012 config to NVRAM (step 3 on page 135).
4. Configure the upstream 4948 to match the Ethernet gateway configuration.
2. After you have configured all of the tabs of the Add Bridge Group window, click Add to create the new bridge group BG1 and make it active. Figure 7-13 shows the bridge group with tagged VLAN 88 selected and ready for use.

Figure 7-13 BG1 ready to use tagged packets on VLAN 88

3. After logging in to the 4948 with sufficient privileges, execute the commands shown in Example 7-6 on the 4948 (items starting with an exclamation mark (!) are for reference or comment only and are not executed).

Example 7-6 Configure the 4948 to accept packets tagged with VLAN 88
! Assumes starting from enable mode, enter config mode
conf t
! Change int gi1/16 to support tagging for VLAN 88
int gi1/16
description Connection to 3012 Ethernet Switch Module port 4/1 - VLAN 88
! Tell the port to use 802.1Q tagging and carry VLAN 88 (typical
! IOS trunk commands shown; verify against your IOS version)
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 88
switchport mode trunk
! Exit config mode
end
! Save config to NVRAM
write
Example 7-8 is the CLI of the 3012 after creating the new BG1 that uses VLAN tagging on VLAN 88.

Example 7-8 3012 - after VLAN 88 tagging added
3012-1# show config
...
bridge-group 1 subnet-prefix 172.16.225.0 24
bridge-group 1 ib-next-hop 172.16.225.250
bridge-group 1 name "BG1"
!
interface gateway 4/1
bridge-group 1 pkey ff:ff
!
interface Ethernet 4/1
bridge-group 1 vlan-tag 88
!

Example 7-9 is the CLI of the 4948 after the changes to support tagged VLAN 88 packets.
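Because Example 7-9 is essentially the show run view of the commands entered in Example 7-6, the relevant portion can be sketched as follows (assumed, not verbatim output):

4948-1# show run
...
interface GigabitEthernet1/16
description Connection to 3012 Ethernet Switch Module port 4/1 - VLAN 88
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 88
switchport mode trunk
...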
Seven different types of frame distribution algorithms are supported with this product, as shown in Table 7-1.

Table 7-1 Load balancing options for link aggregation
Distribution     Description
dst-ip           Load distribution is based on the destination IP address.
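Table 7-1 describes the gateway module side of the link. On the 4948 side, the corresponding EtherChannel distribution algorithm is set with a global command. A minimal sketch, assuming IOS on the 4948 (verify the supported keywords on your IOS version):

! Distribute frames across the channel based on source and
! destination IP address
conf t
port-channel load-balance src-dst-ip
end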
Aggregation must be configured on both sides of the links (in the Ethernet gateway module and in the 4948). If you configure only one side, it might appear to work, but unexpected or undesired results are likely. Figure 7-14 shows the configuration that we use to demonstrate aggregation.

Figure 7-14 Aggregation test environment (EGW = Ethernet Gateway Module; BG1 = Bridge Group 1; blue = InfiniBand, green = Ethernet; 4948 access port on VLAN 88, interface VLAN 88 at 172.16.225.250)
8. Re-attach or bring up the links between the 3012 and the 4948 (step 8 on page 143).
9. Test the configurations by pinging from the InfiniBand client to the default gateway on the 4948, and remove links from the trunk one at a time to ensure that the trunk is working (step 9 on page 143).

7.5.3 Detailed steps to implement a trunked bridge group design

To implement a trunked bridge group design, follow these detailed steps:

1.
4. Click Insert, select a trunk group number to assign (trunk group 1 in this example), give it a name (TG1 in this example), and then click the option to select port members and select the port member numbers (4/1 and 4/2 in this example). See Figure 7-17.

Tip: Port-channel numbers and trunk group numbers are locally significant only. In other words, when creating aggregations, the port-channel number on the upstream switch and the trunk group number on the gateway module can be different on each side of the link.
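To illustrate the tip, the channel-group number chosen on the 4948 does not need to match trunk group 1 on the gateway module. A sketch, assuming IOS (the number 5 here is arbitrary):

! Put both 4948 ports into port-channel 5; this still aggregates
! correctly with trunk group 1 (TG1) on the gateway module
conf t
int range gi1/16 - 17
channel-group 5 mode on
end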
5. After configuring the trunk group (TG1) to use gateway module Ethernet ports 4/1 and 4/2, re-create the bridge group BG1 and assign trunk group TG1 as its Ethernet member. Follow the steps starting at step 1 on page 125, but instead of selecting a physical port for the Ethernet connection, select trunk 1 as the Ethernet port, as shown in Figure 7-20. Remember to also configure the Forwarding and Subnet tabs per the instructions starting at step 1 on page 125.
7. After the 3012 configuration is saved, you need to configure the 4948. In this example, we use ports gi1/16 and gi1/17 on the 4948 to connect to TG1 on the Ethernet gateway module. On the 4948 side, we use static aggregation because this is the only mode that works correctly with the Ethernet gateway module; a sketch of this configuration follows. Contact your network engineer if you are not sure what you need to do for this to work.
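A minimal sketch of this static aggregation on the 4948, assuming ports gi1/16 and gi1/17 and the untagged access-VLAN design used earlier in this chapter (the port-channel interface is created automatically when the channel-group command is entered):

! Assumes starting from enable mode, enter config mode
conf t
! Configure both physical ports identically and bind them into a
! static (mode on) channel group - no PAgP or LACP negotiation
int range gi1/16 - 17
description Connection to 3012 Ethernet gateway module TG1
switchport access vlan 88
switchport mode access
channel-group 1 mode on
! Exit config mode
end
! Save config to NVRAM
write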
a time) and make sure that pings continue to work, regardless of whether one cable or both cables are plugged in. In Example 7-11, we lost a total of three pings while pulling and re-adding cables: ping packet 9 was lost when we pulled the first cable, and ping packets 21 and 22 were lost when we reinserted the first cable and pulled the second cable.

Example 7-11 InfiniBand host pinging the default gateway while cables are being pulled and re-added
[root@localhost ~]# ping 172.16.225.250
PING 172.16.225.250 (172.16.225.250) 56(84) bytes of data.
Example 7-12 is the CLI of the 3012 after creating trunk group TG1 and adding it to bridge group BG1.

Example 7-12 3012 - after changing to untagged and adding trunk group
3012-1# show config
...
bridge-group 1 subnet-prefix 172.16.225.0 24
bridge-group 1 ib-next-hop 172.16.225.250
7.6 Implementing a design using gateway redundancy

In many cases, customers want to make sure that the InfiniBand hosts have High Availability (HA) connectivity to the outside world. One element of this HA environment is redundant Ethernet gateway modules to the Ethernet network.

7.6.1 Some considerations for redundancy

When implementing a design using gateway redundancy, remember the following considerations:
– In this example, both gateway modules are in the same 3012.
Figure 7-22 represents the test environment for demonstrating gateway redundancy. In this example, bridge group 1 (BG1) and bridge group 2 (BG2) are put into Redundancy Group 1 (RG1).

Figure 7-22 Gateway redundancy test environment (Cisco 4948 Ethernet Switch, interface VLAN 88 at 172.16.225.250)
6. Create the redundancy group using the two previously created bridge groups (step 6 on page 151).
7. Save the 3012 config to NVRAM (step 7 on page 152).
8. Configure the upstream 4948 to match the Ethernet Gateway configuration (step 8 on page 152).
9. Re-attach or bring up the links between the 3012 and the 4948 (step 9 on page 153).
10. Test the configuration by pinging from the InfiniBand client to the default gateway on the 4948 and verifying that the ping continues to work when one gateway module fails over to the other.
4. When the mgmt-ib interface is configured, proceed to create BG1 and BG2. Note that for redundancy to work, the bridge groups must be configured similarly: the same p_key on the InfiniBand side and the same VLAN (or untagged) on the upstream side. To begin the process of creating bridge groups BG1 and BG2, use the example starting at step 1 on page 125 and create BG1; then repeat the process to create bridge group BG2, using ID 2, name BG2, Ethernet port 8/1, and InfiniBand port 8/2.
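The mgmt-ib interface configuration that this step refers to corresponds to the following 3012 CLI lines, which also appear in Example 7-17 at the end of this section (192.168.0.1 is the address used in this example):

interface mgmt-ib
ip address 192.168.0.1 255.255.255.0
no shutdown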
Figure 7-24 Bridge group BG2 ready to be added

When BG1 and BG2 are added, the Bridging window should look as it does in Figure 7-25.
5. After BG1 and BG2 have been created, close the Bridging window, and from the main window of the Element Manager, click Ethernet → Redundancy. This opens a window similar to Figure 7-26.

Figure 7-26 Redundancy Groups window before a redundancy group has been created

6. Click Add in the Redundancy Groups window to begin creating the redundancy group. This opens the Add Redundancy Group window (Figure 7-27). Select the ID (the default is 1), and give the group a name (RG1 in this example).
Click Apply to create the redundancy group, as shown in Figure 7-28.

Figure 7-28 Redundancy Group RG1 created and ready for operation

7. The final step on the 3012 is to save this config to NVRAM.

Note: As mentioned previously, all changes done through the GUI take effect immediately and are placed in the running config, but the changes in the running config are not explicitly saved to NVRAM until you perform this step.
! Exit config mode
end
! Save config to NVRAM
write

Tip: Example 7-15 is for a switch running IOS. If the upstream switch is running CatOS, the configuration would be different; CatOS configuration examples are not included in this document.

9. After the changes have been made to both the Ethernet gateway module and the 4948, re-attach the cables or re-enable the ports.

10. To verify that redundancy is working, start a looping ping from the InfiniBand client to the default gateway IP address (172.16.225.250).
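As a sketch of this verification, assuming the Linux InfiniBand client used earlier in this chapter, run the ping continuously and watch for a brief interruption while the redundancy group fails over from one gateway module to the other:

# From the InfiniBand client, ping the default gateway continuously
[root@localhost ~]# ping 172.16.225.250
# While the ping runs, disable the active gateway module (or pull its
# cable) and confirm that replies resume after a short interruption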
Note that we include only the commands that affect the area of the device that we configured; we do not include the entire device configuration. Example 7-17 is the CLI of the 3012 after the changes made in this section.

Example 7-17 3012 - after redundancy added
3012-1# show config
...
interface mgmt-ib
ip address 192.168.0.1 255.255.255.0
no shutdown
bridge-group 1 subnet-prefix 172.16.225.0 24
bridge-group 1 ib-next-hop 172.16.225.250
bridge-group 1 name "BG1"
bridge-group 2 subnet-prefix 172.16.225.0 24
Example 7-19 is the CLI of the 4948 after the changes to support this configuration.

Example 7-19 4948 - after support for redundancy added
4948-1# show run
...
!
interface GigabitEthernet1/16
description Connection to 3012 Ethernet Switch Module port 4/1 and 8/1
switchport access vlan 88
switchport mode access
spanning-tree portfast
!
interface GigabitEthernet1/17
description Connection to 3012 Ethernet Switch Module port 4/1 and 8/1
switchport access vlan 88
switchport mode access
spanning-tree portfast
...
7.7.2 For the 4948

Table 7-3 shows some useful commands for supporting an upstream Cisco IOS device (a 4948 in our example).

Table 7-3 IOS commands
Command           Description
show run          Shows the current running configuration of the 4948.
show int status   Shows a snapshot of all of the ports on the system and their status (connected/not connected, VLAN in use, speed, and so forth).
show int trunk    Shows information and status for any ports configured as 802.1Q trunks.
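For example, a quick verification session after the changes in this chapter might use the commands in Table 7-3, plus show etherchannel summary, an IOS command that is useful for checking the static channel group although it is not listed in the table:

! Check overall port status and VLAN membership
show int status
! Confirm 802.1Q trunk ports and the VLANs they carry
show int trunk
! Confirm that the aggregated ports are bundled into the port channel
show etherchannel summary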
Abbreviations and acronyms
AC      alternating current
AMD     Advanced Micro Devices™
API     application programming interface
ARP     Address Resolution Protocol
ASP     active server page
ATM     asynchronous transfer mode
BG      bridge group
BIOS    basic input output system
BMC     baseboard management controller
BTH     base transport header
HA      high availability
HBA     host bus adapter
HCA     host channel adapter
HDD     hard disk drive
HPC     high performance computing
HSFF    high-speed form factor
I/O     input/output
IB      InfiniBand
IBM     International Business Machines Corporation
MSI     Microsoft Installer
MTU     maximum transfer unit
NEBS    network equipment building system
NFS     network file system
NGN     next-generation networks
NIC     network interface card
NVRAM   non-volatile random access memory
TOE     TCP offload engine
TX      transmit
UI      user interface
URL     Uniform Resource Locator
USB     universal serial bus
VCRC    variant cyclic redundancy check
VL      virtual lane
VLAN    virtual LAN
VNIC    virtual network interface card
VOIP    Voice over Internet Protocol
WAN     wide area network
Related publications

We consider the publications listed in this section particularly suitable for a more detailed discussion of the topics that we cover in this paper.

IBM Redbooks

You can search for, view, or download books, papers, Technotes, draft publications, and additional materials, as well as order hardcopy Redbooks, at the IBM Redbooks Web site:

ibm.
Online resources

These Web sites are also relevant as further information sources:

IBM Web sites

IBM ServerProven
http://www.ibm.com/servers/eserver/serverproven/compat/us/

QLogic InfiniBand Fibre Channel Bridge Module firmware update 4.1.0.2.2
http://www.ibm.com/support/docview.wss?uid=psg1MIGR-5069861

QLogic InfiniBand Fibre Channel Bridge Module firmware update 3.3.0050.0
http://www.ibm.com/support/docview.