Configuration manual
E-Series ExaScale Debugging and Diagnostics | 1247
In a dual-RPM system, the two RPMs exchange synchronization messages via inter-RPM communication
(IRC). As described in the High Availability chapter, an RPM failover can be triggered by loss of the
heartbeat (similar to a keepalive message) between the two RPMs. FTOS reports this condition via syslog
messages, as shown in Message 8.
FTOS automatically saves critical information about the IRC failure to NVRAM. Use the same three-step
procedure to capture this file for analysis by Dell Force10.
FTOS saves up to three persistent files, depending on the type of failure. When reporting an
RPM failover triggered by a loss of the IPC or IRC heartbeats, look for failure records in the following
directories:
• Application or kernel core dump for the RP in the CORE_DUMP_DIR
• CP trace log file (look for a filename with the phrase “failure_trace”) in the TRACE_LOG_DIR
• RP and/or CP sysinfo file in the CRASH_LOG_DIR, as explained above
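Once the files have been copied off the chassis, the three locations listed above can be sorted quickly before opening a support case. The following is a minimal sketch in Python, assuming the files were copied to a workstation with their directory names preserved. The function name triage and the sample file names are hypothetical; only the directory names and the "failure_trace" phrase come from this manual.

```python
from pathlib import PurePosixPath

# Directory names taken from the list above; the labels are illustrative only.
CATEGORIES = {
    "CORE_DUMP_DIR": "core dump",
    "TRACE_LOG_DIR": "trace log",
    "CRASH_LOG_DIR": "sysinfo",
}


def triage(paths):
    """Group copied file paths by the failure-record directory they came from.

    Files under TRACE_LOG_DIR are kept only if their name contains the
    phrase "failure_trace", as the manual suggests.
    """
    found = {label: [] for label in CATEGORIES.values()}
    for raw in paths:
        path = PurePosixPath(raw)
        for dirname, label in CATEGORIES.items():
            if dirname not in path.parts:
                continue
            if dirname == "TRACE_LOG_DIR" and "failure_trace" not in path.name:
                continue
            found[label].append(raw)
    return found


# Hypothetical file names, used only to exercise the function.
EXAMPLE_PATHS = [
    "CORE_DUMP_DIR/example_core.gz",
    "TRACE_LOG_DIR/example_failure_trace.txt",
    "TRACE_LOG_DIR/example_other_trace.txt",
    "CRASH_LOG_DIR/example_sysinfo.txt",
]

if __name__ == "__main__":
    for label, files in triage(EXAMPLE_PATHS).items():
        print(label, files)
```

An empty list for a category only means that no matching file was found in the paths supplied; it does not prove that FTOS saved nothing for that failure type.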
Software debugging commands
FTOS supports an extensive suite of debug commands for troubleshooting specific problems while
working with Dell Force10 technical support staff. All debug commands are entered in EXEC Privileged
mode. See the FTOS Command Reference for details.
Hardware debugging commands
The hardware commands show control-traffic and show ipc-traffic display the control traffic and IPC traffic
exchanged between the line card and the RPM over the party-bus path.
When viewing the output of the commands, note that Ingress refers to the path from the line card to the
RPM; Egress refers to the path from the RPM to the line card.
The ingress and egress keywords provide a general overview of the packet counters and packet drops for
the IPC or control traffic (top of Figure 63-11 and Figure 63-12).
The other keywords select the component whose information is displayed. Note that the ipc-traffic
command does not have an ACL-FPGA option. See the FTOS Command Reference for details regarding
the keywords and their meanings.
Message 8 RPM Failover Message
20:29:07: %RPM1-S:CP %IRC-4-IRC_WARNLINKDN: Keepalive packet 7 to peer RPM is lost
20:29:07: %RPM1-S:CP %IRC-4-IRC_COMMDOWN: Link to peer RPM is down
%RPM1-S:CP %RAM-4-MISSING_HB: Heartbeat lost with peer RPM. Auto failover on heart beat lost.
%RPM1-S:CP %RAM-6-ELECTION_ROLE: RPM1 is transitioning to Primary RPM.
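When syslog output is collected on a remote server, the failover messages shown in Message 8 can be picked out programmatically. The following is a minimal sketch in Python, assuming the line layout seen in the four sample lines above (an optional timestamp, a source tag, then a FACILITY-SEVERITY-MNEMONIC code). The pattern is inferred from those samples only and is not an official FTOS message grammar; the names parse_line, failover_events, and SyslogRecord are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the Message 8 sample lines; other FTOS messages
# may not match it.
LINE_RE = re.compile(
    r"^(?:(?P<time>\d{2}:\d{2}:\d{2}):\s+)?"      # optional HH:MM:SS timestamp
    r"%(?P<source>\S+)\s+"                         # e.g. RPM1-S:CP
    r"%(?P<facility>[A-Z0-9]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):\s*"
    r"(?P<text>.*)$"
)

# Mnemonics that appear in Message 8.
FAILOVER_MNEMONICS = {"IRC_WARNLINKDN", "IRC_COMMDOWN", "MISSING_HB", "ELECTION_ROLE"}


class SyslogRecord(NamedTuple):
    time: Optional[str]
    source: str
    facility: str
    severity: int
    mnemonic: str
    text: str


def parse_line(line):
    """Return a SyslogRecord, or None if the line does not fit the pattern."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return SyslogRecord(
        time=match.group("time"),
        source=match.group("source"),
        facility=match.group("facility"),
        severity=int(match.group("severity")),
        mnemonic=match.group("mnemonic"),
        text=match.group("text"),
    )


def failover_events(lines):
    """Keep only records whose mnemonic appears in Message 8."""
    records = (parse_line(line) for line in lines)
    return [r for r in records if r is not None and r.mnemonic in FAILOVER_MNEMONICS]


# The four lines of Message 8, copied verbatim.
SAMPLE = [
    "20:29:07: %RPM1-S:CP %IRC-4-IRC_WARNLINKDN: Keepalive packet 7 to peer RPM is lost",
    "20:29:07: %RPM1-S:CP %IRC-4-IRC_COMMDOWN: Link to peer RPM is down",
    "%RPM1-S:CP %RAM-4-MISSING_HB: Heartbeat lost with peer RPM. Auto failover on heart beat lost.",
    "%RPM1-S:CP %RAM-6-ELECTION_ROLE: RPM1 is transitioning to Primary RPM.",
]

if __name__ == "__main__":
    for record in failover_events(SAMPLE):
        print(record.time, record.facility, record.severity, record.mnemonic)
```

Lines that do not match the pattern are skipped rather than raising an error, so the sketch can be run over a mixed log without first filtering it.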