S2600GZ and S2600GL

Platform Management Functional Overview Intel® Server Board S2600GZ/GL TPS
Command | Routed through command processor | Turns power on or off, or power cycle
Power state retention | Implemented by means of BMC internal logic | Turns power on when AC power returns
Chipset | Sleep S4/S5 signal (same as POWER_ON) | Turns power on or off
CPU Thermal | CPU Thermtrip | Turns power off
WOL (Wake On LAN) | LAN | Turns power on
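As an illustration only, the rows listed above can be held in a small lookup that answers which sources are able to perform a given power action. The names `PowerAction`, `POWER_CONTROL_SOURCES`, and `sources_that_can` are hypothetical and are not part of any BMC interface; only the row contents come from the table.

```python
from enum import Flag, auto


class PowerAction(Flag):
    """Power actions named in the table (hypothetical enum for illustration)."""
    ON = auto()
    OFF = auto()
    CYCLE = auto()


# source -> (signal or subsystem, capabilities), transcribed from the rows above
POWER_CONTROL_SOURCES = {
    "Command": ("Routed through command processor",
                PowerAction.ON | PowerAction.OFF | PowerAction.CYCLE),
    "Power state retention": ("BMC internal logic", PowerAction.ON),
    "Chipset": ("Sleep S4/S5 signal", PowerAction.ON | PowerAction.OFF),
    "CPU Thermal": ("CPU Thermtrip", PowerAction.OFF),
    "WOL (Wake On LAN)": ("LAN", PowerAction.ON),
}


def sources_that_can(action):
    """Return the sorted names of sources whose capabilities include `action`."""
    return sorted(
        name
        for name, (_signal, caps) in POWER_CONTROL_SOURCES.items()
        if action in caps
    )
```

For example, `sources_that_can(PowerAction.CYCLE)` lists only the command interface, since no other row in this part of the table mentions a power cycle.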
6.4 BMC Watchdog
The BMC FW is increasingly called upon to perform time-critical system functions; failure to provide these
functions in a timely manner can result in system or component damage. Intel®
S1400/S1600/S2400/S2600/S4600 Server Platforms introduce a BMC watchdog feature that safeguards
against this scenario by providing an automatic recovery mechanism. It can also provide automatic
recovery of functionality that has failed because of a fatal FW defect triggered by a rare sequence of events,
or because of a BMC hang caused by some type of HW glitch (for example, a power glitch).
This feature comprises a set of capabilities whose purpose is to detect misbehaving subsections of BMC
firmware, the BMC CPU itself, or HW subsystems of the BMC component, and to take appropriate action to
restore proper operation. The action taken is dependent on the nature of the detected failure and may result in
a restart of the BMC CPU, one or more BMC HW subsystems, or a restart of malfunctioning FW subsystems.
The BMC watchdog feature allows at most three resets of the BMC CPU (such as a HW reset) or of the entire FW
stack (such as a SW reset) before giving up and remaining in the uBOOT code. This count is cleared when
power to the BMC is cycled, or when the BMC has operated continuously for more than 30 minutes without a
watchdog-generated reset. The BMC FW logs a SEL event indicating that a watchdog-generated
BMC reset (either soft or hard) has occurred. This event may be logged after the actual reset has
occurred. Refer to the sensor section for the related sensor definition. The BMC also indicates a
degraded system status on the Front Panel Status LED after a BMC HW reset or FW stack reset. This state
(which follows the state of the associated sensor) is cleared upon system reset or (AC or DC) power cycle.
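The reset-limiting rule described above can be sketched as a small state machine. This is a minimal model under stated assumptions, not the BMC firmware's implementation: the class `ResetLimiter`, its method names, and the returned strings are hypothetical, and the sketch assumes that the fourth watchdog trigger is the one that leaves the BMC in the boot loader. Only the limit of three resets and the 30-minute clearing interval come from the text.

```python
MAX_WATCHDOG_RESETS = 3        # from the text: at most three resets
CLEAR_AFTER_SECONDS = 30 * 60  # from the text: more than 30 minutes without a reset


class ResetLimiter:
    """Hypothetical model of the watchdog-generated reset counter."""

    def __init__(self):
        self.count = 0
        self.last_reset_time = None

    def on_power_cycle(self):
        # Cycling power to the BMC clears the count.
        self.count = 0
        self.last_reset_time = None

    def _clear_if_stable(self, now):
        # More than 30 minutes of continuous operation since the last
        # watchdog-generated reset also clears the count.
        if (self.last_reset_time is not None
                and now - self.last_reset_time > CLEAR_AFTER_SECONDS):
            self.count = 0
            self.last_reset_time = None

    def on_watchdog_trigger(self, now):
        """Decide what happens when the watchdog fires at time `now` (seconds)."""
        self._clear_if_stable(now)
        if self.count >= MAX_WATCHDOG_RESETS:
            return "stay_in_bootloader"  # assumed: give up and remain in uBOOT
        self.count += 1
        self.last_reset_time = now
        return "reset"
```

In this model, three triggers in quick succession each return `"reset"` and a fourth returns `"stay_in_bootloader"`, while a trigger that arrives more than 30 minutes after the previous reset starts again from a cleared count.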
Note: A reset of the BMC may result in the following system degradations that will require a system reset or
power cycle to correct:
1. Potentially incorrect ACPI Power State reported by the BMC.
2. Reversion of temporary test modes for the BMC back to normal operational modes.
3. FP status LED and DIMM fault LEDs may not reflect BIOS detected errors.
6.5 Fault Resilient Booting (FRB)
Fault resilient booting (FRB) is a set of BIOS and BMC algorithms and hardware support that allow a
multiprocessor system to boot even if the bootstrap processor (BSP) fails. Only FRB2 is supported using
watchdog timer commands.
FRB2 refers to the FRB algorithm that detects system failures during POST. The BIOS uses the BMC
watchdog timer to back up its operation during POST. The BIOS configures the watchdog timer to indicate that
the BIOS is using the timer for the FRB2 phase of the boot operation.
After the BIOS has identified and saved the BSP information, it sets the FRB2 timer use bit and loads the
watchdog timer with the new timeout interval.
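To make the "timer use" and "timeout interval" fields concrete, the sketch below builds the six request data bytes of the standard IPMI Set Watchdog Timer command (NetFn App 06h, command 24h) with the timer use set to BIOS FRB2 and a hard reset as the timeout action. The field layout follows the IPMI v2.0 specification; the function name and the 360-second timeout in the usage note are hypothetical, since this section does not state the value the BIOS actually programs.

```python
import struct

TIMER_USE_BIOS_FRB2 = 0x01         # timer use field [2:0] = 1h (BIOS FRB2)
TIMEOUT_ACTION_HARD_RESET = 0x01   # timeout action field [2:0] = 1h (hard reset)
CLEAR_FRB2_EXPIRATION_FLAG = 0x02  # bit 1 of the expiration-flags-clear byte


def build_set_watchdog_frb2(timeout_seconds):
    """Return the 6 request data bytes for Set Watchdog Timer (hypothetical helper)."""
    counts = round(timeout_seconds * 10)  # the countdown is in 100 ms units
    if not 0 <= counts <= 0xFFFF:
        raise ValueError("timeout does not fit the 16-bit countdown field")
    return struct.pack(
        "<BBBBH",
        TIMER_USE_BIOS_FRB2,         # byte 1: timer use
        TIMEOUT_ACTION_HARD_RESET,   # byte 2: timer actions, no pre-timeout interrupt
        0x00,                        # byte 3: pre-timeout interval in seconds
        CLEAR_FRB2_EXPIRATION_FLAG,  # byte 4: clear a stale FRB2 expiration flag
        counts,                      # bytes 5-6: initial countdown, LS byte first
    )
```

With an assumed timeout of 360 seconds, `build_set_watchdog_frb2(360).hex()` gives `01010002100e`. In IPMI the countdown only starts once a Reset Watchdog Timer command is issued after the timer has been set.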
If the watchdog timer expires while the timer use bit is set to FRB2, the BMC (if so configured) logs a
watchdog expiration event showing the FRB2 timeout in the event data bytes. The BMC then hard resets the
system, assuming the BIOS selected reset as the watchdog timeout action.
The BIOS is responsible for disabling the FRB2 timeout before initiating the option ROM scan and before
displaying a request for a boot password. If the processor fails and causes an FRB2 timeout, the BMC resets
the system.
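The FRB2 behavior described in this section can be walked through with a toy model: the BIOS arms the timer early in POST, the BMC logs an event and hard resets the system if the timer expires, and no reset occurs once the BIOS has disabled the timer before the option ROM scan or a password prompt. The class `Frb2Watchdog`, its methods, and the event strings are hypothetical stand-ins for illustration and do not reflect real firmware interfaces or SEL record formats.

```python
class Frb2Watchdog:
    """Toy model of the BMC watchdog timer while in use for FRB2."""

    def __init__(self, log_on_expiry=True):
        self.log_on_expiry = log_on_expiry  # "if so configured" in the text
        self.remaining = None               # None means the timer is stopped
        self.events = []

    def arm(self, timeout_seconds):
        # BIOS sets the FRB2 timer use and loads the timeout interval.
        self.remaining = timeout_seconds

    def disable(self):
        # BIOS stops the timer before the option ROM scan or a password prompt.
        self.remaining = None

    def advance(self, seconds):
        # Let simulated time pass; take the timeout action on expiry.
        if self.remaining is None:
            return
        self.remaining -= seconds
        if self.remaining <= 0:
            self.remaining = None
            if self.log_on_expiry:
                self.events.append("SEL: watchdog expired (FRB2)")
            self.events.append("hard reset")
```

Arming the model and advancing past the timeout records the log entry followed by the hard reset, whereas calling `disable()` first leaves `events` empty no matter how much time passes, which is why a long option ROM scan or an unanswered password prompt does not reset the system.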
Revision 2.4