Specifications
Processor Presence and Population Check QSSC-S4R Technical Product Specification
AND
• Processor VR current trip point (default setting: 90% of supported TDP current) is triggered.
AND
• System power utilization is high and exceeds a pre-set limit of 80%.
The BMC monitors throttling of the CPU and Memory Controller and logs an SEL event. The power throttle sensor is
implemented as an auto-rearm sensor.
Upon assertion of the sensor offset, the BMC starts an internal 30-minute timer and re-arms the sensor when the timer
expires. The sensor is also re-armed when the system is reset or DC power-cycled.
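The re-arm behaviour described above can be pictured as a small state machine. The following is only a sketch under stated assumptions, not BMC firmware: the class and method names are hypothetical, time is passed in as plain seconds, and it assumes that no new SEL event is logged while the sensor is waiting to be re-armed.

```python
REARM_SECONDS = 30 * 60  # 30-minute re-arm timer from the text


class PowerThrottleSensor:
    """Hypothetical model of the auto-rearm behaviour; not BMC firmware."""

    def __init__(self):
        self.armed = True        # sensor may log a new assertion
        self.asserted_at = None  # time of the last assertion, in seconds
        self.sel = []            # stand-in for the System Event Log

    def on_throttle(self, now):
        """Log one event per armed period and start the timer (assumed)."""
        if self.armed:
            self.sel.append(("power throttle asserted", now))
            self.armed = False
            self.asserted_at = now

    def tick(self, now):
        """Re-arm once the 30-minute timer has expired."""
        if not self.armed and now - self.asserted_at >= REARM_SECONDS:
            self.armed = True

    def on_reset_or_dc_cycle(self):
        """A system reset or DC power cycle re-arms the sensor at once."""
        self.armed = True
        self.asserted_at = None
```

In this model a second throttle event inside the 30-minute window adds nothing to the log, while one arriving after `tick()` has re-armed the sensor is logged again.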
24.19 Memory Riser Power Failure Monitoring
The BMC supports detection of memory riser power failures. As soon as a power failure occurs on any memory
riser, the PLD detects it and powers down the server. The BMC reads the PLD status bits to locate the failed
memory riser and logs an assertion event for the assertion offset of the corresponding Memory Riser Power Fail
sensor. The BMC implements eight memory riser power failure sensors, one for each memory riser. These sensors
are also readable in the DC-off state, so users can see whether any of them has been asserted by a memory board
failure. The memory riser power failure sensors are implemented as auto-rearm sensors. Once the BMC asserts the
event for a failed memory riser, it is de-asserted during DC reset.
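As an illustration of how PLD status bits could be mapped to the eight riser sensors, here is a minimal sketch. The bit layout (bit 0 for riser 1 through bit 7 for riser 8, with 1 meaning power failure) and the function names are assumptions made for the example. The specification text does not define the PLD register format.

```python
NUM_RISERS = 8  # one sensor per memory riser, per the text


def failed_risers(pld_status):
    """Return 1-based riser numbers whose (assumed) fail bit is set."""
    return [i + 1 for i in range(NUM_RISERS) if pld_status & (1 << i)]


def log_riser_failures(pld_status, sel):
    """Append one assertion event per failed riser to a list that
    stands in for the SEL, and return that list."""
    for riser in failed_risers(pld_status):
        sel.append(f"Memory Riser {riser} Power Fail: asserted")
    return sel
```

With this assumed layout, a status value of `0b00000101` would report risers 1 and 3 as failed.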
24.20 Memory Hot Plug and Memory Offline/Online
QSSC-S4R supports memory RAS features for memory hot-plug and onlining/off-lining operations.
The memory hot-plug feature allows the end user to remove and/or insert memory boards while the system continues
to run. Only a single memory board may be removed or inserted at a time.
Memory hot-plug is supported by the system BIOS, and the BMC FW does not directly participate. However, there are
interactions with the BMC’s polling of the DIMM temperature sensors and Mill Brook temperature sensors, as described
below.
BIOS must use the appropriate SPD SMBus segment to access the DIMM SPD EEPROM as part of the
hot-plug/on-line/off-line operation. The BMC uses these same SPD SMBus segments for polling of the DIMM and Memory
Buffer temperature sensors. Since memory hot-plug and memory on-lining can take place at any time, BIOS and the BMC
must share these segments.
Additionally, just as it does during POST, when new memory is added or brought online, BIOS must configure the
DIMM temperature sensors appropriately and provide the BMC with the new DIMM population status as well as
notification that the configuration has completed.
When memory hot-plug is initiated, the DIMM and Memory Buffer temperature sensors are no longer available to the
BMC FW, and the fan control algorithms apply a default fan speed to the fan zones controlled by these sensors. When the
hot-plug operation completes, BIOS updates the BMC with the new memory device and Memory Buffer population data,
and the BMC regains access to the sensors.
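The fallback to a default fan speed can be sketched as follows. This is a hypothetical model only: the default duty cycle, the linear fan curve, and all names are invented for illustration, since the text gives neither the default value nor the fan control algorithm.

```python
DEFAULT_FAN_PERCENT = 80  # hypothetical default; the text gives no value


def simple_curve(temp_c):
    """Hypothetical linear curve: 30% at 40 C rising to 100% at 80 C."""
    pct = 30 + (temp_c - 40) * (70 / 40)
    return int(min(100, max(30, pct)))


def fan_speed(zone_temps_c, sensors_available, curve=simple_curve):
    """Pick a duty cycle (percent) for one fan zone.

    While a hot-plug operation is in progress the DIMM and Memory Buffer
    sensors are unavailable, so the zone falls back to the default speed.
    """
    if not sensors_available or not zone_temps_c:
        return DEFAULT_FAN_PERCENT
    return curve(max(zone_temps_c))
```

Once BIOS has reported the new population data, the caller would pass `sensors_available=True` again and the zone returns to temperature-based control.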
24.20.1 Semaphore Operation
To facilitate sharing of these SMBus segments, semaphores are supported (one semaphore for the SMBus segments
attached to each CPU). In normal operational flow during runtime, BIOS requests ownership of a semaphore from the
BMC using an IPMI OEM command. However, if the BMC is not responsive or otherwise does not
give up the bus in a timely manner, BIOS may forcibly take over the bus.
The semaphores are instantiated in the form of 4 bits in one of the IBMC’s mailbox registers, which can be set or
cleared by both the BMC and BIOS. The usage of these bits is defined as follows:
• A 0 indicates that BIOS owns the bus and a 1 indicates that the BMC owns the bus.
• At AC power-on, the default state of these mailbox register bits is 0.
• BIOS is the default owner of all the buses from the time a reset occurs until POST completes. At the start of
POST, BIOS clears all the semaphore bits (= BIOS ownership). Before POST completes, BIOS sets all the
semaphore bits (= BMC ownership).
• During runtime, if BIOS needs bus ownership, it must first try to acquire it through the IPMI OEM
command method. Only if the BMC does not give up the bus after a timeout and a retry by BIOS may BIOS
forcibly take over the bus by clearing the associated semaphore bit.
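The semaphore rules above can be summarised in a short model. This is a sketch under assumptions rather than the actual BIOS or BMC code: the mailbox register is represented by a plain integer, `request_release` is a hypothetical callable standing in for the IPMI OEM command (whose format is not given here), and the text does not say which side clears the bit when the BMC grants a request, so the model simply records the released state.

```python
NUM_SEGMENTS = 4                     # four semaphore bits, per the text
ALL_BMC = (1 << NUM_SEGMENTS) - 1    # every bit set: BMC owns every bus


class Mailbox:
    """Stand-in for the mailbox register; real access is hardware specific."""

    def __init__(self):
        self.bits = 0                # AC power-on default: all bits 0

    def owner(self, seg):
        return "BMC" if self.bits & (1 << seg) else "BIOS"


def post_start(mbox):
    mbox.bits = 0                    # BIOS clears all bits (BIOS ownership)


def post_end(mbox):
    mbox.bits = ALL_BMC              # BIOS sets all bits (BMC ownership)


def bios_acquire(mbox, seg, request_release, retries=1):
    """Runtime acquisition by BIOS.

    request_release(seg) is hypothetical: it returns True if the BMC gave
    up the segment before the timeout. BIOS tries once plus `retries`
    more times, then takes the bus by force.
    """
    for _ in range(1 + retries):
        if request_release(seg):
            mbox.bits &= ~(1 << seg)  # model the released state (assumed)
            return "granted"
    mbox.bits &= ~(1 << seg)          # forcible takeover: BIOS clears the bit
    return "forced"
```

In this model, an unresponsive BMC (a callable that always returns False) leads to two attempts followed by a forced takeover of only the requested segment, leaving the other bits under BMC ownership.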