Specifications

3
Software/Firmware Description
At runtime, the CPU triggers a System Management Interrupt (SMI) when memory errors
reach a preset threshold. If runtime error logging is enabled, the SMI handler determines the
cause, clears the error status, and reports the memory error to the IPMC. Memory errors can be
either correctable or uncorrectable. If the count of correctable memory errors goes above the
BIOS "MaxMemErrEvents" value, the SMI handler reports that the correctable error limit
has been exceeded and disables further correctable error reporting (thus preventing
performance degradation). Uncorrectable memory errors are also reported to the IPMC, but error
handling is determined by BIOS and OS settings.
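
The correctable-error limit described above can be illustrated with a small Python model. This is only a sketch, not firmware code: the class name, field names, and event strings are hypothetical, and it assumes that the error which pushes the count past the limit is reported as a "limit exceeded" event rather than as an ordinary correctable error.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryErrorSmiModel:
    """Toy model of the correctable-error limit (hypothetical names throughout)."""

    max_mem_err_events: int                      # stands in for BIOS "MaxMemErrEvents"
    correctable_count: int = 0
    correctable_reporting_enabled: bool = True
    events: list = field(default_factory=list)   # stands in for reports to the IPMC

    def handle_error(self, correctable: bool) -> None:
        if not correctable:
            # Uncorrectable errors are reported; any further handling depends
            # on BIOS and OS settings and is not modeled here.
            self.events.append("uncorrectable_memory_error")
            return
        if not self.correctable_reporting_enabled:
            return  # reporting was disabled once the limit was exceeded
        self.correctable_count += 1
        if self.correctable_count > self.max_mem_err_events:
            # Assumption: the limit event replaces the ordinary report.
            self.events.append("correctable_error_limit_exceeded")
            self.correctable_reporting_enabled = False
        else:
            self.events.append("correctable_memory_error")
```

With a limit of 2, the first two correctable errors are reported individually, the third produces the limit event, and later correctable errors produce no further reports.
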
PCIe error handling
The BIOS uses both legacy PCI error signaling (PERR/SERR) and PCI Express Advanced Error
Reporting (AER). The AER mapping reports the error severity (correctable, uncorrectable/non-fatal,
or uncorrectable/fatal) in addition to reporting the error.
If the BIOS has been set up to enable PCI error logging support, the BIOS enumerates all PCI
devices detected on the system at POST time, and enables PERR/SERR error reporting
for legacy devices and AER reporting if the device supports it. The BIOS applies an error mask
to all AER-supported devices when errors are reported, and may trigger critical error action
for detected AER errors of the proper severity.
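
The three AER severities and the role of the error mask can be sketched as follows. The register offsets are those of the AER extended capability in the PCI Express Base Specification; the `classify_aer` function, the `Severity` enum, and the dict-based register snapshot are hypothetical and only illustrate the classification, not how the BIOS implements it.

```python
from enum import Enum

# Offsets relative to the start of the AER extended capability.
AER_UNCOR_STATUS = 0x04
AER_UNCOR_MASK = 0x08
AER_UNCOR_SEVERITY = 0x0C   # a set bit marks that uncorrectable error as fatal
AER_COR_STATUS = 0x10
AER_COR_MASK = 0x14


class Severity(Enum):
    NONE = 0
    CORRECTABLE = 1
    NONFATAL = 2
    FATAL = 3


def classify_aer(regs: dict) -> Severity:
    """Return the most severe unmasked error in a register snapshot.

    regs maps a register offset to its 32-bit value; missing registers read as 0.
    """
    uncor = regs.get(AER_UNCOR_STATUS, 0) & ~regs.get(AER_UNCOR_MASK, 0) & 0xFFFFFFFF
    cor = regs.get(AER_COR_STATUS, 0) & ~regs.get(AER_COR_MASK, 0) & 0xFFFFFFFF
    fatal_bits = regs.get(AER_UNCOR_SEVERITY, 0)
    if uncor & fatal_bits:
        return Severity.FATAL
    if uncor:
        return Severity.NONFATAL
    if cor:
        return Severity.CORRECTABLE
    return Severity.NONE
```

A caller could use the returned severity to decide whether a critical error action applies, for example only for `Severity.FATAL`; which severities qualify is a setup choice that this sketch does not model.
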
As with memory errors, at runtime PCI errors are signaled to SMI. The PCI device causing the
error is next determined. The SMI routine then clears the error status and reports a platform
event to the IPMC. The SMI handler may then trigger critical error action depending on BIOS
setup options.
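
The runtime sequence above (find the device, clear its status, report, optionally escalate) can be sketched for the legacy PERR/SERR case. The two bit positions are those of the standard PCI Status register; the function name, the dict of devices, and the two callbacks are hypothetical stand-ins for configuration-space access, the IPMC platform event, and the BIOS setup option.

```python
PCI_STATUS_DETECTED_PARITY_ERROR = 1 << 15   # Status register bit 15
PCI_STATUS_SIGNALED_SYSTEM_ERROR = 1 << 14   # Status register bit 14
PCI_ERROR_BITS = PCI_STATUS_DETECTED_PARITY_ERROR | PCI_STATUS_SIGNALED_SYSTEM_ERROR


def handle_pci_error_smi(devices, report_event, critical_action=None):
    """Scan devices, clear logged errors, and report one event per failing device.

    devices: dict mapping a (bus, device, function) tuple to its Status value.
    report_event: callable standing in for the platform event sent to the IPMC.
    critical_action: optional callable standing in for the BIOS setup option.
    Returns the list of devices that had an error logged.
    """
    failing = []
    for bdf, status in devices.items():
        errors = status & PCI_ERROR_BITS
        if not errors:
            continue
        failing.append(bdf)
        # These status bits are write-1-to-clear in hardware; the model
        # clears exactly the bits that were found set.
        devices[bdf] = status & ~errors
        report_event(bdf, errors)
    if failing and critical_action is not None:
        critical_action(failing)
    return failing
```

Passing `critical_action=None` models a setup in which errors are logged but no critical error action is taken.
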
Processor and integrated controller error handling
The CPUs as well as the integrated QuickPath Interconnect (QPI) and Integrated I/O (IIO)
controllers implement various types of error detection, correction, containment, and
reporting features.
Processor core and uncore error reporting is performed via Machine Check Architecture
(MCA). At startup or after a power-good reset, the BIOS initializes the machine check registers,
clears the status registers by writing zeros into the registers, and writes all ones into the
control registers to enable all MCA features. If the system is not coming up from a power-good
reset, it retains any error information by preserving the content of the machine check status
registers.
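
The two reset paths can be modeled with a dict of MSR values. The MSR addresses follow the Intel architectural layout, in which bank i has its control register at 0x400 + 4*i and its status register at the next address. The function name is hypothetical, and the sketch assumes the control registers are written on every boot, which the text above does not state explicitly for the non-power-good case.

```python
IA32_MC0_CTL = 0x400            # MSR address of the bank 0 control register
MSRS_PER_BANK = 4               # CTL, STATUS, ADDR, MISC
ALL_ONES = 0xFFFFFFFFFFFFFFFF


def init_mca_banks(msrs, bank_count, power_good_reset):
    """Initialize a dict-based model of the machine check bank MSRs.

    Returns {bank: status} for nonzero status values that were preserved
    because the system did not come up from a power-good reset.
    """
    preserved = {}
    for bank in range(bank_count):
        ctl = IA32_MC0_CTL + MSRS_PER_BANK * bank
        status = ctl + 1
        if power_good_reset:
            msrs[status] = 0                  # clear stale status
        elif msrs.get(status, 0):
            preserved[bank] = msrs[status]    # keep error information for logging
        msrs[ctl] = ALL_ONES                  # assumption: enabled on every boot
    return preserved
```

The returned dict gives a caller the banks whose earlier error information survived the reset, which is the information the status registers are preserved for.
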
The QPI protocol uses a CRC mechanism to ensure the data integrity of a serial stream. Unless
a corrupt data containment mechanism is enabled, the processor generates a QPI error
signal on error detection, which in turn generates an SMI for the BIOS to report a platform
event.
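
How a CRC detects corruption in a serial stream can be shown with a generic bitwise CRC-8. This is purely illustrative: the polynomial 0x07, the zero initial value, and the byte-oriented framing are assumptions chosen for simplicity and are not the parameters used by QPI.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def make_frame(payload: bytes) -> bytes:
    """Sender side: append the CRC of the payload."""
    return payload + bytes([crc8(payload)])


def frame_is_intact(frame: bytes) -> bool:
    """Receiver side: with these parameters, an intact frame has remainder 0."""
    return crc8(frame) == 0
```

A receiver that finds `frame_is_intact` returning `False` is in the position the text describes: an error has been detected and must be signaled.
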
The IIO module uses an AER mechanism, similar to PCI error handling, to trigger different
system error severity responses depending on the type of detected error.