Specifications
156 IBM Power 770 and 780 Technical Overview and Introduction
4.1.2 Placement of components
Packaging is designed to deliver both high performance and high reliability. For example,
the reliability of electronic components is directly related to their thermal environment; that
is, large decreases in component reliability are directly correlated with relatively small
increases in temperature. POWER processor-based systems are carefully packaged to
ensure adequate cooling. Critical system components such as the POWER7 processor chips
are positioned on printed circuit cards so that they receive fresh air during operation. In
addition, POWER processor-based systems are built with redundant, variable-speed fans that
can automatically increase output to compensate for increased heat in the central
electronic complex.
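The variable-speed behavior described above can be illustrated with a minimal sketch. It is not IBM's fan-control firmware. The function names, temperature thresholds, and duty-cycle limits are hypothetical values chosen for illustration, and the second function assumes that surviving fans speed up when a redundant fan fails.

```python
def fan_duty_percent(temp_c, low_c=25.0, high_c=45.0,
                     min_duty=30.0, max_duty=100.0):
    """Map a temperature reading to a fan duty cycle by linear interpolation.

    All thresholds are illustrative assumptions, not IBM specifications.
    """
    if temp_c <= low_c:
        return min_duty
    if temp_c >= high_c:
        return max_duty
    fraction = (temp_c - low_c) / (high_c - low_c)
    return min_duty + fraction * (max_duty - min_duty)


def duty_with_failed_fans(temp_c, fans_total, fans_working):
    """Raise the duty of the surviving fans to cover for failed ones (capped at 100%)."""
    if fans_working <= 0:
        raise ValueError("no working fans")
    base = fan_duty_percent(temp_c)
    return min(100.0, base * fans_total / fans_working)
```

In this sketch, a warmer reading raises the duty cycle, and losing one of several fans raises it further, which mirrors the idea of redundant fans that increase output to compensate for added heat.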
4.1.3 Redundant components and concurrent repair
High-opportunity components, or those that most affect system availability, are protected with
redundancy and the ability to be repaired concurrently.
Redundant parts allow the system to remain operational when a component fails. These parts include:
- POWER7 cores, which include redundant bits in L1-I, L1-D, and L2 caches, and in L2 and L3 directories
- Power 770 and Power 780 main memory DIMMs, which contain an extra DRAM chip for improved redundancy
- Power 770 and 780 redundant system clock and service processor for configurations with two or more central electronics complex (CEC) drawers
- Redundant and hot-swap cooling
- Redundant and hot-swap power supplies
- Redundant 12X loops to I/O subsystem
For maximum availability, be sure to connect power cords from the same system to two
separate Power Distribution Units (PDUs) in the rack and to connect each PDU to
independent power sources. Deskside form factor power cords must be plugged into two
independent power sources to achieve maximum availability.
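The benefit of two independent power sources can be shown with a short calculation. This is a sketch under a strong assumption: the power paths fail independently of each other. The function name and the availability figures are illustrative, not measured values for any IBM system.

```python
def parallel_availability(path_availabilities):
    """Availability of a system that stays up if at least one path is up.

    Assumes the paths fail independently, which is why each PDU should be
    fed from a separate power source.
    """
    all_down = 1.0
    for availability in path_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("availability must be between 0 and 1")
        all_down *= 1.0 - availability
    return 1.0 - all_down


# Illustrative numbers only: one path at 0.99 versus two independent paths.
single = parallel_availability([0.99])
dual = parallel_availability([0.99, 0.99])
```

With these illustrative inputs, the dual-path result is about 0.9999. If both PDUs share one upstream source, the independence assumption no longer holds and the gain largely disappears.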
4.2 Availability
The IBM hardware and microcode capability to continuously monitor execution of hardware
functions is generally described as the process of first-failure data capture (FFDC). This
process includes the strategy of predictive failure analysis, which refers to the ability to track
intermittent correctable errors and to vary components offline before they reach the point of
a hard failure that would cause a system outage, without the need to re-create the problem.
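The idea behind predictive failure analysis can be sketched as a threshold on correctable errors within a time window. This is an illustration only, not the actual IBM firmware logic. The class name, the threshold of three errors, and the one-hour window are all hypothetical.

```python
from collections import defaultdict, deque


class CorrectableErrorTracker:
    """Track correctable errors per component and flag components to vary offline.

    The threshold and window are illustrative assumptions.
    """

    def __init__(self, threshold=3, window_s=3600.0):
        self.threshold = threshold
        self.window_s = window_s
        self._events = defaultdict(deque)  # component -> recent error timestamps
        self.offline = set()               # components already varied offline

    def record(self, component, timestamp_s):
        """Record one correctable error; return True if the component is offline."""
        if component in self.offline:
            return True
        events = self._events[component]
        events.append(timestamp_s)
        # Discard errors that fall outside the sliding window.
        while events and timestamp_s - events[0] > self.window_s:
            events.popleft()
        if len(events) >= self.threshold:
            # Too many correctable errors in the window: act before a hard failure.
            self.offline.add(component)
        return component in self.offline
```

Isolated errors that are far apart in time never reach the threshold, while a burst of errors on the same component takes it offline before it can cause an outage.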
Note: Check your configuration for optional redundant components before ordering
your system.