Providing Open Architecture High Availability Solutions

4.2.2 Compatibility and Interoperability
The definitions of an HA Framework must be clear and unambiguous. They must define the
component interfaces well enough to ensure that different implementations of the same
components will interoperate.
The definitions may leave room for future expandability and growth, but must require standards-
compliant implementations to be compatible. While it is possible that the component boundaries
may change over time, it is important to clearly define the initial models and implementations.
4.2.3 Related Standards Considered
The HA Framework must leverage or recommend existing hardware and software standards and
describe how they are used and integrated into the HA Framework standards. Some may be
considered prerequisites and listed as requirements, while others may emerge as the HA Framework
definition evolves. For example:
Management: WBEM, SNMP, IPMI, etc.
Configuration/system modeling: CIM, X.731, etc.
I/O standards: PCI, CompactPCI*, H.100/110, InfiniBand*, etc.
Networking standards: TCP/IP, ATM, T1/E1, SONET/SDH, etc.
Middleware standards: CORBA, etc.
De facto standards for operating systems, middleware, etc.
4.3 System Topologies and Components
Telecom and Internet infrastructure applications are typically built using clusters of systems
connected by one or more networks, such as SANs or LANs. These clusters contain redundant
components (organized in modes such as N+1 or 2N) that can be reconfigured in the event of a
single component failure so that the overall service provided by the cluster is maintained at an
acceptable level.
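As an illustration of the redundancy modes mentioned above, the following Python sketch compares
how many units N+1 and 2N require and shows a spare taking over for a failed unit. The function
names units_required and fail_over, and the unit names in the usage lines, are hypothetical and
are not part of any HA Framework definition; the sketch assumes N+1 means one shared spare per
group and 2N means one dedicated standby per active unit.

```python
from typing import List, Tuple


def units_required(active: int, mode: str) -> int:
    """Total units to provision for `active` working units (hypothetical helper)."""
    if active < 1:
        raise ValueError("active must be at least 1")
    if mode == "N+1":
        return active + 1   # assumption: one shared spare for the whole group
    if mode == "2N":
        return 2 * active   # assumption: a dedicated standby for every active unit
    raise ValueError(f"unknown redundancy mode: {mode}")


def fail_over(active: List[str], spares: List[str], failed: str) -> Tuple[List[str], List[str]]:
    """Replace `failed` in the active set with the first available spare."""
    if failed not in active:
        raise ValueError(f"{failed} is not an active unit")
    if not spares:
        raise RuntimeError("no spare available; service is degraded")
    replacement, remaining = spares[0], spares[1:]
    new_active = [replacement if unit == failed else unit for unit in active]
    return new_active, remaining


# Usage: a four-unit group under each mode, then a single failure in an N+1 group.
n_plus_1_total = units_required(4, "N+1")
two_n_total = units_required(4, "2N")
new_active, remaining_spares = fail_over(["cpu-a", "cpu-b", "cpu-c"], ["cpu-spare"], "cpu-b")
```

In this sketch an N+1 group has no spare left after its first failure, so a second failure
before repair would degrade the service, whereas a 2N group still has standbys for its other
active units.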
A system within a cluster may be built entirely of standard non-redundant components, or it may
have some internal redundancy to improve its resilience, for example mirrored disks or redundant
power supplies and fans.
The system components can also improve the system's integrity by checking their internal
operation for correctness, for example with parity on data paths and ECC on memory. This kind of
checking reduces the time needed to detect faults and helps contain the damage caused by
otherwise silent faults.
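The parity checking mentioned above can be sketched in a few lines of Python. This is a software
illustration of even parity over a single data word, not a description of how any particular
hardware implements it; the function names parity_bit and check_word are hypothetical.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit for a non-negative word: 1 if it has an odd number of 1 bits."""
    if word < 0:
        raise ValueError("word must be non-negative")
    return bin(word).count("1") % 2


def check_word(word: int, stored_parity: int) -> bool:
    """Return True if the word still matches the parity bit stored alongside it."""
    return parity_bit(word) == stored_parity


# Usage: the sender computes the parity bit, the receiver re-checks it.
original = 0b10110010
stored = parity_bit(original)

single_flip = original ^ 0b00000001   # one bit corrupted in transit
double_flip = original ^ 0b00000011   # two bits corrupted in transit

single_detected = not check_word(single_flip, stored)
double_detected = not check_word(double_flip, stored)
```

A single parity bit detects any odd number of flipped bits but misses an even number, so in this
sketch the single-bit error is caught and the double-bit error is not. ECC memory adds more check
bits so that single-bit errors can be corrected rather than only detected.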