Providing Open Architecture High Availability Solutions

Web-Based Enterprise Management (WBEM). WBEM is a standard driven by the Distributed
Management Task Force (DMTF) for managing groups of computers connected in a network.
Several standards feed into WBEM, including the Common Information Model (CIM)
and the Intelligent Platform Management Interface (IPMI).
5.5.4 User Interface
A user can see what is going on within a system from three positions:
At the system. Indicators that show which components are active, on standby, or failed are a
maintainability requirement. If the indicators are not local to the component that fails, it is too
easy to pull out and replace a working component instead of the failed one. Additionally,
local alarm indicators are useful in case the remote management console fails to detect an error.
At a local console. A GUI screen is useful for debugging and configuration; however, in most large
applications, no one will ever monitor the system state from a local console.
At remote consoles. Most of the techniques specified in Section 5.5.3 are designed to enable
monitoring of a set of systems from multiple locations over a network. The locations may be local
to the building, across the country, or both.
The function of monitoring a system is critical in high availability systems, so monitor and control
functions must be available from multiple sites. When the system is designed, the methods and
locations of monitoring must be reviewed to ensure that the three monitoring positions are covered.
5.6 In-Service Upgrading
5.6.1 Introduction
System administrators need the ability to perform hardware and software upgrades without
interrupting the service. To support this, the system should provide redundant components so that
operation can be offloaded from the component that needs to be upgraded. In addition, the system
should provide support for hot-swap so that hardware components can be inserted into or removed
from the system while it is operating. Similarly, the system should enable software components to
be upgraded without impacting service. Upgrading software in this way is known as a rolling upgrade.
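
The offload-then-upgrade sequence described above can be illustrated with a minimal Python sketch. It is not taken from this document: the node records, the `rolling_upgrade` function, and the `min_in_service` parameter are hypothetical names chosen for illustration, and the "upgrade" is reduced to changing a version string.

```python
def rolling_upgrade(nodes, new_version, min_in_service=1):
    """Upgrade redundant nodes one at a time (illustrative sketch only).

    Each node is a dict with hypothetical keys "name", "version" and
    "in_service". Returns how many nodes were in service while each
    node was being upgraded.
    """
    in_service_log = []
    for node in nodes:
        # Redundancy check: enough other nodes must be able to carry the load.
        others = [n for n in nodes if n is not node and n["in_service"]]
        if len(others) < min_in_service:
            raise RuntimeError("not enough redundancy to offload " + node["name"])

        node["in_service"] = False          # offload operation from this node
        in_service_log.append(sum(1 for n in nodes if n["in_service"]))
        node["version"] = new_version       # stand-in for the real upgrade step
        node["in_service"] = True           # return the node to service
    return in_service_log


nodes = [
    {"name": "node-a", "version": "1.0", "in_service": True},
    {"name": "node-b", "version": "1.0", "in_service": True},
    {"name": "node-c", "version": "1.0", "in_service": True},
]
log = rolling_upgrade(nodes, "2.0")
```

With three nodes, two remain in service at every step. With a single node, the sketch refuses to proceed, which reflects why redundant components are a precondition for in-service upgrading.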
5.6.2 Objective
Enable on-line upgrades of hardware and software without interrupting the service.
5.6.3 Concepts
Hot-Swap. The ability to insert or remove a hardware component while the system is powered and
operational.
Rollback. The ability to regress to a previously known good version if the upgrade is unsuccessful.
Split-Mode Upgrade. For systems with redundant components, split-mode upgrades enable the
upgrade of one component at a time by transferring system operation to the component that is not
being upgraded.
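
As a rough illustration of how split-mode upgrade and rollback can fit together, the following Python sketch upgrades a redundant pair one member at a time and regresses both members to the previous version if a health check fails. The `Component` class, the `split_mode_upgrade` function, and the `health_check` callback are assumptions made for this example, not interfaces defined by this document.

```python
from dataclasses import dataclass


@dataclass
class Component:
    # Hypothetical model of one member of a redundant pair.
    name: str
    version: str
    active: bool = False


def split_mode_upgrade(pair, new_version, health_check):
    """Upgrade a redundant pair one member at a time (illustrative sketch).

    Returns True if both members end on new_version, or False if the
    pair was rolled back to its previously known good versions.
    """
    previous = {c.name: c.version for c in pair}
    # Upgrade the standby member first so the active one keeps serving.
    order = sorted(pair, key=lambda c: c.active)
    for target in order:
        peer = pair[0] if target is pair[1] else pair[1]
        # Transfer operation to the member that is not being upgraded.
        peer.active, target.active = True, False
        target.version = new_version
        if not health_check(target):
            # Rollback: the failed member is on standby, so revert it first.
            target.version = previous[target.name]
            if peer.version == new_version:
                # The peer was already upgraded: switch operation back to the
                # known good member, then revert the peer while it is on standby.
                target.active, peer.active = True, False
                peer.version = previous[peer.name]
            return False
    return True


a = Component("A", "1.0", active=True)
b = Component("B", "1.0")
ok = split_mode_upgrade([a, b], "2.0", lambda c: True)
```

In this sketch exactly one member is active at every step, including during rollback, so service is never handed to a member that is in the middle of a version change.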