Providing Open Architecture High Availability Solutions
5.3 Dynamic System Model
5.3.1 Introduction
The system model provides the basis for both configuration and fault management and is a critical
component in meeting availability targets within an HA system. The model is typically
implemented in an in-memory database within the management middleware. This complete system
model may use information from models of other components, such as a model of the hardware.
Information about these component models may be made available via OS hooks or hardware
management processors. The discussion of data collection in Section 5.4 covers this in
more detail.
A system model is an abstract description of component capabilities, interfaces, relevant data
structures, and interactions. The system model dynamically tracks all of the managed components
in the system and incorporates the configuration, state, relationships and dependencies of these
components.
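As a rough illustration of the description above, the following minimal sketch shows one way an in-memory model could track managed components together with their configuration, state and dependencies. It is an assumption-laden example, not a prescribed design: the names ManagedComponent, SystemModel, add, remove and set_state, and the state strings used, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedComponent:
    """One entry in the model. All field names are illustrative assumptions."""
    name: str
    config: dict = field(default_factory=dict)    # configuration data
    state: str = "unknown"                         # current state
    depends_on: set = field(default_factory=set)   # names of components it relies on


class SystemModel:
    """In-memory table of managed components, keyed by component name."""

    def __init__(self):
        self.components = {}

    def add(self, component):
        # Called when a component appears, e.g. on hot-plug insertion.
        self.components[component.name] = component

    def remove(self, name):
        # Called when a component disappears, e.g. on hot-swap extraction.
        self.components.pop(name, None)

    def set_state(self, name, state):
        # Keep the model in step with the real component.
        self.components[name].state = state


model = SystemModel()
model.add(ManagedComponent("power-0", state="enabled"))
model.add(ManagedComponent("io-card-0", {"slot": 3}, "enabled", {"power-0"}))
model.set_state("io-card-0", "disabled")
model.remove("power-0")
```

Keying the table by name keeps lookups and updates cheap, which matters if the model is refreshed on every state change; a real implementation would also need locking and persistence, which this sketch leaves out.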
5.3.2 Concepts
The system model should contain the following information:
• Actual System Population – The Actual System Population is a reflection of the components
available in the system at any given point in time. This is a dynamic population, especially on
systems with hot-swap and hot-plug capabilities. This is essentially a census of the system
component population.
• Intended System Population – The Intended System Population is a reflection of the
components that have been, are currently, or are expected to be part of the system. It should be
possible to remove components from the intended population if the system has changed to the
point where some of the components could no longer be used in that system.
• Component Information – Contains static information about a component, including its
class, product, manufacturer and revision. For physical components such as an
assembly, this information is also known as the field replaceable unit (FRU) information.
• Component State – Contains the dynamic information for individual system components that
is relevant to availability management. For an HA system, the information should be
dynamically updated as the actual state of the components changes. For example, the X.731
state model supports attributes such as administrative state, operational state, usage state(s),
availability status and control status [X.731].
• Component Role – In an HA system with redundant components, these components have
roles such as active, standby and spare/unassigned. Active means that the component is
actively engaged in providing service; standby means that the component is idle and waiting to
take over. Unassigned or spare applies to components that are neither active nor standby, but
are available to be made either active or standby. When an active component fails, the standby
component is promoted to take over the active role. The roles are maintained in the system
model.
• Physical Dependencies – System components have physical parent/child dependencies that
need to be understood in order to manage the availability of a service. Children depend on
the presence and health of their parents to be assigned work in the system, while
parents depend on the presence and health of their children for their own proper operation. For
example, a system’s external I/O may depend on an I/O card, which in turn depends on the
system power. Knowledge of these dependencies enables a management system or an operator
to determine how a fault in one system component affects other components within the system.
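Three of the concepts above lend themselves to a short sketch: comparing the intended and actual populations, promoting a standby when the active component fails, and finding the components affected by a fault. The code below is a minimal illustration under stated assumptions. The function names, the component names and the use of plain sets and dictionaries are hypothetical, and the decision to drop a failed component from the role table is an assumption, since the text does not say what role a failed component holds.

```python
def population_drift(intended, actual):
    """Return (missing, unexpected) component names as two sets."""
    return intended - actual, actual - intended


def promote_standby(roles, failed):
    """Return a new role table after `failed` (an active component) fails."""
    roles = dict(roles)
    if roles.get(failed) != "active":
        return roles
    # Assumption: a failed component no longer holds any role.
    del roles[failed]
    for name in sorted(roles):
        if roles[name] == "standby":
            roles[name] = "active"  # the standby takes over the active role
            break
    return roles


def affected_by(fault, depends_on):
    """All components that directly or transitively depend on `fault`."""
    affected, frontier = set(), [fault]
    while frontier:
        current = frontier.pop()
        for name, deps in depends_on.items():
            if current in deps and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


# Intended versus actual population (hypothetical component names).
missing, unexpected = population_drift(
    {"cpu-0", "cpu-1", "io-card"}, {"cpu-0", "io-card", "fan-9"})

# Failover among redundant components.
roles = promote_standby(
    {"cpu-0": "active", "cpu-1": "standby", "cpu-2": "unassigned"}, "cpu-0")

# The dependency chain from the example: external I/O -> I/O card -> power.
depends_on = {"external-io": {"io-card"}, "io-card": {"power"}, "power": set()}
impacted = affected_by("power", depends_on)
```

With these inputs, a power fault is reported as affecting both the I/O card and the external I/O, which is the kind of answer a management system or operator needs when assessing the reach of a single fault.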