Providing Open Architecture High Availability Solutions

Robustness – The property of a software component, particularly an OS, that incorporates tests for
many error conditions and has been designed in a way which protects it from errant behavior.
Role – The function a component plays in a redundant system. Typical roles are active, standby,
and unassigned or spare.
Rolling upgrade – The process of upgrading software by making changes on one redundant
component at a time, so that service availability is maintained.
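
The one-at-a-time constraint can be illustrated with a minimal sketch. The `Node` class, its fields, and the `rolling_upgrade` function are hypothetical names chosen for illustration and are not part of any framework described here. The sketch assumes every node is interchangeable and that taking a single node out of service still leaves enough capacity to provide the service.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical redundant component with a software version."""
    name: str
    version: str
    in_service: bool = True


def rolling_upgrade(nodes, new_version):
    """Upgrade nodes one at a time.

    Returns the lowest number of nodes that were in service at any
    point during the upgrade, so the caller can check availability.
    """
    min_in_service = len(nodes)
    for node in nodes:
        node.in_service = False              # take only this node out of service
        in_service_now = sum(n.in_service for n in nodes)
        min_in_service = min(min_in_service, in_service_now)
        node.version = new_version           # apply the change
        node.in_service = True               # restore it before touching the next
    return min_in_service


nodes = [Node("a", "1.0"), Node("b", "1.0"), Node("c", "1.0")]
lowest = rolling_upgrade(nodes, "1.1")
```

With three nodes, at most one is ever out of service, so `lowest` is 2 and the service stays available throughout the upgrade.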
Safety – The attribute associated with systems that cannot damage either the environment or the
users.
Security – The attribute associated with systems that do not allow information to flow to
unauthorized components or users.
Service availability – The availability of the primary service provided by a computing system as
measured by the users of that service. This differs from general high availability which can be
discussed on a component-by-component level.
Service group – A grouping of one or more components, along with their redundant counterparts,
which provide a service. A set of three power supplies might form a power supply service group
which supplies power to the system.
SNMP – Simple Network Management Protocol. SNMP is a set of protocols that pass information
about whether or not a component is operating properly. SNMP uses management information
bases (MIBs) to store data about a system.
Software Rejuvenation – Restarting a software component from no later than the point at which it
gets allocated resources and initializes its variables.
Spatial redundancy – Redundancy by use of extra components in either hardware or software. See
also temporal redundancy.
Spare – The role of a redundant component that is not in standby, but is available to be placed into
service by the configuration management service.
Standby – The role of a redundant component that is monitoring its redundant counterpart and is
ready to be switched into service in place of its counterpart.
State – Information used by an application to determine its function and significant points of
operation. State may be as simple as available vs. failed, or as complex as the frame ID for the next
package to be processed along with the parameters needed for processing.
State change – Any change in the state information of an application. This term is sometimes used
in the strict sense to refer only to changes in the operational state from available to in-service to
faulted or similar states.
Switchover – Changing from using one component to using its redundant counterpart.
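
A switchover can be pictured as a change of roles within a service group. The following is a minimal sketch only: the `Role` enumeration, the `switchover` function, and the `psu0`..`psu2` names are hypothetical, and the choice to mark the former active component as unassigned is an assumption made for illustration rather than behavior defined in this glossary.

```python
from enum import Enum


class Role(Enum):
    """Roles a component may play in a redundant system."""
    ACTIVE = "active"
    STANDBY = "standby"
    SPARE = "spare"
    UNASSIGNED = "unassigned"


def switchover(roles: dict[str, Role]) -> dict[str, Role]:
    """Return a new role map with the standby promoted to active.

    Assumption: the former active component becomes unassigned.
    The input map is left unchanged.
    """
    active = [name for name, role in roles.items() if role is Role.ACTIVE]
    standby = [name for name, role in roles.items() if role is Role.STANDBY]
    if len(active) != 1 or not standby:
        raise ValueError("need exactly one active and at least one standby")
    updated = dict(roles)
    updated[active[0]] = Role.UNASSIGNED   # old active leaves service
    updated[standby[0]] = Role.ACTIVE      # its counterpart takes over
    return updated


group = {"psu0": Role.ACTIVE, "psu1": Role.STANDBY, "psu2": Role.SPARE}
after = switchover(group)
```

Here `psu1` takes over from `psu0`, while the spare `psu2` keeps its role until some other service assigns it one.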
System Log – A file that keeps track of major events which occur in the system. This typically
includes starts, stops, faults and other similar information.
System MIB – A Management Information Base (MIB) that contains the information about basic
system parameters and kernel operation.