Technical information

10 Steps To Resilience

Resilience is a way of virtually eliminating unexpected system downtime. It combines redundancy, availability and manageability features. Complete resilience would mean a system that is always running, 100% of the time. One definition of resilience in a Business Computing Environment:

Maximising system tolerance to any failure

[and]

providing the highest level of system uptime.

A server either serves a number of people or supports mission-critical operations, so it is necessary to consider the implications of the server failing. In an ideal world, the server should never fail. The quality of the components a vendor uses and the level of quality testing the vendor performs are therefore crucially important.

However, if a component does fail, there must be no data loss, and the system must be brought back up and running as soon as possible with minimal disruption to the users.

Resilience is an important aspect of the Fujitsu server range, and begins at the design stage. A major share of the company's huge R&D budget has been invested in creating world-class technology capable of meeting the ever-increasing demands of the real world.

This lies at the core of Fujitsu's “10 steps to resilience”.

Each additional ‘step’ builds on the solid foundation of the previous layers:

Prevent Component Failure

1. Component reliability and investment in underlying technology - stringent quality criteria are applied to the selection of all Fujitsu's server components.

2. Reliability through design, validation and testing - rigorous testing procedures are applied to the system, operating system and environment as well as the manufacturing process.

3. Fujitsu Software Partnerships - alliances with the world's leading software technology companies, such as Microsoft, SCO, Novell, SAP and Citrix, enable Fujitsu to bring the best platform technology to its customers, with compatibility pre-certified by the software vendor.

4. Server Management - analysing the status of the server (disk usage, processor temperature, etc.) so that pre-failure warnings can be dealt with before they cause a failure.

These four steps are built in at no extra cost, all focussed on minimising the chance of a component failure.
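
As an illustration of step 4, a pre-failure check can be as simple as comparing sensor readings against warning thresholds. The Python sketch below is not Fujitsu's server management software: the metric names, the threshold values and the `pre_failure_warnings` helper are all assumptions chosen for illustration.

```python
# Hypothetical warning thresholds; real limits depend on the hardware in use.
WARNING_THRESHOLDS = {
    "disk_usage_percent": 90.0,
    "cpu_temperature_celsius": 75.0,
}


def pre_failure_warnings(readings, thresholds=WARNING_THRESHOLDS):
    """Return a sorted list of metrics whose reading has reached its threshold."""
    warnings = []
    for metric, limit in thresholds.items():
        value = readings.get(metric)
        # A missing reading is skipped here; a real monitor might treat it as a fault.
        if value is not None and value >= limit:
            warnings.append(metric)
    return sorted(warnings)


if __name__ == "__main__":
    sample = {"disk_usage_percent": 93.5, "cpu_temperature_celsius": 61.0}
    print(pre_failure_warnings(sample))
```

Warning at a threshold below the point of actual failure is what gives an administrator time to act, for example by freeing disk space or replacing a fan.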

Prevent System Failure

5. Redundancy against data loss - protecting your business by guarding your critical business data against loss or corruption, e.g. with ECC memory and RAID disks.

6. Redundant components (extra) - ensuring your server continues to service your users/customers even if a component should fail, e.g. disks, ECC memory, LAN card, fans, PSU.

7. Uninterruptible Power Supplies (UPS) - protecting your server if a power cut occurs or someone accidentally pulls out the wrong plug; the UPS will kick in to keep the server running. The UPS also smooths out any “spikes” in the mains power to prevent any harm to your server.
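
One common way RAID protects data (step 5) is to store parity alongside the data, so that a block lost with a failed disk can be rebuilt from the surviving disks. The Python sketch below shows XOR parity of the kind used by parity-based RAID levels such as RAID 5. The text above does not say which RAID level is meant, so that choice is an assumption, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors and the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"ACCT", b"2024", b"DATA"]   # blocks held on three data disks
    parity = xor_blocks(data)            # block held on the parity disk
    # Simulate losing the second disk and rebuilding its block.
    recovered = rebuild_missing([data[0], data[2]], parity)
    print(recovered == data[1])
```

Because XOR is its own inverse, the same operation that computes the parity also recovers any one missing block. Two simultaneous disk failures cannot be recovered this way.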

Zero Down-time

8. Hot-swap/spare components - replacing failed (redundant) components without impacting your user/customer service, i.e. zero downtime, by using hot-swap disks, fans and PSUs. Risk can be reduced even further with hot-spares, which guard against a second component failure bringing the service down: the server automatically configures the hot-spare to replace a failed component, e.g. a hot-spare redundant disk.
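
The hot-spare behaviour in step 8 can be sketched as a small state model: when an active disk fails, a waiting spare takes over its slot, and the failed disk is later hot-swapped for a new one. This is a toy Python model under assumed names (`DiskArray`, `report_failure`, `hot_swap`), not a description of any real RAID controller's interface.

```python
class DiskArray:
    """Toy model of a redundant disk array with a pool of hot-spare disks."""

    def __init__(self, active, spares):
        self.active = list(active)   # disks currently serving data
        self.spares = list(spares)   # idle disks waiting to take over
        self.failed = []             # disks awaiting physical replacement

    def report_failure(self, disk):
        """Retire a failed disk and promote a spare if one is available.

        Returns the promoted spare, or None if no spare was left.
        """
        if disk not in self.active:
            raise ValueError(f"{disk} is not an active disk")
        position = self.active.index(disk)
        self.failed.append(disk)
        if self.spares:
            replacement = self.spares.pop(0)
            self.active[position] = replacement   # spare takes the failed disk's slot
            return replacement
        del self.active[position]                 # no spare left: run degraded
        return None

    def hot_swap(self, failed_disk, new_disk):
        """Model replacing a failed disk while running; the new disk becomes a spare."""
        self.failed.remove(failed_disk)
        self.spares.append(new_disk)


if __name__ == "__main__":
    array = DiskArray(["disk0", "disk1", "disk2"], ["spare0"])
    print(array.report_failure("disk1"), array.active)
```

Promoting the spare automatically shortens the window in which a second failure could bring the service down, which is the risk reduction described above.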

Manage Failure

9. Automatic Server Recovery - often under-rated, it constantly monitors whether the operating system is alive and running. Once it is sure the O/S has hung, it automatically re-boots the system.

10. Availability Clusters - consisting of two servers, where the applications on one server can be moved across to the other: either manually, if you want to upgrade one of the servers, or automatically, if a component that is a single point of failure fails.
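
The monitoring in step 9 is commonly built as a watchdog timer: a healthy operating system resets the timer at regular intervals, and if the timer ever expires the system is re-booted. The Python sketch below models only that decision logic. The `Watchdog` class, its method names and the timeout value are assumptions for illustration and do not describe Fujitsu's Automatic Server Recovery implementation.

```python
import time


class Watchdog:
    """Toy watchdog: the OS must send a heartbeat before the timeout expires."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self._clock = clock                    # injectable so the logic can be tested
        self._last_heartbeat = clock()

    def heartbeat(self):
        """Called periodically by a healthy operating system."""
        self._last_heartbeat = self._clock()

    def should_reboot(self):
        """True once no heartbeat has arrived within the timeout."""
        return self._clock() - self._last_heartbeat > self.timeout_seconds


if __name__ == "__main__":
    fake_now = [0.0]                           # simulated clock, in seconds
    watchdog = Watchdog(timeout_seconds=10, clock=lambda: fake_now[0])
    fake_now[0] = 5.0
    watchdog.heartbeat()                       # O/S is alive at t = 5
    fake_now[0] = 20.0                         # no heartbeat since then
    print(watchdog.should_reboot())
```

Waiting for a full timeout rather than reacting to a single missed heartbeat reflects the point above that the system re-boots only once it is sure the O/S has hung.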

All of this adds up to ...

Maximum System Resilience