components each of which runs in its own protected environment and interacts

with the other components over clean, well-defined interfaces. The second is to

provide enough computing power to each component such that it rarely, if ever,

runs under stress.

The routing system is built on top of a version of Unix that has been custom

modified for robust operation under loaded conditions. In addition to providing the

stability that comes with over 15 years of accumulated industry experience, the

Unix operating system provides protected environments (separate address

spaces) for the routing protocols, network management, and user interface to run.

This removes most opportunities for runaway applications to corrupt each other

and/or the kernel. The routing system is powered by a state-of-the-art Intel

processor that provides sufficient computing cycles to keep the processor from

being heavily loaded.
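
The isolation described above can be illustrated with a minimal sketch. It assumes nothing about the actual routing software: the child program, the route table, and the helper `run_component` are hypothetical stand-ins. The sketch only shows that a component running in its own process (and therefore its own address space) can corrupt its private data and crash without touching the supervisor's copy.

```python
import subprocess
import sys

# Hypothetical "runaway" component: it trashes its own route table and
# then exits with a non-zero status. It runs in a separate interpreter
# process, so it has a separate address space from the supervisor.
CHILD_SOURCE = """
routes = {"10.0.0.0/8": "fpc0"}
routes.clear()
raise SystemExit(3)
"""


def run_component(source: str) -> int:
    """Run `source` in a separate Python process and return its exit status."""
    result = subprocess.run([sys.executable, "-c", source])
    return result.returncode


# The supervisor keeps its own copy of the table in its own address space.
routes = {"10.0.0.0/8": "fpc0"}
status = run_component(CHILD_SOURCE)

# The child failed, but the supervisor's data is untouched.
print("child exit status:", status)
print("supervisor routes:", routes)
```

The supervisor learns of the failure only through the exit status, which is the kind of narrow, well-defined interface the text refers to.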

The embedded system itself is broken up into two independent pieces, one of

which runs on a processor in the SCB or SSB and the other on a processor on

the individual FPCs. This structure makes it difficult for errors in one of these

components to corrupt the other, or to corrupt the routing system. Additionally, as

is the case for the routing system, the SCB / SSB and the FPC processors

provide more than enough computing power for the task so that failures due to

loaded conditions should be extremely rare. Neither the SCB / SSB nor the FPC

processor handles the data traffic to be switched. This means that the operating

conditions seen by the software span a much smaller dynamic range than a

system in which the CPUs are doing the switching, making the software much

easier to test and get right.

4.1.6 Hardware Errors

The packet-forwarding engine is built using state-of-the-art hardware that uses

conservative design rules to achieve high reliability. Perhaps the single most

important contributor to the reliability of the PFE is the fact that it is implemented

using a small number of extremely highly integrated CMOS circuits. Almost all the

improvements in the reliability of digital electronic systems over the last 30 years

can be attributed to the increased use of monolithic integrated circuits, and the

Mxxx exploits this fact to the maximum extent allowed by today’s technology. A

small handful of custom ASICs, high-volume SRAMs, DRAMs, and

microprocessors implement over 95% of the system’s functionality. This

approach results in a superlative MTBF for the Mxxx.

Most performance parameters of the PFE are deliberately over-engineered to

make it extremely unlikely for any kind of traffic to overwhelm the system. Shared

memory capacity is many times the strict minimum necessary and is pooled into a

single common resource to make it effectively even larger. Input and output

packet engines are sized for a minimum of twice the line rate to avoid any

problems with runs of short packets. The route lookup engine is also centralized

and is sized to be roughly four times faster than is called for by average packet

size.
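
The sizing argument can be made concrete with a short calculation. The line rate and packet sizes below are hypothetical values chosen for illustration, not figures from this manual, and framing overhead is ignored. The sketch shows why a lookup engine sized at four times the average-packet rate can keep up with a sustained run of packets one quarter the average size.

```python
def packets_per_second(line_rate_bps: float, packet_bytes: int) -> float:
    """Packets per second needed to fill a link with packets of one size."""
    return line_rate_bps / (packet_bytes * 8)


# Hypothetical figures, not taken from the manual.
LINE_RATE_BPS = 2.5e9
AVG_PACKET_BYTES = 256
MIN_PACKET_BYTES = 64

avg_pps = packets_per_second(LINE_RATE_BPS, AVG_PACKET_BYTES)
worst_pps = packets_per_second(LINE_RATE_BPS, MIN_PACKET_BYTES)

# Lookup engine sized at four times the average-packet rate, as in the text.
lookup_capacity = 4 * avg_pps

# Headroom >= 1.0 means the engine keeps up with a run of minimum-size packets.
headroom = lookup_capacity / worst_pps

print(f"average-size packets/s: {avg_pps:,.0f}")
print(f"minimum-size packets/s: {worst_pps:,.0f}")
print(f"lookup capacity/s:      {lookup_capacity:,.0f}")
print(f"headroom on short runs: {headroom:.2f}x")
```

With a different ratio of average to minimum packet size the headroom changes accordingly, so the factor of four is only meaningful relative to the traffic mix assumed.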

All signals that cross chip boundaries are checked for corruption with either

parity or a CRC, and all data stored in external memory is protected by either

ECC or parity. There is extensive internal consistency checking and logging built

into the ASICs. The system is designed for testability and provides full JTAG

support for boundary scan as well as full scan.
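
As a rough software illustration of the two checks named above, the sketch below computes an even parity bit over a word and a CRC over a byte string. The manual does not state which parity convention or CRC polynomial the hardware uses, so even parity and the CRC-32 from Python's `zlib` module are assumptions made purely for the example.

```python
import zlib


def even_parity_bit(word: int) -> int:
    """Return the bit that makes the total count of 1 bits even."""
    return bin(word).count("1") % 2


def parity_ok(word: int, parity: int) -> bool:
    """Check a received word against the parity bit sent with it."""
    return even_parity_bit(word) == parity


def crc_ok(payload: bytes, crc: int) -> bool:
    """Check a received payload against the CRC-32 sent with it."""
    return zlib.crc32(payload) == crc


# Parity catches any single-bit error in a word...
word = 0b10110010
parity = even_parity_bit(word)
one_bit_error = word ^ 0b00001000
print("single-bit error detected:", not parity_ok(one_bit_error, parity))

# ...but an even number of flipped bits slips through.
two_bit_error = word ^ 0b00000011
print("double-bit error detected:", not parity_ok(two_bit_error, parity))

# A CRC over a longer payload also detects the single-bit flip.
payload = b"route lookup result"
crc = zlib.crc32(payload)
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
print("CRC detects corruption:", not crc_ok(corrupted, crc))
```

The double-bit case is why parity suits narrow, short-lived signals while a CRC or ECC is the stronger choice for larger blocks of data.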

The core PFE system is fully synchronous, and uses time-tested digital design

practices for timing, clocking, and signal integrity. All timing and voltage

margining was done for the worst-case process, supply, and temperature corners

to ensure that the system will function reliably under the most marginal of

environmental conditions.

The PFE features redundant fans and power supplies so that the most commonly

occurring hardware failures do not take the system down. Either of the dual fan

trays is capable of cooling the system indefinitely, and the dual power supplies

are load-sharing, so the system can operate on either one of them.
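
The benefit of one-of-two redundancy can be estimated with the standard parallel-availability formula. The sketch below assumes the two units fail independently, which units sharing a chassis only approximate, and the per-unit availability is a hypothetical figure rather than a value from this manual.

```python
def one_of_two_availability(unit_availability: float) -> float:
    """Availability of a pair in which either unit alone can carry the load.

    Assumes independent failures: the pair is down only when both units
    are down at the same time.
    """
    unavailability = 1.0 - unit_availability
    return 1.0 - unavailability ** 2


# Hypothetical per-unit availability, not taken from the manual.
UNIT_AVAILABILITY = 0.999

pair = one_of_two_availability(UNIT_AVAILABILITY)
print(f"single unit:    {UNIT_AVAILABILITY:.6f}")
print(f"redundant pair: {pair:.6f}")
```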

The system architecture deliberately avoids the use of switching cards to reduce

the number of backplane connections for the sake of improved reliability. Since

connectors are amongst the most frequent causes of failure, halving the number

of connections makes a significant dent in the computed failure rate. In fact, the

failure rate for the machine improves by 400 FIT simply as a result of this

packaging choice. Furthermore, the M20 also avoids the use of extensive in-

system redundancy because this would increase complexity and potentially make

the machine less reliable.
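
To put the 400 FIT figure in perspective, recall that one FIT is one failure per billion device-hours, so MTBF in hours is 10^9 divided by the FIT rate. The manual does not give the machine's total failure rate, so the baseline below is a hypothetical value used only to show how the arithmetic works.

```python
HOURS_PER_YEAR = 24 * 365


def mtbf_hours(fit: float) -> float:
    """Convert a failure rate in FIT (failures per 1e9 hours) to MTBF in hours."""
    return 1e9 / fit


# Hypothetical total failure rate with the extra connectors; not from the manual.
baseline_fit = 5000.0
# Improvement quoted in the text.
connector_saving_fit = 400.0

improved_fit = baseline_fit - connector_saving_fit

for label, fit in (("with switching cards", baseline_fit),
                   ("without switching cards", improved_fit)):
    hours = mtbf_hours(fit)
    print(f"{label}: {fit:.0f} FIT, MTBF {hours:,.0f} h "
          f"({hours / HOURS_PER_YEAR:.1f} years)")
```

Because FIT rates of components in series simply add, removing connectors subtracts directly from the total, which is why the packaging choice shows up as a fixed FIT improvement.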