Cplant attempts to replicate the system support network employed by the Intel TFLOPS machine. Each TFLOPS cabinet contains an Intel 80386 processor that monitors the health of each node. Node failures are reported to the system so that the failed nodes can be dynamically configured out of the machine. This reliability, availability, and serviceability (RAS) network is critical to maintaining a stable computing platform.
The Cplant approach uses a set of system support nodes on a support network to perform diagnostics, maintenance, and administration of the machine. The support system is only accessible to administrators and is not visible to users. The following diagram illustrates a two-level system support hierarchy.
A level-0 system support station (SSS-0) is used to monitor and manage a scalable unit (SU), which is a collection of nodes. The SSS-0 is responsible for:
A minimal SSS-0 would have console lines to every node in the SU so that an administrator can log in and diagnose each node. However, this approach does not scale well. A more powerful SSS-0 would be able to access the consoles remotely, remotely power-cycle individual nodes, and remotely configure the system software.
To tie SUs together, the SSS-0s must be on a system support network, typically Ethernet. The SSS-0s are connected to a root node, a level-1 system support station (SSS-1). The responsibilities of the SSS-1 are similar to those of an SSS-0, except that an SSS-1 manages a collection of SSS-0s rather than an SU. The SSS-1 can serve as the single point of entry for managing the machine. Access to any node within any SU should be available from the top-level system support station. As the machine continues to grow and the number of SSS-0s increases, an additional level may be needed to manage multiple SSS-1s. This tree topology helps to lessen the effects of expansion on the administration and management of the machine.
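The two-level hierarchy can be pictured as a small tree. The following Python sketch is illustrative only and is not taken from the Cplant software: the class names `SSS0` and `SSS1`, the method `route_to`, the helper `build_example`, and all station and node names are hypothetical. It shows how a top-level station might resolve the path to any node in any SU.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SSS0:
    """Hypothetical level-0 support station managing one scalable unit (SU)."""
    name: str
    nodes: List[str] = field(default_factory=list)  # nodes in this SU


@dataclass
class SSS1:
    """Hypothetical level-1 support station: the root, managing SSS-0s."""
    name: str
    children: Dict[str, SSS0] = field(default_factory=dict)

    def add_su(self, sss0: SSS0) -> None:
        # Register an SSS-0 (and therefore its SU) under this root.
        self.children[sss0.name] = sss0

    def route_to(self, node: str) -> Optional[List[str]]:
        # Return the hops from the root to the node, or None if unknown.
        for sss0 in self.children.values():
            if node in sss0.nodes:
                return [self.name, sss0.name, node]
        return None


def build_example() -> SSS1:
    # Made-up names: one SSS-1 root, two SUs with two nodes each.
    root = SSS1("sss1")
    root.add_su(SSS0("sss0-a", ["n0", "n1"]))
    root.add_su(SSS0("sss0-b", ["n2", "n3"]))
    return root
```

With these made-up names, `build_example().route_to("n2")` returns `["sss1", "sss0-b", "n2"]`. Adding another level to manage multiple SSS-1s would amount to adding one more class of the same shape above `SSS1`, which is the sense in which the tree limits the impact of growth.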
See the different platform phases of Cplant for more information on how each realizes the system support infrastructure.