The Sandia/Intel ASCI-red TFLOPS machine has proven to be one of the more technically successful efforts in massively parallel, high-performance computing. However, large MPP systems have drawbacks. Among these are:
While cluster-based projects have firmly established a foundation upon which small- and medium-scale clusters can be based, the current state of cluster technology does not support scaling to the level of compute performance, usability, and reliability of large MPP systems. In contrast, large-scale MPP systems have addressed the problems related to scalability, but are limited by their use of custom components.
In order to scale clusters to thousands of nodes, the following must be addressed:
Computational Plants build on the design of the TFLOPS system, from which several key concepts can be derived:
A key concept in the Cplant is the ability to grow and be pruned. Each year, a new phase, or branch, will be added to the plant to increase its capability with the latest cost-effective hardware components. Older, possibly obsolete hardware will be pruned from the plant after three years. To realize this grow-and-prune strategy, the plant must be designed from building blocks that enable the machine to adapt to change.
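The grow-and-prune cycle can be illustrated with a minimal sketch. The `Phase` and `Plant` classes, the node counts, and the reading of "after three years" as "once a phase is three or more years old" are all assumptions made for illustration; none of them come from the Cplant design itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# From the text: hardware is pruned after three years. Interpreting this as
# "age >= 3 years" is an assumption of this sketch.
PRUNE_AGE_YEARS = 3


@dataclass
class Phase:
    """One yearly branch of the plant (hypothetical model)."""
    year_added: int
    nodes: int  # illustrative node count, not a real Cplant figure


@dataclass
class Plant:
    """A plant is just the set of phases currently in service."""
    phases: list[Phase] = field(default_factory=list)

    def grow(self, year: int, nodes: int) -> None:
        """Add this year's phase to the plant."""
        self.phases.append(Phase(year, nodes))

    def prune(self, year: int) -> list[Phase]:
        """Remove and return every phase that has reached the prune age."""
        removed = [p for p in self.phases
                   if year - p.year_added >= PRUNE_AGE_YEARS]
        self.phases = [p for p in self.phases
                       if year - p.year_added < PRUNE_AGE_YEARS]
        return removed

    def total_nodes(self) -> int:
        """Total node count across all phases still in service."""
        return sum(p.nodes for p in self.phases)
```

Calling `prune(year)` and then `grow(year, nodes)` once per year keeps at most three phases in service at any time under these assumptions, so capacity tracks the most recent hardware purchases.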
Scalable Units (SUs) are the basic building blocks that provide computing resources to the Cplant. The definition of a scalable unit is intended to be as non-specific as possible in order to allow a variety of vendors to supply components that meet the criteria. The use of the partition model of resource provision allows scalable units to provide a variety of types of resources, such as service, compute, I/O, and network. Most of the Cplant resources will be compute resources that can service distributed memory programs and, at a minimum, run an MPI process. At least one scalable unit must provide a service capability as a direct interface to users, from which jobs can be started, monitored, and debugged. Scalable units can also provide specialized resources such as enhanced secondary or tertiary storage and enhanced network capability. A specialized resource may be available to applications running within a compute partition and/or to users through a service partition. Scalable units must also respond to the queries and commands defined for the support system. It is also desirable that a scalable unit allow for remote power-cycling to install or update any software on the system.
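The requirements above can be sketched as a small data model with a configuration check. The names `Resource`, `ScalableUnit`, and `validate_plant` are hypothetical, and the check covers only the one hard rule stated in the text (at least one unit must provide a service capability) plus a sanity check that every unit offers something.

```python
from dataclasses import dataclass
from enum import Enum


class Resource(Enum):
    """Resource types named in the text for the partition model."""
    SERVICE = "service"
    COMPUTE = "compute"
    IO = "io"
    NETWORK = "network"


@dataclass(frozen=True)
class ScalableUnit:
    """Hypothetical description of one scalable unit."""
    name: str
    resources: frozenset  # set of Resource values this unit provides
    remote_power: bool = False  # desirable per the text, not required


def validate_plant(units):
    """Return a list of problems found in a plant configuration."""
    problems = []
    # The text requires at least one unit with a service capability.
    if not any(Resource.SERVICE in u.resources for u in units):
        problems.append("no scalable unit provides a service partition")
    # Assumption of this sketch: a unit with no resources is a mistake.
    for u in units:
        if not u.resources:
            problems.append(f"{u.name}: provides no resources")
    return problems
```

Keeping the unit description this thin mirrors the stated goal of a non-specific definition: a vendor's hardware qualifies by what it provides, not by how it is built.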
The primary purpose of the support system is to bond scalable units together. Most of the functionality of the support system consists of querying and controlling scalable units. Typically, the support system will encompass a superset of the functionality of any single scalable unit, since it must support all scalable units. The support system has two logical pieces: a system support network and a (usually pre-existing) local infrastructure.
The boundary between the support system and the scalable unit is somewhat blurred. If scalable units do not provide sufficient support for control and query, then the support system may be extended to include a system support station (sss) that is "attached" to the scalable unit. In this way, the support system grows in a scalable manner with the number of scalable units, but the scalable units do not have to include support system functions by design.
The system support network provides the ability to configure, customize, monitor, maintain, and control the entire Cplant. The simplest hardware realization of a system support network might be a console and keyboard with physical connections to every component of the system. This is, however, not a scalable solution. A minimal hardware realization must include a special component within each scalable unit that serves as a proxy for all interaction with the scalable unit, a system console that serves as the point of entry for system administrators, and a network connecting scalable unit components to the console. In general, some sort of broadcast medium is preferable. Fault tolerance through multiple paths in the network and multiple consoles is also desirable.
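The minimal realization described above (one proxy per scalable unit, a console, and a network between them) can be sketched in-process as follows. This is an illustration only: `SUProxy`, `Console`, and the `"status"` and `"power-cycle"` requests are hypothetical names, the network is replaced by direct method calls, and fault tolerance through multiple paths or consoles is not modeled.

```python
class SUProxy:
    """Hypothetical proxy: the single point of contact for one scalable unit."""

    def __init__(self, unit_name, components):
        self.unit_name = unit_name
        # Maps component name -> status string (illustrative values only).
        self._components = dict(components)

    def handle(self, request):
        """Answer a query or carry out a command on behalf of the unit."""
        if request == "status":
            return dict(self._components)
        if request == "power-cycle":
            # Mark every component in the unit as restarting.
            self._components = {name: "booting" for name in self._components}
            return {"ok": True}
        raise ValueError(f"unknown request: {request!r}")


class Console:
    """Hypothetical system console: the administrators' point of entry."""

    def __init__(self):
        self._proxies = {}

    def attach(self, proxy):
        """Register the proxy for one scalable unit."""
        self._proxies[proxy.unit_name] = proxy

    def send(self, unit_name, request):
        """Send a request to a single scalable unit."""
        return self._proxies[unit_name].handle(request)

    def broadcast(self, request):
        """Send the same request to every attached unit and collect replies."""
        return {name: proxy.handle(request)
                for name, proxy in self._proxies.items()}
```

Because the console talks only to proxies, adding a scalable unit means attaching one more proxy rather than wiring every component to the console, which is the scalability argument made in the paragraph above.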