Cplant Architecture

Architecture

Cplant uses the partition model developed on Sandia's 1890-node Intel Paragon and Intel TeraFLOPS machines. The system treats all of the nodes on the high-speed network as a single pool. Administrators can divide the machine into several functional partitions: service, compute, file I/O, and network I/O. On the Paragon and TFLOPS, these partitions run different operating systems and are relatively difficult to reconfigure. On Cplant, all of the partitions run Linux, and kernel modules adapt the operating system to the function of each partition. Because the personality of a node can be changed dynamically by unloading and loading kernel modules, reconfiguration is much simpler. The basic partitions are described below. See this paper for more detailed information about the partition model.
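The reconfiguration idea above can be sketched as a small model. This is an illustration only, not Cplant code: the module names, the `PERSONALITY_MODULES` table, and the `reconfigure` helper are all hypothetical. The text states only that kernel modules are unloaded and loaded to change a node's role.

```python
# Hypothetical mapping from partition to the kernel modules that give a
# node that personality. All module names are invented for illustration.
PERSONALITY_MODULES = {
    "service": ["fast_messaging", "job_launcher"],
    "compute": ["fast_messaging"],
    "file_io": ["fast_messaging", "parallel_fs_server"],
    "network_io": ["fast_messaging", "external_net_bridge"],
}


def reconfigure(current, target):
    """Return (modules to unload, modules to load) needed to move a node
    from the `current` partition to the `target` partition."""
    old = PERSONALITY_MODULES[current]
    new = PERSONALITY_MODULES[target]
    # Unload in reverse load order, keeping modules the target still needs.
    to_unload = [m for m in reversed(old) if m not in new]
    # Load only the modules that are not already present.
    to_load = [m for m in new if m not in old]
    return to_unload, to_load


if __name__ == "__main__":
    unload, load = reconfigure("compute", "file_io")
    print("unload:", unload)
    print("load:", load)
```

In this sketch, moving a node between partitions touches only the modules that differ, which is one way to picture why a common Linux base makes reconfiguration cheaper than switching operating systems.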

Partition model

Service

The service partition provides the services that allow users to interact with the machine. Users who have logged into a node in the service partition can launch parallel programs, provide input, receive output, debug, and monitor performance. Nodes in the service partition are typically configured with the features of a standard workstation. In addition, the tools needed to support parallel programs, such as debuggers and performance monitors, are available.

Compute

The compute partition provides compute cycles to applications. When a parallel job is launched, the individual processes run on compute nodes, one process per node. All of the resources of an individual compute node, including compute cycles, memory, and network, are dedicated to that process. The compute partition typically runs a high-performance operating system that helps dedicate the node's resources entirely to the application process. The compute partition can be accessed only through services provided by the service partition. Direct user access is not permitted.
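The one-process-per-node rule can be illustrated with a minimal placement sketch. The `place_job` function and the integer node IDs are hypothetical and are not taken from the Cplant runtime; the sketch shows only that a job of N processes consumes N whole compute nodes.

```python
def place_job(free_nodes, num_procs):
    """Assign each process rank its own compute node.

    Returns (placement, remaining_free_nodes). Raises ValueError when the
    free pool is too small, because a node is never shared between processes.
    """
    if num_procs > len(free_nodes):
        raise ValueError("not enough free compute nodes")
    chosen = free_nodes[:num_procs]
    # One rank per node: the mapping is one-to-one by construction.
    placement = {rank: node for rank, node in enumerate(chosen)}
    return placement, free_nodes[num_procs:]


if __name__ == "__main__":
    placement, remaining = place_job([10, 11, 12, 13], 3)
    print(placement)
    print(remaining)
```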

File I/O

The file I/O partition provides a high-performance parallel filesystem to parallel jobs. Typically, compute nodes are diskless, and secondary storage needs are satisfied through the file I/O partition. Application processes running in the compute partition move I/O data across the high-speed network to nodes in the file I/O partition, which then write the data to disk. This partition can be accessed only through services in the service partition, such as parallel FTP, or through application processes in the compute partition via a parallel I/O API, such as Intel's Parallel File System (PFS) or the I/O functions of MPI-2.

Network I/O

The network I/O partition provides a means of moving data off the machine to other networked resources, such as a visualization component. Network I/O partition nodes contain additional network interfaces, such as ATM or Gigabit Ethernet, for moving data from the compute partition to external sites. This partition can be accessed only by services in the service partition or through application processes in the compute partition via a standard communication API, such as UNIX sockets or MPI.
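The access rules stated for the partitions can be summarized in a small lookup table. This is a sketch of the rules as described in the text, not an actual Cplant mechanism: the `ALLOWED_ORIGINS` table, the `can_access` helper, and the key names are hypothetical.

```python
# Which origins may reach each partition, following the descriptions above.
# "user" stands for a person logging in; the other origins are partitions.
ALLOWED_ORIGINS = {
    "service": {"user"},                    # users log into service nodes
    "compute": {"service"},                 # no direct user access
    "file_io": {"service", "compute"},      # e.g. parallel FTP or a parallel I/O API
    "network_io": {"service", "compute"},   # e.g. sockets or MPI
}


def can_access(origin, partition):
    """Return True if `origin` is allowed to reach `partition`."""
    return origin in ALLOWED_ORIGINS[partition]


if __name__ == "__main__":
    for origin in ("user", "service", "compute"):
        reachable = [p for p in ALLOWED_ORIGINS if can_access(origin, p)]
        print(origin, "->", reachable)
```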