Coordinated Computing: Defining the Third Domain

Thomas Sterling
Center for Computation and Technology
Louisiana State University

Center for Advanced Computing Research
California Institute of Technology

Computer Science and Mathematics Division
Oak Ridge National Laboratory

A vast amount of scientific and technical computing may be characterized as “capacity” computing, also referred to as “throughput” computing. Conventionally, the term “capability” computing is reserved for more tightly coupled supercomputing. Yet this second domain is blurry, with few vendors willing to admit that their system offerings are largely ensembles of commodity parts interconnected by medium-bandwidth system area networks. Custom capability architectures, if optimized for the task of near-fine-grain parallel processing, incorporate mechanisms to lower overheads and to compensate for latency, embody the semantics of parallel execution, and employ a high-bandwidth, low-latency communication fabric to minimize wasted cycles. Most systems today do not provide these functions but are nonetheless referred to as capability machines, even as they deliver single-digit floating-point efficiencies on important mainstream computational problems. Capacity machines do not need any of these global attributes when optimized for throughput workloads. Yet many systems, such as commodity clusters or MPPs built from COTS microprocessors and DRAMs, are applied to single problems, usually by means of message-passing libraries, and this strategy has proven very successful. It is clear, then, that the distinction between capability and capacity computing is inadequate to describe what we really do. Something between the two, which recognizes the reality rather than the hype of conventional practice, is required. This talk will present such an alternative: “Coordinated Computing”, a third domain positioned between the other two, which describes the use of communicating sequential processes on ensembles of commodity components, thereby reserving the former terms, “capacity” and “capability”, for the operational ranges to which they are optimally suited.

Modified on: February 13, 2006