Center for Advanced Computing Research
Computer Science and Mathematics Division

A vast amount of scientific and technical computing may be characterized
as “capacity” computing, also referred to as “throughput” computing.
Conventionally, "capability" computing is reserved for more
tightly coupled supercomputing. Yet this second domain is blurry, with
few vendors willing to admit that their system offerings are largely
ensembles of commodity parts interconnected by medium-bandwidth system
area networks. Custom capability architectures, if optimized for the task
of near fine-grain parallel processing, incorporate mechanisms to lower
overheads, additional mechanisms to compensate for latency, embody the
semantics of parallel execution, and employ a high-bandwidth, low-latency
communication fabric to minimize wasted cycles. Most systems today do
not provide these functions but nonetheless are referred to as capability
machines, even as they deliver single-digit floating-point efficiencies
on important mainstream computational problems. Capacity machines do
not need any of these global attributes when optimized for throughput
workloads. Yet many systems, such as commodity clusters or MPPs built from COTS microprocessors
and DRAMs, are applied to single problems, usually by means of message-passing
libraries, and this strategy has proven very successful. It
is clear, then, that the distinction of capability versus capacity computing
is inadequate to describe what we really do. Something between the two
that recognizes the reality rather than the hype of conventional practices
is required. This talk will present such an alternative: "Coordinated
Computing", a third domain, positioned between these
two, that describes the use of communicating sequential processes on ensembles
of commodity components, thereby reserving the former terms, "capacity" and "capability",
for the operational ranges to which they are optimally suited.
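
To make the phrase "communicating sequential processes on ensembles of commodity components" concrete, the following is a minimal sketch (not part of the talk itself) of the message-passing style the abstract refers to, assuming MPI as the message-passing library; the per-process workload, a trivial partial summation, is illustrative only.

/*
 * Minimal sketch: each MPI rank runs as an independent sequential
 * process on its own commodity node and coordinates with the others
 * only through explicit messages. The local work (summing a strided
 * block of values) is a placeholder for a real per-process workload.
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process computes a purely local partial result. */
    double local = 0.0;
    for (int i = rank; i < 1000; i += size)
        local += (double)i;

    /* Coordination happens only at this explicit message-passing step. */
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %f (from %d processes)\n", total, size);

    MPI_Finalize();
    return 0;
}

In this style, global attributes such as low-overhead fine-grain synchronization are not required of the hardware; correctness depends only on the explicit messages, which is why such ensembles succeed on single problems despite lacking the mechanisms of a custom capability architecture.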