Complexity Issues in Data Intensive High End Computing
Panel Chair: Dr. Garth Gibson, Associate Professor, CMU, and CTO, Panasas, USA
Take it as a given
that computational speeds are going up and that this will lead to more
data being created, accessed and managed. So what is
the big deal? Clusters get bigger, applications get bigger, so why would
storage getting bigger be any harder? Could it be the 9s? Keeping every
byte of tera- and increasingly petabyte-scale stores available to all nodes
with good performance for all but minutes a year, when file and volume
services are themselves parallel applications running on the storage servers,
might be a higher standard than compute nodes are held to.
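For calibration, each additional nine of availability divides the allowed downtime by ten; the short sketch below (Python, illustrative figures only) shows how quickly "all but minutes a year" pushes the requirement toward five nines:

    # Illustrative only: downtime budget per year at a given number of nines.
    MINUTES_PER_YEAR = 365.25 * 24 * 60          # about 525,960 minutes

    for nines in range(3, 6):
        availability = 1.0 - 10.0 ** -nines      # e.g. 3 nines = 0.999
        downtime = MINUTES_PER_YEAR * (1.0 - availability)
        print(f"{nines} nines: ~{downtime:.1f} minutes of downtime per year")
    # 3 nines ~ 526 min/yr, 4 nines ~ 53 min/yr, 5 nines ~ 5.3 min/yr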
Or perhaps it is deeper and deeper write-behind and read-ahead, and more
and more concurrency, needed to build the ever larger contiguous blocks
that minimize seeks across ever wider storage striping.
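A back-of-the-envelope sketch makes that striping pressure concrete; the 10 ms seek, 100 MB/s per-disk rate, and 64 MB request below are assumed round numbers, not measurements:

    # Illustrative only: share of a disk's time spent transferring data as one
    # fixed-size request is striped across more and more disks.
    SEEK_MS = 10.0           # assumed average seek + rotational delay
    DISK_MB_PER_S = 100.0    # assumed sustained per-disk transfer rate
    REQUEST_MB = 64.0        # assumed contiguous client request size

    for width in (1, 4, 16, 64):
        per_disk_mb = REQUEST_MB / width
        transfer_ms = per_disk_mb / DISK_MB_PER_S * 1000.0
        efficiency = transfer_ms / (transfer_ms + SEEK_MS)
        print(f"stripe width {width:3d}: {per_disk_mb:6.2f} MB per disk, "
              f"{efficiency:.0%} of disk time spent transferring")
    # Wider stripes shrink each disk's contiguous chunk, so either requests
    # (and write-behind/read-ahead depth) must grow, or seeks dominate.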
Or maybe Amdahl's law is hitting us with the need to parallelize more and
more of the metadata work, which has in the past been kept serial and
synchronous for correctness and for the simplicity of its error-handling code.
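Amdahl's law makes that metadata concern easy to quantify; in the sketch below the 5% serial fraction is an assumed figure for illustration only:

    # Illustrative only: Amdahl's law with an assumed serial (metadata) fraction.
    SERIAL_FRACTION = 0.05   # assumed share of work left serial and synchronous

    def speedup(servers, serial=SERIAL_FRACTION):
        # Classic Amdahl's law: 1 / (serial + (1 - serial) / N)
        return 1.0 / (serial + (1.0 - serial) / servers)

    for n in (8, 64, 512, 4096):
        print(f"{n:5d} servers -> {speedup(n):5.1f}x speedup "
              f"(asymptotic limit {1.0 / SERIAL_FRACTION:.0f}x)")
    # With 5% of the work serial, no amount of hardware exceeds a 20x speedup,
    # which is why the serial, synchronous metadata path itself must be parallelized.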
Or maybe parallel file system developers have inadequate development tools
in comparison to parallel application writers. Or perhaps storage system
developers are just wimps.
This panel will try to shed some heat and light on these concerns.
Questions for panel members:
- Bandwidth: In the next decade, is the bandwidth transferred into or out of one "high end computing file system" (a) going down 10X or more, (b) staying about the same, (c) going up 10X or more, or (d) "your answer here", as a result of the expected increase in computational speed in its client clusters/MPPs, and why?
- Spindle Count: In the next decade, is the number of magnetic disks in one "high end computing file system" (a) going down 10X or more, (b) staying about the same, (c) going up 10X or more, or (d) "your answer here", as a result of the expected increase in computational speed in its client clusters/MPPs, and why?
- Concurrency: In the next decade, is the number of concurrent streams of requests applied to one "high end computing file system" (a) going down 10X or more, (b) staying about the same, (c) going up 10X or more, or (d) "your answer here", as a result of the expected increase in concurrency in client clusters/MPPs, and why?
- Seek Efficiency: In the next decade, is the number of bytes moved per magnetic disk seek in one "high end computing file system" (a) going down 10X or more, (b) staying about the same, (c) going up 10X or more, or (d) "your answer here", as a result of the expected increase in computational speed in its client clusters/MPPs, and why?
- Failures: In the next decade, is the number of independent failure domains in one "high end computing file system" (a) going down 10X or more, (b) staying about the same, (c) going up 10X or more, or (d) "your answer here", and why?
- Coping with Complexity: If you have answered (c) one or more times, please explain why these large increases will not significantly increase the complexity of storage software. Are you relying on the development of any currently insufficient technologies, and if so, which?
- Development Time Trends: If complexity is increasing in high end computing file systems, is the time and effort required to achieve acceptable 9s of availability at speed (a) going down 10X or more, (b) staying about the same, (c) going up 10X or more, or (d) "your answer here", and why? Are you relying on the development of any currently insufficient technologies, and if so, which?