Title: Using Memory Traffic Predictions to Estimate the Runtime of Linear
Algebra Kernels

Speaker: Ian Karlin, University of Colorado at Boulder

Date/Time: Tuesday, September 21, 2010, 10:30 am

Location: CSRI Building/Room 90 (Sandia NM)

Brief Abstract: Data movement limits the performance of many scientific computing applications. For these programs, runtimes are most accurately expressed in terms of memory traffic, not floating-point operations.  To significantly improve the performance of these applications, memory traffic must be reduced.  Tuning techniques such as loop fusion decrease data movement, often producing speedups proportional to the resulting reduction in memory accesses. However, loop fusion sometimes decreases performance by causing capacity misses in caches and registers. Whether fusion causes misses depends on hardware and routine characteristics. Finding the optimal amount of fusion, which requires trying all possible fusion strategies for all sizes of interest, is often infeasible.
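As a rough illustration of why fusion reduces data movement, the sketch below counts the words moved for the pair of products y = A x and z = A^T w, first as two separate loops and then as one fused loop. The kernel choice, the function names, and the assumption that the vectors stay in cache while the n x n matrix A does not are all illustrative; none of them come from the talk.

```python
def unfused_traffic(n: int) -> int:
    """Words moved when y = A x and z = A^T w run as two separate loops.

    Assumes (hypothetically) that A is too large to stay in cache, so it is
    streamed from memory once per loop, and that each vector moves once.
    """
    matrix_reads = 2 * n * n   # A is read in full by each of the two loops
    vector_traffic = 4 * n     # read x and w, write y and z
    return matrix_reads + vector_traffic


def fused_traffic(n: int) -> int:
    """Words moved when both products share a single pass over A."""
    matrix_reads = n * n       # each element of A is read once and used twice
    vector_traffic = 4 * n
    return matrix_reads + vector_traffic


def fused_kernel(A, x, w):
    """Compute y = A x and z = A^T w in one sweep over the rows of A."""
    n = len(A)
    y = [0.0] * n
    z = [0.0] * n
    for i in range(n):
        row = A[i]
        acc = 0.0
        for j in range(n):
            a = row[j]           # loaded once, reused for both products
            acc += a * x[j]      # contribution to y[i]
            z[j] += a * w[i]     # contribution to z[j]
        y[i] = acc
    return y, z
```

Under these assumptions the fused version moves roughly half as many words for large n, which is the kind of reduction the abstract ties to speedup. The count ignores the capacity misses that fusion can introduce, which is exactly the effect that makes choosing a fusion strategy hard.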

In this talk, we present a memory model that predicts the performance of linear algebra kernels. We include only the most distinguishing machine and routine features, allowing for an economical comparison while maintaining accuracy.  The model works in two phases. First, it calculates the amount of data accessed from each memory structure during a computation.  It then turns these memory traffic predictions into runtime estimates, whose accuracy we establish for both single-processor and shared-memory parallel systems. Our model is integrated into a compilation framework, where its runtime estimates reduce the large number of versions of a routine to a collection small enough to test in practice.  Through the use of the model, compile time is greatly reduced without sacrificing routine speed.
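The two-phase structure can be sketched in a few lines. This is a deliberately crude stand-in, not the speaker's model: the `MemoryLevel` type, the rule that all traffic is served by the smallest level able to hold the working set, and the idea of dividing bytes by a per-level bandwidth are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoryLevel:
    """One level of a hypothetical memory hierarchy."""
    name: str
    capacity_bytes: float
    bandwidth_bytes_per_s: float


def predict_traffic(working_set_bytes, total_bytes_accessed, levels):
    """Phase 1: estimate how many bytes are served by each level.

    Assumed rule: every access is charged to the smallest level whose
    capacity holds the working set. `levels` is ordered smallest to
    largest, and the last level is treated as the fallback.
    """
    traffic = {level.name: 0.0 for level in levels}
    for level in levels:
        if working_set_bytes <= level.capacity_bytes:
            traffic[level.name] = float(total_bytes_accessed)
            return traffic
    traffic[levels[-1].name] = float(total_bytes_accessed)
    return traffic


def estimate_runtime(traffic, levels):
    """Phase 2: convert per-level traffic into seconds via bandwidth."""
    return sum(traffic[lv.name] / lv.bandwidth_bytes_per_s for lv in levels)


# Hypothetical machine; the capacities and bandwidths are made-up numbers.
LEVELS = [
    MemoryLevel("L1", 32 * 1024, 100e9),
    MemoryLevel("L2", 256 * 1024, 50e9),
    MemoryLevel("DRAM", float("inf"), 10e9),
]
```

A framework could compute such an estimate for each candidate version of a routine and keep only the most promising few for real timing runs, which is the pruning role the abstract describes for the model.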

CSRI POC: Scott Collis, 505-284-1123


