Graph algorithms and applications have been central to my research for many years. But until relatively recently, I paid no attention to computer architecture. Many of the graph problems that arise in scientific computing have special structure associated with the underlying 2- or 3-dimensional world the science models. These graphs often have good separators due to geometric locality, and bounded degrees. But as my interests have broadened to include informatics and data mining, I am increasingly working with graphs that that have very little exploitable structure.
Algorithms on these highly unstructured graphs are quite challenging to parallelize on traditional distributed memory machines. They typically lack good partitions, thereby requiring lots of communication and also very large data structures for ghost nodes. They may have very high degree nodes which lead to load balancing challenges, and their lack of locality leads to terrible performance at all levels of the memory heirarchy. Instead, I have become a fan of massive multithreading as a means to tolerate latency and thereby achieve high performance even on these challenging applications. These issues are addressed in the following papers.