Title: High-Performance Computing for Massive Data Analysis 

Speaker: Kamesh Madduri, Georgia Institute of Technology

Date/Time: Wednesday, April 9, 2008 at 9:00am - 10:00am

Location: CSRI Building, Room 90 (Sandia NM)

Brief Abstract: Graph-theoretic abstractions are at the core of data-intensive problems arising in social and technological network analysis (e.g., identification of implicit online communities, viral marketing strategies, quantifying centrality and influence in interaction networks, web algorithms), systems biology (e.g., interactome analysis, epidemiological studies, disease modeling), and homeland security (e.g., detecting trends and anomalous patterns in socio-economic interactions and communication data). Due to their large memory footprint, fine-grained computational granularity, and low degrees of spatial locality, massive graph problems pose serious challenges on current parallel machines. In this talk, we present novel algorithmic approaches for enabling large-scale graph analysis. Our shared-memory implementations on high-end multicore servers from IBM and Sun, and on massively multithreaded architectures such as the Cray XMT, are the first known to achieve significant parallel speedup for traversal, connectivity, and centrality problems on graph instances on the order of billions of vertices and edges. We also present SNAP (Small-world Network Analysis and Partitioning), an open-source parallel graph framework that we have designed for exploratory network analysis. With a combination of algorithm engineering and novel parallel graph algorithms in SNAP, we achieve a speedup of nearly two orders of magnitude over current state-of-the-art approaches for community identification and centrality in real-world networks.

Bio: Kamesh Madduri is a PhD candidate in the Computational Science and Engineering division at Georgia Institute of Technology. He is advised by Prof. David A. Bader, and his research interests include high-performance computing, parallel algorithms, and software support for large-scale data analysis and scientific applications. Kamesh is currently a NASA graduate student fellow, has received the 2008 Outstanding Graduate Research Assistant award from the College of Computing at Georgia Tech, and was awarded honorable mentions by the ACM/IEEE High Performance Computing PhD fellowship committee in 2007 and the NSF graduate research fellowship program in 2005. His dissertation research has been selected for presentation at doctoral colloquia events at Supercomputing 2007 and IPDPS 2008. Kamesh received his undergraduate degree in 2004 from the Indian Institute of Technology Madras.

CSRI POC: Scott Collis, 1416, 284-1123 


