Title: (Part I) Parallel symbolic analysis to enhance memory scalability of SuperLU (Part II) EigAdept - A framework to build expert eigensolver toolbox
Speaker: Dr. Xiaoye (Sherry) Li, Lawrence Berkeley Lab
Date/Time: Thursday, October 4, 2007
Location: CSRI Building, Room 90
Brief Abstract: Part I: We present the design, implementation, and results of a memory-scalable parallel symbolic factorization algorithm for sparse LU. We apply graph partitioning to the graph of A + A^T to partition and reorder the matrix. The partitioning yields a so-called separator tree, which exposes the dependencies among the computations. We use the separator tree to distribute the input matrix over the processors via a subtree-to-subprocessor mapping at the coarse-grain level and a block-cyclic layout at the fine-grain level. For large matrices, the parallel algorithm significantly reduces the memory requirement of the symbolic factorization phase, as well as that of the entire SuperLU solver. The maximum per-processor memory footprint is reduced up to 5-fold on 256 processors. The runtime, already relatively small for the sequential algorithm, is reduced further.
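The two-level distribution described in the abstract can be illustrated with a small sketch. This is not the SuperLU implementation: the `SepNode` class, the helper names, the halving rule for splitting processors between children, and the 1D block-cyclic assignment of a separator's columns are all assumptions made for illustration. The sketch assigns a group of processors to each node of a binary separator tree (coarse grain), then spreads each node's columns block-cyclically over that node's processors (fine grain).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SepNode:
    """Hypothetical separator-tree node; `columns` are the matrix columns it holds."""
    columns: List[int]
    left: Optional["SepNode"] = None
    right: Optional["SepNode"] = None
    procs: List[int] = field(default_factory=list)


def assign_procs(node: Optional[SepNode], procs: List[int]) -> None:
    """Coarse grain: a node gets all of `procs`; its children split them in half.

    Assumed rule: once a single processor remains, the whole subtree stays on it.
    """
    if node is None:
        return
    node.procs = list(procs)
    if len(procs) == 1:
        assign_procs(node.left, procs)
        assign_procs(node.right, procs)
        return
    half = len(procs) // 2
    assign_procs(node.left, procs[:half])
    assign_procs(node.right, procs[half:])


def block_cyclic_owner(node: SepNode, block_size: int = 1) -> Dict[int, int]:
    """Fine grain: deal the node's columns out in blocks, cycling over node.procs."""
    return {
        col: node.procs[(k // block_size) % len(node.procs)]
        for k, col in enumerate(node.columns)
    }


# Toy separator tree for a 14-column matrix in nested-dissection order:
# leaves are numbered first, then their separators, with the root separator last.
root = SepNode(
    [12, 13],
    left=SepNode([4, 5], SepNode([0, 1]), SepNode([2, 3])),
    right=SepNode([10, 11], SepNode([6, 7]), SepNode([8, 9])),
)
assign_procs(root, [0, 1, 2, 3])

for name, node in [("root", root), ("left", root.left), ("right", root.right)]:
    print(name, node.procs, block_cyclic_owner(node))
```

In this toy layout each leaf subtree lives entirely on one processor, so only the separators near the root are shared among several processors. That locality is what allows the per-processor storage to shrink as processors are added.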