Title: Multilevel Algorithms for Clustering Documents and Large-scale Proteomic Networks

Speaker: Sang-Cheol Seok, University of Iowa

Date/Time: Monday, March 5, 2007, 9:15am – 10:15am

Location: CSRI Building/Room 95 (Sandia NM)

Brief Abstract: Multilevel algorithms have been used for solving partial differential equations [1, 4, 2] and also for partitioning graphs [3, 5]. Multilevel clustering algorithms, like multilevel partitioning algorithms, try to improve existing clustering algorithms by downsizing the original graphs, which is called coarsening. Multilevel algorithms not only improve qualities of the clustering but also significantly reduce computation time. A clustering method is applied to this smaller graph, and the partition is transferred to the original graph. This idea is very useful because smaller matrices or graphs require much less time to cluster. The process of constructing the smaller matrix is called coarsening, and the process of transferring the partition is called decoarsening. My talk focuses on two applications where multilevel algorithms improve clustering algorithms. One is developing faster and more accurate multilevel clustering algorithms for groups of documents.  And the other is exploiting multilevel algorithms for identifying protein complexes in the proteomic networks. Graph theory is commonly used as a method for analyzing these applications. Each node represents an object, and edges correspond to the relationship between two objects connected by the edge.  One common property of graphs modeling in applications is that nodes do not have weights. However, the graphs are very different from each other. Document clustering algorithms work on high density graphs where edge weights distributes between 0 and 1. Whereas, proteomic networks are quite spare and edges are not weighted.

References
[1] A. Brandt. Multi-level adaptive solutions to boundary value problems. Math.Comp., 31:333{390, 1977.
[2] W. Briggs. A Multigrid Tutorial. SIAM, Philadelphia, 1987.
[3] B. Hendrickson and R. Leland. A multilevel algorithm for partitioning graphs. Technical Report SAND93-1301, Sandia National Laboratories, 1993.1
[4] D. Jespersen. Multigrid methods for partial differential equations. Studies in Numerical Analysis, 24, 1984.
[5] George Karypis and Vipin Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput., 20(1):359{392, 1998.

CSRI POC: Erik Boman, (505) 844-2003


©2005 Sandia Corporation | Privacy and Security | Maintained by Bernadette Watts and Deanna Ceballos