Title: Domain Independent Hash-based Delayed Duplicate Detection for External Memory Search of Large Graphs        

Speaker: Peter Lamborn, Dept of Science and Eng, Mississippi State University

Date/Time: Monday, January 12, 2009, 3:30 – 4:30 pm

Location: CSRI Building/Room 279

Brief Abstract: Graph search has been applied in many areas including artificial intelligence, scheduling, web crawlers, and verification. Graphs suffer from state space explosion because the size of the graph grows exponentially with the number of variables in the state description.   Employing external memory is one method that allows for the search of graphs that exceed the size of RAM. Using external memory requires modifications to the search process. Because disk is accessed most efficiently in a sequential manner it is worthwhile to postpone duplicate elimination until a large batch can be processed. This postponement is called delayed duplicate detection (DDD). There are several ways to perform DDD including brute force comparison, sorting-based and hash based. Hash-based DDD is the most time and space efficient. Hash-based DDD allows heterogeneous parallelization of the state generation and DDD processes. Importantly the DDD can be performed ‘early,’ which saves time and space. Hash-based DDD works by dividing the graph into buckets with the requirement that each bucket fit into RAM. The only previous usage of hash-based DDD required a domain dependent handcrafted hash function to ensure buckets did not exceed RAM. In order to limit bucket size in a domain independent manner we invented a tree based hash function. A tree-based hash function simplifies the process of repartitioning the buckets. A specific bucket can be split while all others remain unmodified. Targeted bucket splitting allows us to dynamically repartition the buckets as the search progresses. Dynamic repartition in gallows us to use hash-based DDD without a domain dependent handcrafted hash function.

CSRI POC:



©2005 Sandia Corporation | Privacy and Security | Maintained by Bernadette Watts and Deanna Ceballos