The implementation of Recursive Coordinate Bisection (RCB) in Zoltan is due to Steve Plimpton of Sandia National Laboratories and was modified by Matt St. John and Courtenay Vaughan. In this implementation of RCB, the parallel computer is first divided into two pieces and then the computational domain is divided into two pieces such that the proportion of work in each piece is the same as the proportion of computational power. The division of the parallel machine is done by a subroutine which is part of the support for heterogenous architectures that is being built into the Zoltan library. This process is repeated recursively on each subdomain and its associated part of the computer. Each of these divisions are done with a cutting plane that is orthogonal to one of the coordinate axes.

At each of these stages, each subdomain of processors and the objects
that are contained on those processors are divided into two sets based
on which side of the cutting plane each object is on. Either or both of
these sets may be empty. On each processor, the set of objects which are
on the same side of the cut as the processor are retained by the processor,
while the other objects are sent to processors on the other side of the
cut. In order to minimize the maximum memory usage in each set of processors,
the objects that are being sent to each set of processors are distributed
such that each each processor in a set has about the same number of objects
after the objects from the other set of processors are sent. In the case
when a processor has more objects that it will retain than the average
number of objects that the rest of the processors have in its set, then
that processor will not receive any objects. Thus each processor may send
and receive objects from several (or no) processors in the other set. The
process of determining which outgoing objects are sent to which processors
is determined in the subroutine **Zoltan_Create_Proc_List**. Once this new
distribution of objects is determined, the
**unstructured communication package** in
Zoltan is used to determine which processors are going to receive which
objects and actually move the objects.

For applications that wish to add more objects to the decomposition
at a later time (e.g., through **Zoltan_LB_Box_Assign** or **Zoltan_LB_Point_Assign**), information to do this can be retained during the
decomposition phase. This information is kept if the parameter KEEP_CUTS
is set during the decomposition (see the RCB section in the
**Zoltan User's Guide**).
This information about the decomposition can be thought of as a tree with
the nodes which have children representing the cut information and the nodes
with no children representing processors. An object is dropped through the
tree starting with the root node and uses the cut information at each node it
encounters to determine which subtree it traverses. When it reaches a terminal
node, the node contains the processor number that the object belongs to.
The information to construct the tree is saved during the decomposition.
At each step in the decomposition, when each set is divided into two sets,
the set with the lowest numbered processor is designated to be the left set
and the information about the cut is stored in the lowest numbered processor
in the other set of processors which is the right set. As a result of this
process, each processor will store information for, at most, one cut, since
once a processor stores information about a cut, by being the lowest numbered
processor in the right set, it will always be in a left set after each
subsequent cut since it will be the lowest numbered processor in the set
being cut and the set it is put into will be the left set. The processor
which stores the cut information also stores the root node as its parent.
After the end of the division process, all of the information is collected
onto all of the processors. The parent information is then used to establish
the leaf information for the parent. When this information is gathered, the
tree structure is stored in arrays with the array position determined by the
processor number that was storing the information. There is an array which
stores the position of the cut information for the left set and one for the
right set as well as arrays for the cut information. Given that the lowest
numbered processor after a cut is in the left set, the cut information is
stored in the right set, and there is one fewer cut than the total number of
processors, processor 0 has no cut information, so the 0 position of the right
set array is empty and is used to store the position in the array that the
first cut is stored. When this information is used to process an object,
array position 0 in the right set array is used to determine the array
position of the first cut. From there, which side of the cut the object is
on is determined and that information is used to determine which cut to test
the object against next. This process is repeated recursively until a
terminal node is encountered which contains the processor number that the
object belongs to.

When the parameter RCB_REUSE is
specified, the RCB algorithm attempts to use information from a previous
RCB decomposition to generate an "initial guess" at the new decomposition.
For problems that change little between invocations of RCB, using RCB_REUSE
can reduce the amount of data movement in RCB, improving the performance
of the algorithm. When RCB_REUSE is true,the coordinates of all objects obtained through query functions are passed through
**Zoltan_LB_Point_Assign**
to determine their processor assignment in the previous RCB decomposition.
The information for the objects is then sent to the new processor assignments
using the unstructured communication utilities
to generate an initial condition matching the output of the previous RCB
decomposition.
The normal RCB algorithm is then applied to this new initial condition.

There are three major data structures in RCB and they are defined in
*rcb/rcb.h* and *rcb/shared.h*. The points which are being load balanced are represented as a
structure *Dot_Struct* which contains the location of the point, its weight, and
its originating processor number. The nodes on the decomposition tree are
represented by the structure *rcb_tree* which contains the position of the cut,
the dimension that the cut is perpendicular to, and the node's parent and two
children (if they exist) in the tree. The structure *RCB_Struct* is the RCB data
structure which holds pointers to all of the other data structures needed for
RCB. It contains an array of *Dot_Struct* to represent the points being load
balanced, global and local IDs for the points, and an array of *rcb_tree* (whose length is the number of processors)
which contains the decomposition tree.

The parameters used by RCB and their default values are described in the
RCB section of the **Zoltan User's
Guide**. These can be set by use of the **Zoltan_RCB_Set_Param** subroutine
in the file *rcb/rcb.c*.

When the parameter REDUCE_DIMENSIONS
is specified, the RCB algorithm will perform lower dimensional
partitioning if the geometry is found to be degenerate. More information
on detecting degenerate
geometries may be found in another
section.

The main routine for RCB is **Zoltan_RCB** in the file *rcb/rcb.c*.

[Table of Contents | Next: Recursive Inertial Bisection (RIB) | Previous: Using the Test Script | Privacy and Security]