
Name


bebopd -- Cplant node allocation daemon

Synopsis


bebopd [-D] [-S [1|0]] [-L [1|0]] [-daemon] [-alternative] [-r optional-file-name] [-help] [-PBSsupport] [-PBSupdate] [-PBSinteractive numNodes]

Description


The bebopd daemon runs in the service partition. It is the point in the Cplant where knowledge of compute node status resides. It has the following interfaces:

PCTs
The bebopd receives messages from the compute node PCTs when they start and end, and when an application terminates. If the bebopd is restarted, it contacts the PCTs to identify itself to them. The bebopd sends status queries to the PCTs as needed and maintains the responses.

yod
The bebopd accepts yod requests on behalf of users wishing to run a parallel application. The bebopd attempts to allocate the requested nodes to the job, and assigns a numeric job ID to the application.

pingd
The bebopd accepts pingd requests for updates from the compute partition, and returns to pingd a list of compute node status information. It accepts requests from pingd to send a SIGTERM or a SIGKILL to an application, to kill PCTs, or to note that a PCT it believed was running is gone. The bebopd may also receive requests from pingd to turn PBSsupport or PBSupdate on or off, or to change the number of nodes reserved for interactive (i.e. non-PBS) use.

PBS server
When the bebopd is run in PBSupdate mode, it updates the PBS server whenever the number of live compute nodes changes. That is, it uses the PBS qmgr client to keep the resources_available.size and resources_max.size attributes of the PBS server accurate.
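
As a rough illustration of the PBS server interface, the sketch below builds qmgr command lines for the two attributes named above. The function name qmgr_size_commands is hypothetical, and setting both attributes to the live node count is an assumption; the commands the bebopd actually issues may differ.

```python
from typing import List


def qmgr_size_commands(live_nodes: int) -> List[List[str]]:
    """Build qmgr argument vectors for the two size attributes.

    Assumption: both attributes are set to the current number of live
    compute nodes. The real bebopd may compute them differently.
    """
    if live_nodes < 0:
        raise ValueError("live_nodes must be non-negative")
    attributes = ("resources_available.size", "resources_max.size")
    return [
        ["qmgr", "-c", "set server {} = {}".format(attr, live_nodes)]
        for attr in attributes
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; qmgr may not be installed.
    for command in qmgr_size_commands(64):
        print(" ".join(command))
```

Returning argument vectors rather than running qmgr directly keeps the sketch usable on a machine without PBS installed.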

The bebopd as designed today exists as a single process on one node of the service partition. The plan is to run bebopd as a distributed service across the service partition, both in the interest of fault tolerance and to improve response time to yod and pingd users.

Options


-alternative
Every portals process has a portal ID. It is this ID that the portals module uses when dispatching received messages to processes. For testing purposes we may want to run another bebopd on the same node. This argument causes the bebopd to request an unused portal ID from the portals module. The bebopd will display its alternative portal ID on startup.

-D
This option causes the bebopd to output information about what it is doing. Repeating the -D option on the command line increases the amount of information.

-S [0|1]
The bebopd outputs warnings and errors, and, if the -D option is used, status information. The 0 switch turns off all output from the bebopd to stderr. The 1 switch turns it on. By default, the bebopd does not write to stderr.

-L [0|1]
The bebopd outputs warnings and errors, and, if the -D option is used, status information. The 0 switch turns off all output from the bebopd to the log file. The 1 switch turns it on. By default, the bebopd writes to the log file.
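
The interaction of -D, -S and -L described above can be summarized in a small decision function. This is only a sketch of the documented behavior, not the bebopd source; the name should_write and its parameters are hypothetical.

```python
def should_write(kind, sink, debug_count=0, stderr_on=False, log_on=True):
    """Decide whether a message of a given kind goes to a given sink.

    kind: "warning", "error" or "status".
    sink: "stderr" or "log".
    debug_count: how many times -D was given.
    stderr_on / log_on: the -S and -L switches; the defaults mirror the
    documented defaults (stderr off, log file on).
    """
    if kind not in ("warning", "error", "status"):
        raise ValueError("unknown message kind: " + repr(kind))
    if sink not in ("stderr", "log"):
        raise ValueError("unknown sink: " + repr(sink))
    enabled = stderr_on if sink == "stderr" else log_on
    if not enabled:
        return False
    # Status information appears only when -D is used.
    if kind == "status":
        return debug_count > 0
    return True
```

With the defaults, an error reaches the log file but not stderr, and status information reaches neither until -D is given.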

-r optional-file-name
This option specifies that the bebopd is being restarted. The bebopd always saves a file (CRsaved_pct_list in the same directory as the bebopd registry file) containing a list of active PCTs when it exits. When bebopd restarts, it reads in this file and contacts the PCTs for their status. If an optional-file-name is given, the bebopd will look there for the PCT list instead of in the CRsaved_pct_list file.
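
The file selection rule for -r can be sketched as follows. The function name pct_list_path and the registry file path used in the example are hypothetical; only the CRsaved_pct_list name and the fallback rule come from the description above.

```python
import posixpath

DEFAULT_PCT_LIST = "CRsaved_pct_list"


def pct_list_path(registry_file, optional_file_name=None):
    """Return the path the restarted bebopd would read its PCT list from.

    registry_file: path of the bebopd registry file (hypothetical value).
    optional_file_name: the argument given to -r, if any.
    """
    if optional_file_name:
        return optional_file_name
    # Default: CRsaved_pct_list next to the registry file.
    return posixpath.join(posixpath.dirname(registry_file), DEFAULT_PCT_LIST)


if __name__ == "__main__":
    # The registry path below is an illustrative placeholder.
    print(pct_list_path("/cplant/etc/bebopd.registry"))
    print(pct_list_path("/cplant/etc/bebopd.registry", "/tmp/my_pct_list"))
```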

-help
This option displays the list of bebopd options.

-daemon
This option runs the bebopd in the background. The default is to run the bebopd as a foreground process.

-PBSsupport
-PBSupdate
PBS (Portable Batch System) on Cplant requires support from the bebopd. The bebopd is running in PBSsupport mode if it is keeping track of the number of live compute nodes in the machine and policing PBS users to ensure they use no more nodes than they were allocated. The bebopd is running in PBSupdate mode if in addition it sends updates to the PBS server whenever the number of live compute nodes changes. These two arguments can be used to turn on PBSsupport or to turn on PBSupdate. Since PBSupdate implies PBSsupport, turning on PBSupdate automatically turns on PBSsupport.

-PBSinteractive numNodes
The bebopd can reserve numNodes nodes for interactive use. PBS will not be able to schedule these nodes for batch jobs.
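
The relationship between the three PBS options above can be sketched as follows. The function names are hypothetical, and the arithmetic in batch_nodes (live nodes minus reserved nodes) is an assumption drawn from the description, not from the bebopd source.

```python
def parse_pbs_options(argv):
    """Extract the PBS-related settings from a bebopd argument list."""
    update = "-PBSupdate" in argv
    # PBSupdate implies PBSsupport.
    support = update or "-PBSsupport" in argv
    interactive = 0
    if "-PBSinteractive" in argv:
        index = argv.index("-PBSinteractive")
        if index + 1 >= len(argv):
            raise ValueError("-PBSinteractive requires numNodes")
        interactive = int(argv[index + 1])
        if interactive < 0:
            raise ValueError("numNodes must be non-negative")
    return {"PBSsupport": support, "PBSupdate": update,
            "interactive": interactive}


def batch_nodes(live_nodes, interactive):
    """Nodes left for PBS batch jobs (assumed: live minus reserved)."""
    return max(0, live_nodes - interactive)
```

For example, parse_pbs_options(["-PBSupdate", "-PBSinteractive", "8"]) reports both modes as enabled with 8 nodes reserved, and batch_nodes(64, 8) leaves 56 nodes for batch jobs under the stated assumption.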

Errors


Errors and warnings are logged to /var/log/cplant on the node hosting the bebopd.

Signals


On receiving a SIGUSR1 or SIGUSR2, the bebopd will write to the log file its identifying information and what routine it is in. On receiving a SIGHUP, the bebopd will close and reopen its log file, list identifying information to the log file, and re-read the site file.
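
The signal behavior described above follows a common daemon pattern, sketched here in Python for illustration. All names (make_handlers, install_handlers and the callables passed in) are hypothetical, and the process ID stands in for the "identifying information", whose real contents are not specified here.

```python
import os
import signal


def make_handlers(write_log, reopen_log, reload_site, current_routine):
    """Build handlers mirroring the documented signal behavior.

    write_log(text): append a line to the log file.
    reopen_log(): close and reopen the log file.
    reload_site(): re-read the site file.
    current_routine(): name of the routine the daemon is in.
    """
    def identity():
        # Assumption: the process ID serves as identifying information.
        return "bebopd pid={}".format(os.getpid())

    def on_status(signum, frame):
        write_log("{} in routine {}".format(identity(), current_routine()))

    def on_hangup(signum, frame):
        reopen_log()
        write_log(identity())
        reload_site()

    return on_status, on_hangup


def install_handlers(on_status, on_hangup):
    """Register the handlers (Unix only)."""
    signal.signal(signal.SIGUSR1, on_status)
    signal.signal(signal.SIGUSR2, on_status)
    signal.signal(signal.SIGHUP, on_hangup)
```

Passing the log and site-file actions in as callables keeps the handlers easy to exercise without sending real signals.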

Files


/etc/local/saved_pct_list
This file lists all PCTs that were active when the last bebopd terminated.
/etc/local/site
This file defines site-specific information that may be required by the bebopd.
/var/log/cplant
This is the log file where Cplant daemons and utilities log status.

See Also


pingd yod pct site

Bugs


Let us know if you locate any (cplant-help@cs.sandia.gov).

