
Count of active compute nodes

The PBS server treats its resources_available.size attribute as the total number of compute nodes available for use by PBS jobs. This value is set and updated by the bebopd. If it is incorrect, PBS may schedule too many jobs to run, or may fail to schedule jobs when nodes are available. To verify that the count is correct, run both qmgr and pingd and compare the numbers as follows:

command>> qmgr -c "l s resources_available.size"

Server myri-0.n-4.r-3
        resources_available.size = 917

command>> pingd -s
Awaiting status from bebopd...
Awaiting pct list from bebopd

Total: 1018
Total busy: 907
Total free: 110
Total nodes unavailable: 1

Compute nodes are being scheduled by PBS, but 100
nodes are currently reserved for non-PBS interactive use.

Nodes currently hosting PBS jobs:         865
Nodes currently hosting interactive jobs: 43

Free nodes remaining for interactive jobs: 57

The qmgr request shows that PBS thinks there are 917 compute nodes at its disposal. The pingd command shows that there are 1017 healthy nodes (Total minus unavailable), but 100 are reserved for non-PBS use. That leaves 1017 - 100 = 917 nodes for PBS, so the PBS total is correct.
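The comparison above can be sketched as a small Python check. This is only an illustration, not part of the PBS or bebopd tooling: the function names are hypothetical, the parsing assumes the output looks exactly like the sample shown in this section, and treating a missing "reserved" line as zero reserved nodes is an assumption.

```python
import re

# Sample output copied from the example in this section.
QMGR_SAMPLE = """\
Server myri-0.n-4.r-3
        resources_available.size = 917
"""

PINGD_SAMPLE = """\
Total: 1018
Total busy: 907
Total free: 110
Total nodes unavailable: 1

Compute nodes are being scheduled by PBS, but 100
nodes are currently reserved for non-PBS interactive use.
"""


def _find_int(pattern: str, text: str, what: str) -> int:
    """Return the first captured integer, or raise if the field is missing."""
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise ValueError(f"could not find {what} in output")
    return int(match.group(1))


def pbs_reported_nodes(qmgr_output: str) -> int:
    """Node count that the PBS server currently believes it has."""
    return _find_int(r"resources_available\.size\s*=\s*(\d+)",
                     qmgr_output, "resources_available.size")


def expected_pbs_nodes(pingd_output: str) -> int:
    """Healthy nodes (Total minus unavailable) minus nodes reserved for non-PBS use."""
    total = _find_int(r"^Total:\s*(\d+)", pingd_output, "Total")
    unavailable = _find_int(r"^Total nodes unavailable:\s*(\d+)",
                            pingd_output, "Total nodes unavailable")
    # Assumption: if pingd prints no reservation line, nothing is reserved.
    reserved = re.search(r"but\s+(\d+)\s+nodes are currently reserved", pingd_output)
    reserved_count = int(reserved.group(1)) if reserved else 0
    return total - unavailable - reserved_count


def pbs_count_is_correct(qmgr_output: str, pingd_output: str) -> bool:
    """True when the PBS total matches what pingd implies."""
    return pbs_reported_nodes(qmgr_output) == expected_pbs_nodes(pingd_output)
```

With the sample output, both functions yield 917, so the check passes. On a live system the two strings would come from the captured output of the qmgr and pingd commands shown above.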

If the total is incorrect, you can force the bebopd to update the PBS server again by entering this command:

command>> pingd -PBSupdate on

Normally the bebopd updates the server only when the total changes. So, for example, if the one unavailable node in the output above came back online, the bebopd would invoke qmgr to update the server's total.


Lee Ann Fisk 2001-06-25