Why aren't my batch jobs starting?

The RALPP Batch system uses the Maui scheduler to allocate CPU resources fairly between users, groups, and Tier 2 (Grid) and Tier 3 (local) use. This means that which job starts next depends not on how long it has been in the queue but on recent usage of the batch system by the different user groups. So, although your job might head the list of queued jobs reported by qstat, that does not mean it will necessarily start soon.
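To illustrate the idea, here is a toy sketch of usage-based ordering. It is not Maui's actual priority calculation, and the group names, target shares, usage figures, and the `pick_next` helper are all hypothetical. It only shows how a job from a group that has used less than its share can be chosen ahead of a job that was queued earlier.

```python
# Toy illustration only: NOT Maui's real priority formula.
# All names and numbers here are hypothetical.

def pick_next(queued, recent_usage, target_share):
    """Return the queued job whose group is furthest below its target share."""
    def deficit(job):
        group = job["group"]
        # Positive deficit: the group has recently used less than its share.
        return target_share[group] - recent_usage.get(group, 0.0)

    # max() returns the first maximal item, so ties go to the job
    # that appears earliest in the queue.
    return max(queued, key=deficit)


# job-a was submitted first, but its group has recently used far more
# than its share of the cluster.
queued = [
    {"id": "job-a", "group": "groupA"},
    {"id": "job-b", "group": "groupB"},
]
target_share = {"groupA": 0.5, "groupB": 0.5}
recent_usage = {"groupA": 0.8, "groupB": 0.2}

next_job = pick_next(queued, recent_usage, target_share)
print(next_job["id"])
```

In this sketch job-b is selected even though job-a is at the head of the queue, which is the behaviour described above.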

Luckily, Maui provides a few commands that show the current scheduler status and let you work out what is going on.

The first, showq, is fairly straightforward: it displays the current job queue in priority order:

ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

5768701            prdcms12    Running     1  2:09:06:18  Sat Oct 23 21:30:51
5768715            prdcms12    Running     1  2:09:06:18  Sat Oct 23 21:30:51
...
5792780            pltatl09    Running     1  4:00:00:00  Mon Oct 25 12:24:33
5792777             prdh107    Running     1  4:00:00:00  Mon Oct 25 12:24:33

  1133 Active Jobs    1133 of 1640 Processors Active (69.09%)
                       225 of  243 Nodes Active      (92.59%)

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


Total Jobs: 1146   Active Jobs: 1146   Idle Jobs: 0   Blocked Jobs: 0
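If the listing is long, it can help to summarise it per user. The following is a minimal sketch that assumes the showq output has the same layout as the example above (numeric job IDs, a user name in the second column, and section banners ending in dashes). The function names `parse_showq` and `current_counts` are made up for this illustration and are not part of Maui.

```python
import re
import subprocess
from collections import Counter

# Section banners as they appear in the example output above.
SECTIONS = {
    "ACTIVE JOBS": "active",
    "IDLE JOBS": "idle",
    "BLOCKED JOBS": "blocked",
}

# A job row: job ID, user name, state, processor count, then a
# colon-separated time such as 2:09:06:18. Summary lines do not match.
JOB_ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+\d+(?::\d+)+\s")


def parse_showq(text):
    """Count jobs per user in each section of showq output."""
    counts = {name: Counter() for name in SECTIONS.values()}
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        banner = stripped.rstrip("-")
        if stripped.endswith("-") and banner in SECTIONS:
            section = SECTIONS[banner]
            continue
        match = JOB_ROW.match(line)
        if section and match:
            counts[section][match.group(2)] += 1
    return counts


def current_counts():
    """Run showq (assumed to be on PATH) and summarise its output."""
    result = subprocess.run(
        ["showq"], capture_output=True, text=True, check=True
    )
    return parse_showq(result.stdout)
```

On a machine with the Maui client tools installed, `current_counts()["idle"]` would give the number of idle jobs per user. If your user has idle jobs while other groups' jobs keep starting, that is consistent with the usage-based ordering described at the top of this page.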
-- ChrisBrew - 2010-10-25
Topic revision: r1 - 2010-10-25 - ChrisBrew
 