Skip to content

Johnlihj/torque

 
 

Repository files navigation

This file contains information concerning the use of the new job array 
features in TORQUE 2.5.

--- WARNING --- 
TORQUE 2.5 uses a new format for job arrays. It in not backwards
compatible with job arrays from version 2.3 or 2.4. Therefore, it is 
imperative that the system be drained of any job arrays BEFORE upgrading. 
Upgrading with job arrays queued or running may cause data loss, crashes, 
etc, and is not supported.


COMMAND UPDATES FOR ARRAYS
--------------------------

The commands qalter, qdel, qhold, and qrls now all support TORQUE 
arrays and will have to be updated. The general command syntax is: 

command <array_name> [-t array_range] [other command options]

The array ranges accepted by -t here are exactly the same as the array ranges
that can be specified in qsub.
(http://www.clusterresources.com/products/torque/docs/commands/qsub.shtml)


SLOT LIMITS
--------------------------

It is now possible to limit the number of jobs that can run concurrently in a
job array. This is called a slot limit, and the default is unlimited. The slot
limit can be set in two ways.

The first method can be done at job submission:

qsub script.sh -t 0-299%5

This sets the slot limit to 5, meaning only 5 jobs from this array can be
running at the same time. 

The second method can be done on a server wide basis using the server 
parameter max_slot_limit. Since administrators are more likely to
be concerned with limiting arrays than users in many cases the 
max_slot_limit parameter is a convenient way to set a global policy.
If max_slot_limit is not set then the default limit is unlimited. To set 
max_slot_limit you can use the following queue manager command.

qmgr -c 'set server max_slot_limit=10'

This means that no array can request a slot limit greater than 10, and 
any array not requesting a slot limit will receive a slot limit of 10.
If a user requests a slot limit greater than 10, the job will be
rejected with the message:

Requested slot limit is too large, limit is X. In this case, X would be 10.

It is recommended that if you are using torque with a scheduler like Moab or Maui
that you also set the server parameter moab_array_compatible=true. Setting 
moab_array_compatible will put all jobs over the slot limit on hold
so the scheduler will not try and schedule jobs above the slot limit.


JOB ARRAY DEPENDENCIES
--------------------------

The following dependencies can now be used for job arrays:

afterstartarray 
afterokarray 
afternotokarray 
afteranyarray 
beforestartarray 
beforeokarray 
beforenotokarray 
beforeanyarray

The general syntax is:

qsub script.sh -W depend=dependtype:array_name[num_jobs]

The suffix [num_jobs] should appear exactly as above, although the number of
jobs is optional. If it isn't specified, the dependency will assume that it is
the entire array, for example: 

qsub script.sh -W depend=afterokarray:427[] 

will assume every job in array 427[] has to finish successfully for the
dependency to be satisfied. The submission: 

qsub script.sh -W depend=afterokarray:427[][5] 

means that 5 of the jobs in array 427 have to successfully finish in order for
the dependency to be satisfied.

NOTE: It is important to remember that the "[]" is part of the array name.


QSTAT FOR JOB ARRAYS
--------------------------

Normal qstat output will display a summary of the array instead of displaying
the entire array, job for job. 

qstat -t will expand the output to display the entire array.


ARRAY NAMING CONVENTION
--------------------------

Arrays are now named with brackets following the array name, for example:

dbeer@napali:~/dev/torque/array_changes$ echo sleep 20 | qsub -t 0-299 
189[].napali 

Individual jobs in the array are now also noted using square brackets instead of
dashes, for example, here is part of the output of qstat -t for the above
array:

189[287].napali            STDIN[287]       dbeer                  0 Q batch          
189[288].napali            STDIN[288]       dbeer                  0 Q batch          
189[289].napali            STDIN[289]       dbeer                  0 Q batch          
189[290].napali            STDIN[290]       dbeer                  0 Q batch          
189[291].napali            STDIN[291]       dbeer                  0 Q batch          
189[292].napali            STDIN[292]       dbeer                  0 Q batch          
189[293].napali            STDIN[293]       dbeer                  0 Q batch          
189[294].napali            STDIN[294]       dbeer                  0 Q batch          
189[295].napali            STDIN[295]       dbeer                  0 Q batch          
189[296].napali            STDIN[296]       dbeer                  0 Q batch          
189[297].napali            STDIN[297]       dbeer                  0 Q batch          
189[298].napali            STDIN[298]       dbeer                  0 Q batch          
189[299].napali            STDIN[299]       dbeer                  0 Q batch


About

A version of Torque (PBS) that runs without root permissions

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C 81.3%
  • Makefile 11.8%
  • Shell 2.6%
  • M4 1.7%
  • Perl 0.8%
  • Tcl 0.7%
  • Other 1.1%