Job submission and control
To use the Grid Engine software, the environment must first be updated by calling the following line (or by putting it into the .profile settings file):
or, if you use environment modules:
module load gridengine-client
Most users need only a handful of commands to work with the grid:
- qsub - submit a jobscript
- qstat - watch the status of one or more jobs
- qalter - change job parameters
- qdel - delete a job on the cluster
- qhost - show the status of hosts, queues, jobs
lugmayr@dali$ qsub test-batch.sh
Your job 1231099 ("test-batch.sh") has been submitted
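If a script needs the job number for later qstat, qalter or qdel calls, it can be cut out of the confirmation message. The following is a minimal sketch that assumes the message format shown above; the function name parse_job_id is only illustrative and not part of Grid Engine.

```shell
# Print the job number contained in a qsub confirmation message.
# parse_job_id is a hypothetical helper name; the message format is
# assumed to match the example above.
parse_job_id() {
    printf '%s\n' "$1" | sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

msg='Your job 1231099 ("test-batch.sh") has been submitted'
job_id=$(parse_job_id "$msg")
echo "$job_id"    # prints 1231099
```

On the cluster the message would come from the submit call itself, e.g. msg=$(qsub test-batch.sh).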
Programs usually print their text messages to two output streams, STDOUT and STDERR. Normally these messages appear as ordinary text output on the command line. Because grid jobs run in the background on a node, the messages are redirected to two files instead:
- scriptname.ojobnumber (e.g. test-batch.sh.o1231099) contains the regular messages from STDOUT.
- scriptname.ejobnumber (e.g. test-batch.sh.e1231099) contains the error messages from STDERR.
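The naming scheme can be spelled out in a few lines of shell, which is handy when a wrapper script has to find the output of a job it submitted. This is only a sketch; job_output_files is a made-up helper name.

```shell
# Build the default output file names for a job.
# job_output_files is a hypothetical helper; the naming scheme is the
# one described above (scriptname.o<jobnumber> / scriptname.e<jobnumber>).
job_output_files() {
    name=$1
    id=$2
    printf '%s.o%s\n%s.e%s\n' "$name" "$id" "$name" "$id"
}

# First line is the STDOUT file, second line the STDERR file.
job_output_files test-batch.sh 1231099
```

If a job name is set with -N, that name takes the place of the script name in both file names.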
Submit commandline commands directly to the cluster
This can be done by defining an alias (e.g. in .bashrc). Change the queue and project names to those of your group.
alias qsubcmd='qsub -q training.q -P clustertraining -cwd -b y -shell y -N qsubcmd'
and it can be used as follows:
lugmayr@dali$ qsubcmd "hostname; date; pwd"
The STDOUT and STDERR output goes to qsubcmd.o* and qsubcmd.e*, respectively.
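In bash, aliases are not expanded in non-interactive shells by default, so inside scripts a shell function is the more robust choice. The sketch below mirrors the alias above with the same options. The QSUB variable is an assumption added here so that the function can be tried without a cluster; it is not a Grid Engine setting.

```shell
# Function variant of the qsubcmd alias. Queue and project names are
# the ones from the alias above; replace them with your group's.
# QSUB is a hypothetical override variable so the function can be
# tried out without a cluster (e.g. QSUB=echo).
qsubcmd() {
    "${QSUB:-qsub}" -q training.q -P clustertraining \
        -cwd -b y -shell y -N qsubcmd "$@"
}

# Dry run: print the argument list instead of submitting anything.
QSUB=echo
qsubcmd "hostname; date; pwd"
unset QSUB
```

With QSUB unset, the function calls the real qsub and behaves like the alias.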
job-ID   prior    name        user     state  submit/start at      queue  slots  ja-task-ID
1231099  0.00000  test-batch  lugmayr  qw     03/02/2010 13:58:16         1
lugmayr@dali$ qstat -j 1231099
submission_time: Tue Mar 2 13:58:16 2010
usage 1: cpu=00:00:00, mem=0.00000 GBs, io=0.00000, vmem=N/A, maxvmem=N/A
scheduling info: scheduling info is turned off
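The detailed listing consists of "name: value" lines, so a single field is easy to pick out in a script. The following sketch assumes that layout, as seen in the excerpt above; job_field is a hypothetical helper name.

```shell
# Print the value of one field from "qstat -j" style output on stdin.
# job_field is a hypothetical helper; the "name: value" layout is
# assumed from the excerpt above.
job_field() {
    sed -n "s/^$1:[[:space:]]*//p"
}

# Two lines copied from the excerpt above, used as sample input.
sample='submission_time: Tue Mar 2 13:58:16 2010
scheduling info: scheduling info is turned off'

printf '%s\n' "$sample" | job_field submission_time
```

On the cluster the input would come from qstat directly, e.g. qstat -j 1231099 | job_field submission_time.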
lugmayr@dali$ qalter 1231099 -q nonofficehours.q
modified hard queue list of job 1231099
If your job does not start running, you can verify its settings with:
qalter -w v 1231099
You are only allowed to delete your own jobs!
lugmayr@dali$ qdel 1231099
lugmayr has registered the job 1231099 for deletion
Delete all jobs of a specific user:
qdel -u lugmayr
Delete all jobs of user flyuser that were submitted on 11/12 (2009) and are still in state qw:
qstat -u flyuser | grep "11\/12" | grep qw | cut -c-7 | xargs qdel
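The grep pipeline matches "qw" and the date anywhere in a line, so a job whose name happens to contain "qw" would be selected as well. A column-based filter avoids this. The sketch below assumes the column order of the plain qstat listing (job-ID, prior, name, user, state, date, ...); waiting_jobs_on is a hypothetical helper name and the sample listing is made up for illustration.

```shell
# Print the IDs of waiting (qw) jobs submitted on a given date, reading
# plain qstat output from stdin. waiting_jobs_on is a hypothetical
# helper; the column order is assumed from the qstat listing format.
waiting_jobs_on() {
    awk -v day="$1" '$1 ~ /^[0-9]+$/ && $5 == "qw" && index($6, day) == 1 { print $1 }'
}

# Made-up sample listing for illustration only.
sample='job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------
1231099 0.00000 test-batch flyuser qw 11/12/2009 13:58:16 1
1231100 0.50000 test-batch flyuser r 11/12/2009 13:59:02 training.q@compute-6-8 1
1231101 0.00000 test-batch flyuser qw 11/13/2009 08:00:00 1'

printf '%s\n' "$sample" | waiting_jobs_on 11/12
```

On the cluster this would be used as qstat -u flyuser | waiting_jobs_on 11/12 | xargs qdel.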
qhost -q -j
qhost -j -h compute-6-8
qhost -u lugmayr
- Using the "-j" option, qhost lists all currently running jobs beneath the hosts on which they are running.
- Using the "-q" option, qhost displays all queues that have instances on a host beneath the corresponding host.
- Using the "-h hostlist" option will display only the information about the listed hosts.
- Using the "-u user" option will display only jobs from the specified users. This implies the "-j" option.
Job status mails
The Grid Engine can keep you informed about your job status by e-mail. To use this feature, add lines such as the following to your job script:
#$ -M email@example.com
#$ -m ea
The options for the -m flag are (as written in the man page):
- b - Mail is sent at the beginning of the job.
- e - Mail is sent at the end of the job.
- a - Mail is sent when the job is aborted or rescheduled.
- n - No mail is sent.
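Put together, a job script with mail notification might look like the sketch below. The job name mailtest and the address are placeholders; the letters given to -m can be combined, here to request mail at the beginning, at the end and on abort.

```shell
#!/bin/sh
# Sketch of a job script with mail notification. The "#$" lines are
# read by qsub as options; to the shell they are ordinary comments.
# Job name and address are placeholders.
#$ -N mailtest
#$ -cwd
#$ -M email@example.com
#$ -m bea

# The actual work of the job: report the node it runs on.
msg="mailtest running on $(uname -n)"
echo "$msg"
```

The script also runs locally as a plain shell script, which makes it easy to check before submitting it with qsub.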