Simple batch scripts

Introduction

Batch scripts are the simplest way to use the cluster. For very short-running jobs, however, the overhead of submitting and scheduling a job on the cluster is much higher than that of running it locally. In that case it may be better to put, say, 10 short-running jobs into one batch script.
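
Bundling short jobs into one batch script could look roughly like the sketch below. It is only an illustration: run_task is a hypothetical placeholder function, and its body stands in for whatever short-running command you actually need.

```shell
#!/bin/sh
# Hypothetical bundle script: runs several short tasks inside one cluster job.
# run_task is a placeholder; replace its body with the real short-running command.
run_task() {
    echo "task $1 running on $(hostname)"
}

# Run ten short tasks one after another within a single job.
i=1
while [ "$i" -le 10 ]; do
    run_task "$i"
    i=$((i + 1))
done
```

Submitted once with qsub, such a script pays the submission and scheduling overhead a single time instead of ten times.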

test-batch1.sh

The following is the content of a simple Unix shell script:

 hostname
 date
 env
 date

It can be submitted to the cluster as follows:

 qsub test-batch1.sh

or

 qsub -q nonofficehours.q -P test -cwd test-batch1.sh

test-batch2.sh

The following is the content of the same simple Unix shell script, but with Grid Engine control elements at the beginning (lines starting with #$, followed by a command-line option):

 #$ -S /bin/sh
 #$ -q nonofficehours.q
 #$ -P test
 #$ -cwd
 hostname
 date
 env
 date

Grid Engine uses csh by default, so -S tells it to run the script with /bin/sh instead (which may be sh, bash or dash, depending on the system). With -cwd the job runs in the current working directory, and its output and log files are written there.

It can be submitted to the cluster as follows:

 qsub test-batch2.sh
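
To check which options a script carries before submitting it, the embedded #$ lines can be listed with grep. This is a small sketch; show_directives is a hypothetical helper name, not a Grid Engine command.

```shell
#!/bin/sh
# show_directives: hypothetical helper, prints the embedded Grid Engine
# option lines (those starting with "#$") of the given script.
show_directives() {
    grep '^#\$' "$1"
}

# Usage: show_directives test-batch2.sh
```

Ordinary shell comments (lines starting with # only) are not listed, because the pattern requires the $ directly after the #.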

Tips

Multiple queue entries

You can configure your job to run in more than one queue:

 #$ -q bioinfo.q
 #$ -q nonofficehours.q

This means that if bioinfo.q is full and nonofficehours.q has free slots, the job is started in nonofficehours.q.

Controlled bulk job submission

If you have lots of jobs and do not want to occupy the whole cluster, you can use a submission script that keeps a fixed number of jobs, e.g. 100, in the queue at any time:

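 MAX_JOBS=100   # upper limit of own jobs in the queue
 MY_SLEEP=15    # seconds to wait before checking again
 # Count own jobs; skip the two header lines that qstat prints.
 count_jobs() { qstat | tail -n +3 | wc -l; }
 for i in $(seq 1 1000000); do
   CUR_JOBS=$(count_jobs)
   while [ "$CUR_JOBS" -ge "$MAX_JOBS" ]; do
     printf 'current jobs: %s\tsleeping %s\n' "$CUR_JOBS" "$MY_SLEEP"
     sleep "$MY_SLEEP"
     CUR_JOBS=$(count_jobs)
   done
   qsub test-sleep.sh
 done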

Jobs are submitted until 100 of them are queued or running. The script then sleeps for 15 seconds before checking again; as soon as the number of jobs drops below 100, new jobs are submitted.
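
The job count that drives this throttling can be tried without a cluster. The sketch below assumes that a qstat listing starts with two header lines (column titles and a separator line), which a plain line count would include; count_jobs is a hypothetical helper that reads such a listing from standard input.

```shell
#!/bin/sh
# count_jobs: hypothetical helper, counts the job lines in a qstat listing
# read from standard input.
# Assumption: the listing starts with two header lines (titles and separator).
count_jobs() {
    tail -n +3 | wc -l | tr -d ' '
}

# Usage on the cluster: qstat | count_jobs
```

With an empty listing (no jobs at all) the helper prints 0, so the comparison against the job limit still works.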