The PPD Batch System
The PPD batch systems are shared between the Grid Tier 2 and local usage (often referred to as Tier 3). We now have two batch systems: the legacy Scientific Linux 5 system uses the Torque batch system (an evolution of PBS) with the Maui scheduler to allocate job slots fairly, while the newer Scientific Linux 6 service uses Condor.
Condor
Condor works differently from many other batch systems, so you are advised to read the User Manual before submitting jobs. We currently support only the "Vanilla" universe.
Submitting a Job
To submit a job to the Condor batch system, you first need to write a "submit description file" that describes the job to the system. A very simple file would look like this:
####################
#
# Example 1
# Simple HTCondor submit description file
#
####################
Executable = myexe
Log = myexe.log
Input = inputfile
Output = outputfile
Queue
This runs myexe on the batch machine (after copying it and inputfile to a temporary directory on that machine), feeds inputfile to the job as its standard input, and copies the job's standard output back to a file called outputfile.
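Once the description is saved to a file, it is handed to the batch system with condor_submit, and condor_q lists your jobs in the queue. The following is a minimal sketch, assuming the description is saved under the hypothetical name myexe.sub and that the HTCondor client tools are on your PATH; on a machine without them it only writes the file.

```shell
#!/bin/sh
# Hypothetical file name; any name can be used for a submit description.
SUBMIT_FILE=myexe.sub

# Write the Example 1 description to disk.
cat > "$SUBMIT_FILE" <<'EOF'
Executable = myexe
Log = myexe.log
Input = inputfile
Output = outputfile
Queue
EOF

# Submit only where the HTCondor client tools are installed.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit "$SUBMIT_FILE"   # hand the job to the scheduler
    condor_q                       # list your queued and running jobs
else
    echo "condor_submit not found; run this on a submit node" >&2
fi
```

The job's progress is also recorded in the file named by Log (myexe.log here), which can be inspected while the job is queued or running.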
A more complex example submit description would look like:
####################
#
# Example 2
# More Complex HTCondor submit description file
#
####################
Universe = vanilla
Executable = my_analysis.sh
Arguments = input-$(process).txt result/output-$(process).txt
Log = log/my_analysis-$(Process).log
Input = input/input-$(process).txt
Output = output/my_analysis-$(Process).out
Error = output/my_analysis-$(Process).err
Request_memory = 2 GB
Transfer_output_files = result/output-$(process).txt
Transfer_output_remaps = "output-$(process).txt = results/output-$(process).txt"
Notification = complete
Notify_user = your.name@stfc.ac.uk
Getenv = True
Queue 20
This submit file runs 20 copies (Queue 20) of my_analysis.sh input-$(process).txt result/output-$(process).txt, where $(process) is replaced by a number from 0 to 19. It copies my_analysis.sh and input-$(process).txt to each of the worker nodes, taking the input file from the local input directory. At the end of the job, the standard output and error are copied back to the local output directory, and the file result/output-$(process).txt is copied back to the local results directory. The local environment is copied to the worker node (Getenv = True) and 2 GB of memory is requested on the worker node. Finally, the user is e-mailed when each job completes.
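Example 2 refers to several local directories (input, output, log and results) and to a script that writes into a result directory on the worker node. The sketch below shows one way to prepare the submit directory before running condor_submit. It assumes that HTCondor does not create the local directories for you, and both the dummy input files and the body of my_analysis.sh are hypothetical placeholders, not the real analysis.

```shell
#!/bin/sh
# Prepare the submit directory for Example 2 (run once, before condor_submit).

# Local directories named in the submit description.
# Assumption: these must already exist when the jobs are submitted.
mkdir -p input output log results

# Dummy input files input/input-0.txt ... input/input-19.txt,
# one per job queued by "Queue 20". Existing files are left alone.
i=0
while [ "$i" -lt 20 ]; do
    [ -f "input/input-$i.txt" ] || echo "dummy input $i" > "input/input-$i.txt"
    i=$((i + 1))
done

# Hypothetical stand-in for my_analysis.sh:
#   $1 = input file name, $2 = output file name (as given in Arguments).
cat > my_analysis.sh <<'EOF'
#!/bin/sh
in_file=$1
out_file=$2
# The job starts in a fresh temporary directory on the worker node,
# so the result/ directory has to be created by the job itself.
mkdir -p "$(dirname "$out_file")"
# Placeholder "analysis": count the lines in the input file.
wc -l < "$in_file" > "$out_file"
EOF
chmod +x my_analysis.sh
```

The script can be tried locally before submitting, for example with ./my_analysis.sh input/input-0.txt result/output-0.txt, which mirrors the arguments a job receives apart from the input file's directory.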