Condor works differently from many other batch systems, so we advise you to read the [[http://research.cs.wisc.edu/htcondor/manual/v8.0/2_Users_Manual.html][User Manual]]. We currently support only the "Vanilla" universe.

---+++ Submitting a Job

To submit a job to the Condor batch system you first need to write a "submit description file" that describes the job to the system, and then pass that file to <code>condor_submit</code>. A very simple file would look like this:

<pre>
####################
#
# Example 1
# Simple HTCondor submit description file
#
####################

Executable = myexe
Log = myexe.log
Input = inputfile
Output = outputfile
Queue
</pre>

This runs <code>myexe</code> on the batch machine (after copying it and <code>inputfile</code> to a temporary directory on that machine) and copies the job's standard output back to a file called <code>outputfile</code>.

A more complex submit description file would look like this:

<pre>
####################
#
# Example 2
# More Complex HTCondor submit description file
#
####################

Universe = vanilla
Executable = my_analysis.sh
Arguments = input-$(process).txt result/output-$(process).txt
Log = log/my_analysis-$(Process).log
Input = input/input-$(process).txt
Output = output/my_analysis-$(Process).out
Error = output/my_analysis-$(Process).err
Request_memory = 2 GB
Transfer_output_files = result/output-$(process).txt
Transfer_output_remaps = "output-$(process).txt = results/output-$(process).txt"
Notification = complete
Notify_user = your.name@stfc.ac.uk
Getenv = True
Queue 20
</pre>

This submit file runs 20 copies (<code>Queue 20</code>) of <code>my_analysis.sh input-$(process).txt result/output-$(process).txt</code>, where <code>$(process)</code> is replaced by a number from 0 to 19. It copies <code>my_analysis.sh</code> and <code>input-$(process).txt</code> to each of the worker nodes (taking the input file from the local <code>input</code> directory).
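As a rough illustration of the substitution described above, the following Python sketch expands the <code>$(process)</code> macro for each of the 20 jobs in Example 2. It is not part of HTCondor: the helper names <code>expand</code> and <code>expand_jobs</code> are hypothetical, and the sketch only mimics the textual replacement that HTCondor performs itself when the file is submitted. Example 2 writes the macro as both <code>$(process)</code> and <code>$(Process)</code>, so the sketch assumes either spelling is accepted.

```python
import re

# Assumption: $(process) and $(Process) are equivalent, as Example 2 uses both.
_PROCESS_MACRO = re.compile(r"\$\(process\)", re.IGNORECASE)

# Lines copied from Example 2; only the ones containing the macro are listed.
EXAMPLE_2 = {
    "Arguments": "input-$(process).txt result/output-$(process).txt",
    "Log": "log/my_analysis-$(Process).log",
    "Input": "input/input-$(process).txt",
    "Output": "output/my_analysis-$(Process).out",
    "Error": "output/my_analysis-$(Process).err",
}


def expand(template, process):
    """Replace every $(process) macro in template with the process number."""
    return _PROCESS_MACRO.sub(str(process), template)


def expand_jobs(queue_count):
    """Return one dict of expanded settings per job, numbered 0..queue_count-1."""
    return [
        {key: expand(value, process) for key, value in EXAMPLE_2.items()}
        for process in range(queue_count)
    ]


if __name__ == "__main__":
    jobs = expand_jobs(20)  # mirrors "Queue 20"
    for process in (0, 19):  # show the first and last job
        print(process, "my_analysis.sh", jobs[process]["Arguments"])
```

Running the script prints the command line for the first and last job, which is a quick way to check that the local input files you have prepared match the names each job will ask for.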
The standard output and error from the job are copied back to the local <code>output</code> directory at the end of the job, and the file <code>result/output-$(process).txt</code> is copied back to the local <code>results</code> directory. The local environment is copied over to the worker node (<code>Getenv = True</code>), and the job requests 2 GB of memory on the worker node. Finally, the user is e-mailed when each job completes.

---+++ Monitoring Your Jobs

The basic command for monitoring jobs is <code>condor_q</code>. By default this shows only jobs submitted to the "schedd" (essentially the submit node) you are using; to see all the jobs in the system, run <code>condor_q -global</code>.

If jobs have been idle for a while you can use <code>condor_q -analyze &lt;job_id&gt;</code> to look at the resources requested by the job and how they match the available resources on the cluster.

Failed jobs often go into a "Held" state rather than disappearing. <code>condor_q -held &lt;job_id&gt;</code> will often give some information on why the job failed.

<code>condor_userprio</code> will give you an idea of the current usage and fair shares on the cluster.

---+++ Local Commands

The PPD interactive machines have some local commands that make the Condor batch system a bit more convenient to use and more like the LSF system used on lxplus at CERN.

You can use =bqsub= to submit jobs (similar to LSF's =bsub= command). The full command can be specified on the command line, so you don't need to create a "submit description file". Specify a time limit with =bqsub -c hh:mm=. For example:

<verbatim>
bqsub -c 24:00 Sherpa PTCUT:=20 EVENTS=1000
</verbatim>

=qjobs= (like LSF's =bjobs=) lists your running jobs with more helpful information.

=qpeek= (like LSF's =bpeek=) shows a running job's logfile (this may prompt for your password to ssh to the batch worker node).

Use =bqsub -h=, =qjobs -h=, or =qpeek -h= for help.

-- Main.ChrisBrew - 2014-03-26
Topic revision: r2 - 2015-02-10 - TimAdye