Difference: BatchSystem (7 vs. 8)

Revision 8 - 2015-01-13 - ChrisBrew

 
META TOPICPARENT name="WebHome"

The PPD Batch System

Changed:
< The PPD Batch Systems are shared between the Grid Tier 2 and Local usage (often referred to as Tier 3). We now have two batch systems: the legacy Scientific Linux 5 system uses the Torque batch system (an evolution of PBS) and the Maui scheduler to allocate the job slots fairly, while the newer Scientific Linux 6 services use Condor.
> The PPD Batch Systems are shared between the Grid Tier 2 and Local usage (often referred to as Tier 3). Since the switch to SL6 the batch system is based on Condor.

Resources

Changed:
< The PPD Batch Cluster currently has a nominal capacity of 2700 job slots (one for each of the 2700 CPUs) on 240 nodes. The power of the CPUs is measured using a benchmark called HEPSPEC06; the individual CPUs range from a HEPSPEC06 rating of 8.18 for the oldest to 12.67 for the newest. The nominal total HEPSPEC06 rating of the cluster is 28,000.
> The PPD Batch Cluster currently has a nominal capacity of about 3500 job slots (one for each of the 3500 CPUs) on 268 nodes. The power of the CPUs is measured using a benchmark called HEPSPEC06; the individual CPUs range from a HEPSPEC06 rating of 8.18 for the oldest to 12.67 for the newest. The nominal total HEPSPEC06 rating of the cluster is about 35,000, an average of about 10 per job slot.
Deleted:
< At the moment approximately 1800 job slots are assigned to the SL6 Condor service, with about 900 of the older CPUs allocated to the SL5 Torque/Maui service.

Allocations

The system tries to share out job starts "fairly" between different users, using a number of factors to do so. Amongst them are whether it is a "local" or grid job, the Grid VO, groups within the VOs, and individual user accounts. These are weighed against usage over the last 14 days.
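On the Condor service you can see where you stand in this fair-share calculation with the standard condor_userprio tool. A minimal example (the accounting-group names it prints depend on the local configuration):

   condor_userprio             # effective priorities and recent usage for active users
   condor_userprio -allusers   # include users with no recent usage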

Grid submission to the SL5 Cluster is now switched off, so 100% of the remaining resources are available for local Tier 3 jobs.

For the SL6 Condor service the current highest-level shares are:

Local 15%
Grid 85%

There are currently no group shares in the Local partition. For grid jobs the split below the Grid partition is as follows (see the configuration sketch after this list):

CMS 13%
Atlas 54%
LHCb 20%
Other 3%
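One common way to implement such shares in Condor is with hierarchical accounting-group quotas in the negotiator configuration. The sketch below shows how the numbers above could be expressed; the group names (group_local, group_grid, and so on) are illustrative assumptions, not the actual PPD configuration:

   GROUP_NAMES = group_local, group_grid, group_grid.cms, group_grid.atlas, group_grid.lhcb, group_grid.other

   # Top-level shares as fractions of the whole pool
   GROUP_QUOTA_DYNAMIC_group_local = 0.15
   GROUP_QUOTA_DYNAMIC_group_grid = 0.85

   # Sub-group shares, expressed as fractions of the parent Grid share
   GROUP_QUOTA_DYNAMIC_group_grid.cms = 0.13
   GROUP_QUOTA_DYNAMIC_group_grid.atlas = 0.54
   GROUP_QUOTA_DYNAMIC_group_grid.lhcb = 0.20
   GROUP_QUOTA_DYNAMIC_group_grid.other = 0.03

   # Let groups use idle slots beyond their nominal quota
   GROUP_ACCEPT_SURPLUS = True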
Changed:
< Using the PPD batch systems
> Using the PPD Condor batch system
Deleted:
< Condor
 Submitting and monitoring Condor jobs
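In outline, submitting a local job to Condor means writing a short submit description file and passing it to condor_submit. A minimal sketch, assuming a hypothetical script myjob.sh in the current directory (the file names are illustrative, not part of the PPD documentation):

   # myjob.sub - minimal Condor submit description
   universe   = vanilla
   executable = myjob.sh
   output     = myjob.out
   error      = myjob.err
   log        = myjob.log
   queue

Submit and monitor it with:

   condor_submit myjob.sub   # queue one job
   condor_q                  # show your jobs in the queue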
Deleted:
< Torque
< Submitting and monitoring Torque jobs

-- ChrisBrew - 2009-11-17

 