DataStorage

Revision 10 - 2015-07-12 - MonikaWielers

 
META TOPICPARENT name="WebHome"

Data Storage

This page describes where to store your data in order to run over it. In general, all data files should be stored on dCache. A small amount of data can be stored on the local cluster for development purposes. AFS space can be used for output files or code that you may wish to share with others.

On the local cluster

For small to medium amounts of data, it is recommended that you store it in one of:

/opt/ppd/scratch
/opt/ppd/month 

Each has a capacity of 1.6 TB. Before downloading, please check there is enough space left. These directories are not erased periodically, so you should not keep datasets that are no longer needed. If the disks are full, you can send a mail to the group asking everyone to clean up their space. To see who uses the most, run 'du --max-depth=1' in the directory; you can then also ask people personally to reduce their disk space usage.
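
For example, you can check the remaining capacity with df and find the largest users with du; a quick sketch using the paths above:

df -h /opt/ppd/scratch /opt/ppd/month            # free space on each disk
du -h --max-depth=1 /opt/ppd/scratch | sort -rh  # largest directories first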

On AFS

There is an /afs cell at RAL which you can find at:

/afs/rl.ac.uk/

On the dCache storage element locally

Detailed instructions on using the dCache element can be found on the DCacheStorageElement page. To make things easier to manage, please copy ATLAS datasets into:

/pnfs/pp.rl.ac.uk/data/atlas/atlasralppdisk/

You can use dq2-get to copy files directly into the dCache space. To do this, log into heplnx109.gridpp.rl.ac.uk and do:

cd $HOME
source /afs/cern.ch/atlas/offline/external/GRID/ddm/DQ2Clients/setup.sh
voms-proxy-init -voms atlas
dq2-get -S srm://heplnx204.pp.rl.ac.uk:8443/srm/managerv2?SFN=/pnfs/pp.rl.ac.uk/data/atlas/atlasralppdisk/dewhurst user10.janstrube.ganga.data10_7TeV.00152221.physics_L1Calo.merge.AOD.r1239_p134.D3PD_v1.99.0

where you can replace dewhurst with your own name and the user10.janstrube... dataset name with the one you wish to download from the grid. Everybody has read/write permission, so be careful. Note: if you try this command while your working directory is somewhere inside the /pnfs/pp.rl.ac.uk/ directory structure, you will get an error saying that the file system is read-only.

On the dCache storage element on the grid

The advantage of storing data on the grid is that there is much more storage space available. By storing data on the grid at RAL, local users can run over the data interactively for software development, look at smaller samples, and run over the same data as a grid job. The disadvantage is that you don't have full control over your data, and some operations can take time.

To download data files, use Rucio (the Rucio documentation can be found here). As this is a grid tool, you need to type:

localSetupRucioClients
voms-proxy-init -voms atlas
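
To confirm the proxy was created successfully, you can inspect it; voms-proxy-info comes with the same VOMS client tools:

voms-proxy-info --all    # shows the remaining lifetime and the atlas VO attributes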

To download, first find your dataset:

rucio list-dids mc12_13TeV:
rucio list-dids mc12_13TeV:mc12_13TeV.116974.Pythia_AUET2BCTEQ6L1_ttbar_*
rucio list-dids mc15_13TeV:mc15_13TeV.*EXOT9*p2363/

Note that every dataset now has a scope (mc15_13TeV, data15_13TeV, etc.) which precedes the actual dataset name. For the moment dq2-ls still works, but you need to ask for at least data15.13TeV. (the dot is important).

As the dataset you want to access might already exist on the group disk, you may want to check first using

rucio list-datasets-rse UKI-SOUTHGRID-RALPP_LOCALGROUPDISK

This might take some time, but it will tell you what you asked for in the end. To ensure there is enough space in the group area, check this page: http://www.hep.lancs.ac.uk/~love/ukdata/site/1/
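
The listing can be long; a simple way to look for a specific sample is to pipe it through grep (the EXOT9 pattern is just an example):

rucio list-datasets-rse UKI-SOUTHGRID-RALPP_LOCALGROUPDISK | grep EXOT9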

Now when you are ready to download, do

rucio add-rule <scope>:<dataset> 1 UKI-SOUTHGRID-RALPP_LOCALGROUPDISK

This is a bit cryptic, but it is what triggers the replication. The scope is mc15_13TeV etc.; it does not seem to be strictly required, as Rucio can work it out itself. To see that something is happening, do

rucio list-rules --account <username>
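
Each rule in the output has an ID; assuming a reasonably recent Rucio client, you can query the progress of a single rule with

rucio rule-info <rule_id>    # shows the rule state and how many file locks are still replicating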

To see the dataset itself and the location of the files, do

rucio list-file-replicas --rse UKI-SOUTHGRID-RALPP_LOCALGROUPDISK <scope>:<dataset>

This is important, as you can put the file locations in a file and then pass this file to your xAOD job. However, the output is not just the complete path to each file; to create your file list, you can strip the prefix using sed

sed 's/^.*SFN=//' inputlist > outputlist

You might find the macro preparelist.sh (attached to this page) useful. Just run preparelist.sh inputfile outputfile. It's very basic, but it works.
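
The attachment is only 66 bytes, so it is presumably just a wrapper around the sed command above; a minimal sketch of such a script, assuming that is all it does:

#!/bin/sh
# preparelist.sh inputfile outputfile
# strip everything up to and including "SFN=" from each line
sed 's/^.*SFN=//' "$1" > "$2"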

You can also check the files directly and access them via ROOT, e.g.

ls -l /pnfs/pp.rl.ac.uk/data/atlas/atlaslocalgroupdisk/rucio/mc14_13TeV/
root -l /pnfs/pp.rl.ac.uk/data/atlas/atlaslocalgroupdisk/rucio/mc14_13TeV/f2

The following is obsolete but still kept for the moment!

If you wish to copy ATLAS data that is already on the grid to RAL, you can download a small amount using dq2-get. If you want to run over a larger sample, you can request data replication via the Data Replication page. To upload a dataset from a local directory, use dq2-put:

dq2-put -L SITENAME -s SOURCEDIRECTORY DATASETNAME

The SITENAME is UKI-SOUTHGRID-RALPP_SCRATCHDISK. The DATASETNAME needs to follow the dataset naming rules.

-- AlastairDewhurst - 2010-01-29

META FILEATTACHMENT attachment="preparelist.sh" attr="" comment="macro to get file list" date="1435834645" name="preparelist.sh" path="preparelist.sh" size="66" user="MonikaWielers" version="1"
