Replication Workflows

LIGO and Virgo data are replicated to Tier-1 and Tier-2 facilities, which are managed by LIGO or Virgo themselves, and to smaller Tier-3 and Tier-4 facilities managed by partner institutions (tier definitions are available in LIGO-M0900325). Individual sites subscribe to specific datasets or collections of datasets, as defined above. Replication policy is generally a function of file type:

  • Raw Frames: each detector site (LHO, LLO) stores its own raw frames on tape. Each site’s raw frames are also replicated to tape at CIT, yielding 2 tape replicas of each detector’s raw frames.

  • HOFT, RDS & SFTs: each detector site (LHO and LLO) hosts tape replicas of both its own data and the other site’s data, and both detectors’ data is also replicated to tape at CIT, yielding 3 tape replicas of all data. Various other Tier-N sites throughout the LDG “subscribe” to other datasets of interest.

In summary, each detector’s raw frames are kept on tape at its own site and at CIT, and HOFT, RDS and SFT data are kept on tape at both detector sites and at CIT. Most other Tier-N sites subscribe to subsets of the data, often opting for more recent sets of HOFT.
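The tape policy above can be written down as a small lookup. This is only an illustrative sketch: the function name tape_sites and the file-type labels "RAW", "HOFT", "RDS" and "SFT" are hypothetical and are not taken from any LIGO tool.

```python
DETECTOR_SITES = ("LHO", "LLO")
ARCHIVE_SITE = "CIT"


def tape_sites(file_type, origin):
    """Return the sites that hold a tape replica of `origin`'s data.

    `file_type` labels are illustrative, not an official vocabulary.
    """
    if origin not in DETECTOR_SITES:
        raise ValueError(f"unknown detector site: {origin}")
    if file_type == "RAW":
        # Raw frames: the originating site plus CIT (2 replicas).
        return [origin, ARCHIVE_SITE]
    if file_type in ("HOFT", "RDS", "SFT"):
        # Both detector sites plus CIT (3 replicas).
        return [*DETECTOR_SITES, ARCHIVE_SITE]
    raise ValueError(f"unknown file type: {file_type}")
```

For example, tape_sites("RAW", "LLO") gives the two sites holding LLO raw frames, while any of the other three file types yields all three sites.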

For a rolling buffer of data, the rucio frame replication workflow follows this procedure:

  1. Create a new dataset in rucio (e.g., O3:H-H1_HOFT_C01)

  2. Add a rucio replication rule to replicate O3:H-H1_HOFT_C01 to the desired Rucio Storage Element (RSE) (e.g., LIGO-CIT)

  3. Begin frame production. The remaining steps then repeat as new frames are produced:

  4. Files are written to disk at the instrument site (e.g., LHO -> /archive/frames)

  5. LDAS diskcache daemon updates the diskcache file

  6. gwrucio_registrar reads the diskcache file, registers new files, and attaches them to the desired dataset.

  7. The rucio replication daemons contact an FTS server to initiate third-party transfers that replicate the dataset to the desired RSEs.
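Steps 1, 2 and 6 above can be sketched with the Rucio Python client. This is a minimal sketch, not the gwrucio_registrar implementation. The helper names prepare_dataset and attach_new_files are hypothetical. The client argument is assumed to be a rucio.client.Client, or any object exposing the same add_dataset, add_replication_rule and attach_dids methods. Replica registration (file size, checksum, source RSE) and diskcache parsing are left out.

```python
def prepare_dataset(client, scope, name, rse_expression, copies=1):
    """Steps 1-2: create the dataset, then add a replication rule for it.

    `client` is assumed to behave like rucio.client.Client.
    """
    client.add_dataset(scope=scope, name=name)
    return client.add_replication_rule(
        dids=[{"scope": scope, "name": name}],
        copies=copies,
        rse_expression=rse_expression,
    )


def attach_new_files(client, scope, dataset, cached_names, already_registered):
    """Step 6 (simplified): attach files seen in the cache but not yet registered.

    Assumes the file DIDs already exist in Rucio; registering the
    replicas themselves is outside this sketch.
    """
    new_names = sorted(set(cached_names) - set(already_registered))
    if new_names:
        client.attach_dids(
            scope=scope,
            name=dataset,
            dids=[{"scope": scope, "name": n} for n in new_names],
        )
    return new_names
```

With a configured Rucio environment, prepare_dataset(Client(), "O3", "H-H1_HOFT_C01", "LIGO-CIT") would cover steps 1 and 2 for the example dataset. The transfers in step 7 are driven by the rucio daemons and FTS, not by this code.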

Data can also be registered statically from the command line or from a list in a text file. See the example-workflows for specific examples.
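As a rough illustration of the text-file case, the sketch below turns a list of frame paths into DID dictionaries. It is not the registration tool's own parser. The function name dids_from_lines is hypothetical, skipping blank lines and "#" comments is an assumption about the list format, and file names are assumed to follow the conventional frame pattern observatory-description-GPSstart-duration.gwf.

```python
import os
import re

# Assumed frame-file naming: <observatory>-<description>-<GPS start>-<duration>.gwf
FRAME_NAME = re.compile(
    r"^(?P<obs>[A-Z]+)-(?P<desc>[A-Za-z0-9_]+)-(?P<gps>\d+)-(?P<dur>\d+)\.gwf$"
)


def dids_from_lines(lines, scope):
    """Build one DID dictionary per frame path listed in `lines`."""
    dids = []
    for line in lines:
        path = line.strip()
        if not path or path.startswith("#"):
            continue  # assumed convention: ignore blanks and comments
        name = os.path.basename(path)
        match = FRAME_NAME.match(name)
        if match is None:
            raise ValueError(f"not a recognised frame file name: {name}")
        dids.append(
            {
                "scope": scope,
                "name": name,
                # Dataset name derived from the file name prefix.
                "dataset": f"{match['obs']}-{match['desc']}",
                "gps_start": int(match["gps"]),
                "duration": int(match["dur"]),
            }
        )
    return dids
```

Opening the list with open(path) and passing the file object as lines is enough, since the function only iterates over it line by line.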

See also: rucio docs for general replication workflows in rucio.