Apart from this guide, a good place to get help is from the tool itself:
$ dcc --help
Usage: dcc [OPTIONS] COMMAND [ARGS]...

  dcc 0.8.0

  Tools for viewing and updating records, metadata and files in the LIGO
  Document Control Center (DCC).

  Website: https://docs.ligo.org/sean-leavey/dcc/

  dcc comes with ABSOLUTELY NO WARRANTY. This is free software, and you are
  welcome to redistribute it under certain conditions. See the GNU General
  Public Licence for details.

  Copyright 2022 Sean Leavey, Jameson Graef Rollins, Christopher Wipf

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  archive    Archive remote DCC records locally.
  convert    Extract DCC numbers from a target file or URL.
  list       List records in the local archive.
  open       Open remote DCC record page in the default browser.
  open-file  Open file attached to DCC record using operating system.
  update     Update remote DCC record metadata.
  view       View DCC record metadata.
Help for the available dcc subcommands can be shown in the same way, e.g. with
dcc archive --help.
Obtaining a Kerberos ticket for accessing restricted resources
Access to most DCC records and files requires credentials such as those from ligo.org or another provider. You typically only get these if you’re a member of a scientific collaboration.
dcc assumes you can authenticate yourself and therefore builds and
requests URLs for records and files within the restricted part of the DCC, prompting for
credentials or using an existing Kerberos ticket. To avoid being prompted every time
dcc is invoked, run
kinit albert.einstein@LIGO.ORG (where albert.einstein is
your login and
LIGO.ORG is your Kerberos realm) before first use each day (tickets
are typically granted for 24 hours). Subsequent interaction with the DCC will
transparently use the Kerberos ticket if one is available. The ticket can be verified with
klist and revoked with kdestroy.
You can specify the --public flag to restrict dcc to
accessing only public records. With this flag, you don’t need to enter your credentials or
obtain a Kerberos ticket, but restricted records and files will not be available.
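When dcc is called from a script, it can help to check for a ticket up front rather than be prompted partway through. The following is a minimal sketch, not part of dcc: the helper name has_kerberos_ticket is hypothetical, and it assumes the klist on your PATH supports the silent -s flag (as documented for MIT Kerberos), which reports a valid credentials cache through its exit status.

```python
import shutil
import subprocess


def has_kerberos_ticket() -> bool:
    """Return True if `klist -s` reports a readable, unexpired credentials cache."""
    klist = shutil.which("klist")
    if klist is None:
        # Kerberos client tools are not on PATH, so no ticket can be checked.
        return False
    # Assumption: `klist -s` prints nothing and exits with status 0 only
    # when a valid ticket is present (MIT Kerberos behaviour).
    result = subprocess.run(
        [klist, "-s"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


if __name__ == "__main__":
    if has_kerberos_ticket():
        print("Kerberos ticket found.")
    else:
        print("No valid ticket; run kinit first or use --public.")
```

A wrapper script could call this before invoking dcc and fall back to asking the user to run kinit.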
Configuring a local archive
Any dcc command that involves downloading a remote record or file can cache the
results in a local archive. This allows for quick subsequent access to the same records,
by retrieving the local copy instead of connecting to the DCC. With a local archive
configured, retrieval of cached records and files is transparent: requests are made
to the DCC only if the records or files don’t yet exist in the local archive (or if the
remote version is explicitly requested).
Downloaded data is stored hierarchically in the archive directory, e.g.:
$ tree /path/to/archive
/path/to/archive
└── T010075
    └── T010075-v3
        ├── Change Record for T010075-v3.docx
        ├── Change Record for T010075-v3.pdf
        ├── meta.toml
        ├── T010075-v3 aLIGO System Description.pdf
        └── T010075-v3 System Description.zip
The meta.toml file contains the human-readable (TOML-formatted) metadata for the
record. This can also be read by dcc.
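Because the archive is a plain directory tree, it can also be inspected with ordinary scripts. The sketch below is an illustration rather than anything dcc provides: the function name list_archived_versions is hypothetical, and it assumes the layout shown above, that is, one directory per record number containing one directory per version, each holding a meta.toml.

```python
from pathlib import Path


def list_archived_versions(archive_dir):
    """Map each record number to the version directories that contain a meta.toml."""
    records = {}
    # Assumed layout: <archive>/<number>/<number>-v<N>/meta.toml
    for meta in Path(archive_dir).glob("*/*/meta.toml"):
        version_dir = meta.parent         # e.g. T010075-v3
        number = version_dir.parent.name  # e.g. T010075
        records.setdefault(number, []).append(version_dir.name)
    # Sort alphabetically for stable output (note: not numeric version order).
    return {number: sorted(versions) for number, versions in sorted(records.items())}


if __name__ == "__main__":
    for number, versions in list_archived_versions("/path/to/archive").items():
        print(number, versions)
```

Directories without a meta.toml are skipped, so partially downloaded or unrelated folders do not show up in the result.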
By default, dcc uses a temporary directory for downloads, which is removed
immediately before the program exits. To persist downloaded records and files between
runs, pass the -s
option (as in -s /path/to/archive) to any command that supports it, or set the DCC_ARCHIVE environment
variable. Whichever method you use, the value should be a path (relative or absolute) to
the archive directory.
The local archive built by
dcc is not guaranteed to remain consistent with the contents
of the remote DCC host. To ensure you have the latest version of a record or file,
pass the --force flag when requesting it.
DCC records can be archived locally using dcc archive. This downloads
records’ metadata, and optionally attached files, and stores them in the local
archive for later retrieval. The command requires one or more
NUMBER arguments and/or a
--from-file option followed by a path to a file containing the DCC numbers
(separated by whitespace) to archive. For example:
# Archive the latest version of T010075:
$ dcc archive -s /path/to/archive T010075

# Archive a specific version of T010075:
$ dcc archive -s /path/to/archive T010075-v1

# Archive multiple records:
$ dcc archive -s /path/to/archive T010075 E1300945

# Alternatively specify the path to a file containing the records to archive:
$ echo "T010075 E1300945" > to-archive.txt
$ dcc archive -s /path/to/archive --from-file to-archive.txt
Similar to the behaviour of standard Unix utilities, the
--from-file option can also be set to read from
stdin by specifying -:
$ echo "T010075 E1300945" | dcc archive -s /path/to/archive --from-file -
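A list file for --from-file can be prepared by any means that produces whitespace-separated DCC numbers. The help text above shows that dcc convert extracts DCC numbers from a file or URL; the sketch below is only a standalone illustration of the same idea for text you already have in memory. The regular expression is an assumption inferred from the numbers used in this guide (one capital letter, six or seven digits, optional -v<N> suffix), and the function names are hypothetical.

```python
import re

# Assumed pattern, inferred only from the examples in this guide
# (T010075, E1300945, T2200016, T010075-v1).
DCC_NUMBER = re.compile(r"\b[A-Z]\d{6,7}(?:-v\d+)?\b")


def extract_numbers(text):
    """Return the DCC-like numbers found in text, in order, without duplicates."""
    seen = []
    for match in DCC_NUMBER.finditer(text):
        if match.group(0) not in seen:
            seen.append(match.group(0))
    return seen


def format_number_list(numbers):
    """Join numbers with newlines, which count as whitespace separators."""
    return "\n".join(numbers) + "\n"


if __name__ == "__main__":
    notes = "See T010075 and E1300945; T010075-v1 is the first version."
    # The printed text can be piped to: dcc archive ... --from-file -
    print(format_number_list(extract_numbers(notes)), end="")
```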
Files are not archived automatically. To fetch them too, specify the
--files flag. By default, files of any size will be retrieved. To limit the
size of files retrieved, use the
--max-file-size option, which takes a maximum file size in MB.
Archival of referenced and referencing records
DCC records can contain “related to” and “referenced by” records, and dcc
archive can archive them as well. The
--depth option controls
how far along the chain from the original documents the archival can traverse. For example,
setting --depth to 1 will fetch the records that are listed in
the records with the specified DCC numbers, and setting it to 2 will additionally fetch the records listed in
those documents. The default is 0, meaning only the records specified in the input are
archived. When --depth is nonzero, by default only “related to” records
are fetched. To also fetch “referenced by” records, specify the
--fetch-referencing flag. The fetching of “related
to” and “referenced by” records can each be switched on and off using dedicated flags, which are listed by dcc archive --help.
The DCC is a highly connected graph, so setting a high
--depth is likely to lead to thousands of records being downloaded. A value of
1 or 2 is typically sufficient to archive almost every relevant related record.
For example, the referenced documents of
E1300945 can be archived alongside
E1300945 itself using:
# Fetch "related to" documents as well as E1300945 itself:
$ dcc archive -s /path/to/archive E1300945 --depth 1

# Fetch "referenced by" documents as well:
$ dcc archive -s /path/to/archive E1300945 --depth 1 --fetch-referencing
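The effect of --depth can be pictured as a breadth-first walk over the record graph that stops after a fixed number of hops. The sketch below illustrates that idea on made-up data; it does not reproduce how dcc implements the traversal. The record names A to E, the two link tables and the function records_to_archive are all hypothetical.

```python
from collections import deque

# Toy data: hypothetical record names and links, not real DCC content.
RELATED_TO = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
REFERENCED_BY = {"A": ["E"], "B": [], "C": [], "D": [], "E": []}


def records_to_archive(start, depth, fetch_referencing=False):
    """Collect the records reachable from the list `start` in at most `depth` hops."""
    seen = set(start)
    queue = deque((number, 0) for number in start)
    while queue:
        number, hops = queue.popleft()
        if hops == depth:
            continue  # depth limit reached; do not follow this record's links
        neighbours = list(RELATED_TO.get(number, []))
        if fetch_referencing:
            neighbours += REFERENCED_BY.get(number, [])
        for neighbour in neighbours:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return seen


if __name__ == "__main__":
    for depth in (0, 1, 2):
        print(depth, sorted(records_to_archive(["A"], depth)))
```

Even in this five-record example the set grows with each extra hop, which is why small depth values are usually enough on a densely linked graph.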
Updating record metadata
Options such as --author can be specified
multiple times to set multiple values. Author names should be as written, e.g. “Albert
Einstein”, and should correspond to real DCC users. For example:
# Update the title of T2200016.
$ dcc update T2200016 --title "A new title"
By default, dcc update will prompt for confirmation before sending the
updated record to the DCC. To make changes without any confirmation, specify the
--no-confirm flag. Submitted changes are irreversible, so use this flag with care.
The DCC does not appear to perform error checking on author names. If an author is not given correctly, it is simply discarded.
Changing the DCC or login host
By default, dcc interacts with the DCC host at https://dcc.ligo.org/, or with the host
given by the DCC_HOST environment variable if set. Some users may wish to change this to
something different, such as one of the backup servers (https://dcc-backup.ligo.org/,
https://dcc-lho.ligo.org/, https://dcc-llo.ligo.org/) or a DCC server for a different
project (e.g. https://dcc.cosmicexplorer.org/). This can be done by specifying a
different host using the
--host flag on commands that support it.
dcc does not distinguish between DCC hosts when archiving records and files
locally. To prevent mixing records from separate projects within the same hierarchy,
specify a different local archive setting for each project.
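One way to keep projects apart is to derive the archive directory from the host name, so that each DCC host gets its own subdirectory. This is a sketch of one possible convention, not something dcc does for you; the function name archive_dir_for_host and the base path /data/dcc are hypothetical.

```python
from pathlib import Path
from urllib.parse import urlparse


def archive_dir_for_host(base_dir, host_url):
    """Return a per-host subdirectory of base_dir, named after the host's domain."""
    hostname = urlparse(host_url).hostname
    if hostname is None:
        # Happens when the URL has no scheme, e.g. "dcc.ligo.org".
        raise ValueError(f"Cannot determine a hostname from {host_url!r}")
    return Path(base_dir) / hostname


if __name__ == "__main__":
    for host in ("https://dcc.ligo.org/", "https://dcc.cosmicexplorer.org/"):
        print(archive_dir_for_host("/data/dcc", host))
```

The resulting path can then be used as the local archive setting whenever the corresponding host is selected.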
It is also possible to change the identity provider (IDP) host used to authenticate
your login credentials. By default it is set to https://login.ligo.org/, or to the value of the
ECP_IDP environment variable if set, but it can be changed to the backup
(https://login2.ligo.org/) or to that of another project (see cilogon.org for a list of available IDP hosts) using
the --idp-host flag on commands that support it.