API documentation

dcc.env module

Environment settings.

dcc.exceptions module

DCC exceptions.

exception dcc.exceptions.FileSkippedException(dcc_file, msg=None)[source]

Bases: Exception

Exception for when a file to be downloaded is skipped.

exception dcc.exceptions.NoVersionError(*args, **kwargs)[source]

Bases: Exception

Exception for when a DCC number does not have a version specified.

exception dcc.exceptions.NotLoggedInError(*args, **kwargs)[source]

Bases: Exception

Error due to user not being logged in.

exception dcc.exceptions.TooLargeFileSkippedException(dcc_file, size, allowed)[source]

Bases: dcc.exceptions.FileSkippedException

Exception for when a file to be downloaded is too large.

exception dcc.exceptions.UnauthorisedError(*args, **kwargs)[source]

Bases: Exception

Error for when the user is not permitted to view a document.

exception dcc.exceptions.UnrecognisedDCCRecordError(*args, **kwargs)[source]

Bases: Exception

Error for when a page is not recognised by the DCC server.
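
Because TooLargeFileSkippedException derives from FileSkippedException, a handler for the base class also catches the too-large case. The sketch below illustrates this with stand-in classes that only mirror the documented signatures. The message wording, the stored attributes and the try_download helper are assumptions made for illustration, not the library's implementation.

```python
class FileSkippedException(Exception):
    """Stand-in mirroring the documented signature (dcc_file, msg=None)."""

    def __init__(self, dcc_file, msg=None):
        # The default message wording is an assumption for this sketch.
        if msg is None:
            msg = f"skipped {dcc_file}"
        super().__init__(msg)
        self.dcc_file = dcc_file


class TooLargeFileSkippedException(FileSkippedException):
    """Stand-in mirroring the documented signature (dcc_file, size, allowed)."""

    def __init__(self, dcc_file, size, allowed):
        super().__init__(
            dcc_file, f"skipped {dcc_file}: {size} bytes exceeds {allowed} bytes"
        )
        self.size = size
        self.allowed = allowed


def describe_skip(exc):
    """Return a short label for a skip exception, most specific class first."""
    if isinstance(exc, TooLargeFileSkippedException):
        return "too large"
    if isinstance(exc, FileSkippedException):
        return "skipped"
    return "other"


def try_download(size, allowed):
    """Hypothetical download step that raises when the size limit is exceeded."""
    if size > allowed:
        raise TooLargeFileSkippedException("report.pdf", size, allowed)
    return "downloaded"


try:
    outcome = try_download(size=5_000_000, allowed=1_000_000)
except FileSkippedException as exc:  # also catches the too-large subclass
    outcome = describe_skip(exc)
```

Catching the base class is convenient when every skipped file should be handled the same way; catch the subclass first when the size limit needs its own handling.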

dcc.parsers module

Record parsing.

class dcc.parsers.DCCParser(content)[source]

Bases: object

A parser for DCC documents.

Parameters
contentstr

The response body.

dcc_numbers()[source]

Potential DCC numbers contained within the text of the document.

Returns
set

Potential DCC numbers.
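
As a rough illustration of what scanning text for potential DCC numbers could look like, the sketch below uses a regular expression. The pattern is an assumption for this example (a documented category letter, six or seven digits, and an optional "-vN" suffix), and find_dcc_numbers is a hypothetical helper; the parser's actual matching rules may differ.

```python
import re

# Assumed pattern: one category letter, 6-7 digits, optional "-v<version>".
DCC_PATTERN = re.compile(r"\b[ACDEFGLMPQRSTX]\d{6,7}(?:-v\d+)?\b")


def find_dcc_numbers(text):
    """Return the set of substrings in text that look like DCC numbers."""
    return {match.group(0) for match in DCC_PATTERN.finditer(text)}


found = find_dcc_numbers("See LIGO-T1234567-v2 and P150914 for details.")
```

Returning a set, as the documented method does, removes duplicates when the same number is mentioned several times in a document.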

html_navigator()[source]

An HTML navigator for the document content.

Returns
bs4.BeautifulSoup

The HTML navigator.

class dcc.parsers.DCCXMLRecordParser(content)[source]

Bases: dcc.parsers.DCCParser

A parser for DCC XML record documents.

property abstract
property attached_files
property authors
property dcc_number_pieces
property docid
property journal_reference
property keywords
property note
property other_version_numbers
property publication_info
property referencing_ids
property related_ids
property revision_dates
property title
class dcc.parsers.DCCXMLUpdateParser(content)[source]

Bases: dcc.parsers.DCCParser

A parser for DCC XMLUpdate responses.

dcc.records module

Record objects.

class dcc.records.DCCArchive(archive_dir)[source]

Bases: object

A local collection of DCC documents.

This acts as an offline store of previously downloaded DCC documents.

Parameters
archive_dirstr or pathlib.Path

The archive directory on the local file system to store retrieved records and files in.

archive_revision_metadata(record, *, overwrite=False)[source]

Serialise revision metadata in the local archive.

Parameters
recordDCCRecord

The record to archive.

overwritebool, optional

If True, overwrite any existing revision in the local archive; otherwise do nothing. Defaults to False.

document_dir(dcc_number)[source]

The directory in the local archive of the document corresponding to the specified DCC number.

This directory contains subdirectories corresponding to revisions (versions) of the document, and may not yet exist.

Parameters
dcc_numberDCCNumber

The DCC number. If a version is specified, it is ignored.

Returns
pathlib.Path

The directory in the local archive corresponding to the document.

property documents

The documents in the local archive.

These are DCC numbers corresponding to the documents in the local archive, without version suffixes.

Yields
DCCNumber

A DCC number in the local archive.

fetch_record(dcc_number, *, ignore_version=False, overwrite=False, fetch_files=False, ignore_too_large=False, session)[source]

Fetch a DCC record, either from the local archive or from the remote DCC host, adding it to the local archive if necessary.

Parameters
dcc_numberDCCNumber or str

The DCC record to fetch.

ignore_versionbool, optional

Whether to ignore the version in dcc_number when determining if the document exists in the archive already. Defaults to False.

overwritebool, optional

Whether to overwrite existing records and files in the archive with those fetched remotely. Defaults to False.

fetch_filesbool, optional

Whether to also fetch the files attached to the record. Defaults to False.

ignore_too_largebool, optional

If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.

sessionDCCSession, optional

The DCC session to use. Defaults to None, which triggers use of the default session settings.

fetch_record_file(record, number, *, ignore_too_large=False, overwrite=False, session)[source]

Fetch the file at position number in the specified DCC record. If the file does not exist in the local archive, it is fetched from the DCC and archived.

Parameters
recordDCCRecord

The record to fetch files for.

numberint

The file number to fetch, as listed in the record metadata, starting from position 1.

ignore_too_largebool, optional

If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.

overwritebool, optional

Whether to overwrite existing local files with those fetched remotely. Defaults to False.

sessionDCCSession, optional

The DCC session to use. Defaults to None, which triggers use of the default session settings.

Returns
DCCFile

The fetched file.

fetch_record_files(record, *, ignore_too_large=False, overwrite=False, session)[source]

Fetch the files in the specified DCC record. If any file does not exist in the local archive, it is fetched from the DCC and archived.

Parameters
recordDCCRecord

The record to fetch files for.

ignore_too_largebool, optional

If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.

overwritebool, optional

Whether to overwrite existing local files with those fetched remotely. Defaults to False.

sessionDCCSession, optional

The DCC session to use. Defaults to None, which triggers use of the default session settings.

Returns
list

The fetched files.

latest_revision(dcc_number)[source]

The latest revision in the local archive of the document corresponding to the specified DCC number.

Parameters
dcc_numberDCCNumber or str

The DCC number. If a version is specified, it is ignored.

Returns
DCCRecord

The latest revision in the local archive of dcc_number.

Raises
FileNotFoundError

If no revisions of dcc_number exist in the local archive.
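
One way to picture the lookup is as a scan over a document directory's revision subdirectories, picking the highest version. The sketch below does this under the assumption that revision directories end in "-v<version>"; that naming and the latest_revision_dir helper are hypothetical and are not taken from the library.

```python
import re
import tempfile
from pathlib import Path


def latest_revision_dir(document_dir):
    """Return the revision subdirectory with the highest version number."""
    versions = {}
    for child in Path(document_dir).iterdir():
        match = re.search(r"-v(\d+)$", child.name)
        if child.is_dir() and match:
            # Compare versions numerically so that v10 sorts after v2.
            versions[int(match.group(1))] = child
    if not versions:
        raise FileNotFoundError(f"no revisions found in {document_dir}")
    return versions[max(versions)]


with tempfile.TemporaryDirectory() as tmp:
    for name in ("T1234567-v1", "T1234567-v2", "T1234567-v10"):
        (Path(tmp) / name).mkdir()
    latest_name = latest_revision_dir(tmp).name
```

Parsing the version as an integer matters here: a plain string sort would rank "v2" above "v10".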

property latest_revisions

Latest revisions of the documents in the local archive.

Yields
DCCRecord

The latest revision of a document in the archive.

property records

Records in the local archive, including revisions.

Yields
DCCRecord

A record in the archive.

revision_dir(dcc_number)[source]

The directory in the local archive of the revision corresponding to the specified versioned DCC number.

This directory is used to store data for a particular version of a DCC record, and may not yet exist.

Parameters
dcc_numberDCCNumber

The DCC number. Must contain a version.

Returns
pathlib.Path

The directory in the local archive corresponding to the document revision.

Raises
NoVersionError

If dcc_number does not contain a version.
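
The relationship between document and revision directories can be sketched with pathlib. The directory names used here (the unversioned number for the document, the versioned number for the revision) are assumptions for illustration, and the functions and NoVersionError class are stand-ins rather than the archive's own code.

```python
from pathlib import Path


class NoVersionError(Exception):
    """Stand-in for the documented exception of the same name."""


def document_dir(archive_dir, category, numeric):
    """Directory for a document, named after the unversioned number (assumed)."""
    return Path(archive_dir) / f"{category}{numeric}"


def revision_dir(archive_dir, category, numeric, version=None):
    """Directory for one revision, nested inside the document directory."""
    if version is None:
        raise NoVersionError("a version is required to locate a revision")
    name = f"{category}{numeric}-v{version}"
    return document_dir(archive_dir, category, numeric) / name


doc_path = document_dir("archive", "T", "1234567")
rev_path = revision_dir("archive", "T", "1234567", 2)
```

Neither function touches the file system, which matches the documented note that the returned directory may not yet exist.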

revision_meta_path(dcc_number)[source]

The path to the meta file in the local archive of the revision corresponding to the specified DCC number.

The meta file may not yet exist.

Parameters
dcc_numberDCCNumber

The DCC number. Must contain a version.

Returns
pathlib.Path

The path to the meta file in the local archive corresponding to the document revision.

Raises
NoVersionError

If dcc_number does not contain a version.

revisions(dcc_number)[source]

All revisions in the local archive corresponding to the specified DCC number.

Parameters
dcc_numberDCCNumber or str

The DCC number. If a version is specified, it is ignored.

Returns
list

The records in the local archive corresponding to the revisions of dcc_number.

class dcc.records.DCCAuthor(name: str, uid: Optional[int] = None)[source]

Bases: object

A DCC author.

name: str
uid: int = None
class dcc.records.DCCFile(title: str, filename: str, url: str)[source]

Bases: object

A DCC file.

discover(directory)[source]

Update local file path if the local file exists in directory.

Parameters
directorystr or pathlib.Path

The directory to search.

exists()[source]

Whether the file exists at the local path.

Returns
bool

True if the file exists at the local path, False otherwise.

fetch(directory, *, overwrite=False, session)[source]

Fetch the remote file and store it in the local archive.

Parameters
directorystr or pathlib.Path

The directory to use to store the file.

overwritebool, optional

Whether to overwrite any existing file in the archive with that fetched remotely. Defaults to False.

sessionDCCSession, optional

The DCC session to use. Defaults to None, which triggers use of the default session settings.

filename: str
local_path: pathlib.Path = None
title: str
url: str
write(path)[source]

Write file to the file system.

Parameters
pathstr, pathlib.Path, or file-like

The path or file object to write to. If an open file object is given, it will be written to and left open. If a path string is given, it will be opened, written to, then closed.

class dcc.records.DCCJournalRef(journal: str, volume: int, page: str, citation: str, url: Optional[str] = None)[source]

Bases: object

A DCC record journal reference.

citation: str
journal: str
page: str
url: str = None
volume: int
class dcc.records.DCCNumber(category, numeric=None, version=None)[source]

Bases: object

A DCC number including category and numeric identifier.

You must either provide a string containing the DCC number, or the separate category and numeric parts, with optional version, e.g.:

>>> from dcc.records import DCCNumber
>>> DCCNumber("T1234567")
DCCNumber(category='T', numeric='1234567', version=None)
>>> DCCNumber("T", "1234567")
DCCNumber(category='T', numeric='1234567', version=None)
>>> DCCNumber("T", "1234567", 4)
DCCNumber(category='T', numeric='1234567', version=4)

Parameters
category, numeric, versionstr, optional

The parts that make up the DCC number.

category: str
document_type_letters = {'A': 'Acquisitions', 'C': 'Contractual or procurement', 'D': 'Drawings', 'E': 'Engineering documents', 'F': 'Forms and Templates', 'G': 'Presentations (eg Graphics)', 'L': 'Letters and Memos', 'M': 'Management or Policy', 'P': 'Publications', 'Q': 'Quality Assurance documents', 'R': 'Operations Change Requests', 'S': 'Serial numbers', 'T': 'Techical notes', 'X': 'Safety Incident Reports'}
format(version=True)[source]

String representation of the DCC number, with optional version number.

Parameters
versionbool, optional

Include the version in the string. Defaults to True.

Returns
str

The string representation.

numeric: str
version: int = None
property version_suffix

The string version suffix for the version number.

Returns
str

The version suffix to the DCC numeral, e.g. “-v2”.
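
The interplay between format() and version_suffix can be sketched with a small dataclass. NumberSketch is a hypothetical stand-in, not the library's DCCNumber, and returning an empty suffix when no version is set is an assumption of this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NumberSketch:
    """Hypothetical stand-in holding the documented DCCNumber fields."""

    category: str
    numeric: str
    version: Optional[int] = None

    @property
    def version_suffix(self):
        # Assumption: no version means no suffix at all.
        return "" if self.version is None else f"-v{self.version}"

    def format(self, version=True):
        """Join category and numeric parts, optionally with the suffix."""
        suffix = self.version_suffix if version else ""
        return f"{self.category}{self.numeric}{suffix}"


number = NumberSketch("T", "1234567", 2)
with_version = number.format()
without_version = number.format(version=False)
```

Passing version=False is useful when the string identifies a document as a whole, for example when naming a document directory.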

class dcc.records.DCCRecord(dcc_number: dcc.records.DCCNumber, title: Optional[str] = None, authors: Optional[List[dcc.records.DCCAuthor]] = None, abstract: Optional[str] = None, keywords: Optional[List[str]] = None, note: Optional[str] = None, publication_info: Optional[str] = None, journal_reference: Optional[dcc.records.DCCJournalRef] = None, other_versions: Optional[List[int]] = None, creation_date: Optional[datetime.datetime] = None, contents_revision_date: Optional[datetime.datetime] = None, metadata_revision_date: Optional[datetime.datetime] = None, files: Optional[List[dcc.records.DCCFile]] = None, referenced_by: Optional[List[dcc.records.DCCNumber]] = None, related_to: Optional[List[dcc.records.DCCNumber]] = None)[source]

Bases: object

A DCC record.

abstract: str = None
property author_names

The names of the authors associated with this record.

Returns
list

The author names.

authors: List[dcc.records.DCCAuthor] = None
contents_revision_date: datetime.datetime = None
creation_date: datetime.datetime = None
dcc_number: dcc.records.DCCNumber
discover_files(directory)[source]

Discover existing files in directory corresponding to this record.

Parameters
directorystr or pathlib.Path

The directory to search.

classmethod fetch(dcc_number, *, session)[source]

Fetch record from the remote DCC host.

Parameters
dcc_numberDCCNumber or str

The DCC record to fetch.

sessionDCCSession, optional

The DCC session to use. Defaults to None, which triggers use of the default session settings.

Returns
DCCRecord

The fetched record.

fetch_file(number, directory, *, ignore_too_large=False, overwrite=False, session)[source]

Fetch file attached to this record.

Parameters
numberint

The file number to fetch.

directorystr or pathlib.Path

The directory in which to store the fetched file.

ignore_too_largebool, optional

If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.

overwritebool, optional

Whether to overwrite the existing local file with that fetched remotely. Defaults to False.

sessionDCCSession, optional

The DCC session to use. Defaults to None, which triggers use of the default session settings.

Returns
DCCFile

The fetched file.

fetch_files(directory, *, ignore_too_large=False, overwrite=False, session)[source]

Fetch files attached to this record.

Parameters
directorystr or pathlib.Path

The directory in which to store the fetched files.

ignore_too_largebool, optional

If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.

overwritebool, optional

Whether to overwrite existing local files with those fetched remotely. Defaults to False.

sessionDCCSession, optional

The DCC session to use. Defaults to None, which triggers use of the default session settings.

Returns
list

The fetched files.

property filenames

The filenames associated with this record.

Returns
list

The filenames.

files: List[dcc.records.DCCFile] = None
is_latest_version()[source]

Check if the current record is the latest version.

Note: this only checks whether the current record instance represents the latest known local record. The remote record is not fetched.

Returns
bool

True if the current version is the latest; False otherwise.

journal_reference: dcc.records.DCCJournalRef = None
keywords: List[str] = None
property latest_version_number

The latest version number for this record.

Returns
int

The latest version number.

metadata_revision_date: datetime.datetime = None
note: str = None
other_versions: List[int] = None
publication_info: str = None
classmethod read(path)[source]

Read record from the file system.

Parameters
pathstr or pathlib.Path

The path for the record’s meta file.

Returns
DCCRecord

The record.

refenced_by_titles()[source]

The titles of the records referencing this record.

Returns
list

The titles.

referenced_by: List[dcc.records.DCCNumber] = None
related_titles()[source]

The titles of the records related to this record.

Returns
list

The titles.

related_to: List[dcc.records.DCCNumber] = None
title: str = None
update(*, session)[source]

Update the remote record metadata.

Parameters
sessionDCCSession, optional

The DCC session to use. Defaults to None, which triggers use of the default session settings.

property version_numbers

The versions associated with this record.

Returns
set

The versions.

write(path)[source]

Write record to the file system.

Parameters
pathstr, pathlib.Path, or file-like

The path or file object to write to. If an open file object is given, it will be written to and left open. If a path string is given, it will be opened, written to, then closed.

dcc.records.ensure_session(func)[source]

Ensure the session argument passed to the wrapped function is real, creating a temporary session if required.
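
A decorator of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: TemporarySession is a hypothetical stand-in for a real session, and treating session=None as "create a temporary one" follows the documented default of the session parameters.

```python
import functools


class TemporarySession:
    """Hypothetical stand-in for a session usable as a context manager."""

    closed = False

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.closed = True
        return False


def ensure_session_sketch(func):
    """Pass a real session to func, creating a temporary one if needed."""

    @functools.wraps(func)
    def wrapper(*args, session=None, **kwargs):
        if session is not None:
            return func(*args, session=session, **kwargs)
        # No session supplied: open one for the duration of this call only.
        with TemporarySession() as temporary:
            return func(*args, session=temporary, **kwargs)

    return wrapper


@ensure_session_sketch
def fetch_title(dcc_number, *, session):
    """Hypothetical wrapped function that reports which session it received."""
    return f"{dcc_number} via {type(session).__name__}"


result = fetch_title("T1234567")
```

Callers that make many requests would pass their own session so that it is reused, while one-off calls can omit it.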

dcc.sessions module

Communication with the DCC.

class dcc.sessions.DCCAuthenticatedSession(host, *, stream_hook=None, **kwargs)[source]

Bases: dcc.sessions.DCCSession, ciecplib.sessions.Session

A SAML/ECP-authenticated DCC HTTP fetcher.

Parameters
hoststr

The DCC host to use.

idpstr

The identity provider host to use.

Other Parameters
stream_hookcallable, optional

Function taking a response type and a requests.Response object from a GET or POST request, yielding its body content. This can be used to implement download progress bars, interactive skipping of downloads, etc.

dcc_record_url(dcc_number, xml=True)[source]

Build a DCC record URL given the specified DCC number.

Parameters
dcc_numberDCCNumber

The DCC record.

xmlbool, optional

Whether to make the URL an XML request.

Returns
str

The URL.

class dcc.sessions.DCCSession(host, *, stream_hook=None, **kwargs)[source]

Bases: object

A DCC HTTP fetcher.

Parameters
hoststr

The DCC host to use.

stream_hookcallable, optional

Function taking a stream type, the item being streamed, and a requests.Response object from a streamed GET or POST request, yielding its body content. This can be used to implement download progress bars, interactive skipping of downloads, etc.
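
A stream hook matching this description might look like the generator below, which records progress while passing chunks through. FakeResponse is a stand-in that offers only the iter_content method of requests.Response; the chunk size, the progress list and the "report.pdf" item are assumptions made for the example.

```python
STREAM_FILE = 1  # mirrors the documented DCCSession.STREAM_FILE value


class FakeResponse:
    """Stand-in exposing iter_content like requests.Response does."""

    def __init__(self, body):
        self._body = body

    def iter_content(self, chunk_size=1):
        for start in range(0, len(self._body), chunk_size):
            yield self._body[start:start + chunk_size]


progress = []


def progress_hook(stream_type, item, response):
    """Yield the response body chunk by chunk, recording bytes received."""
    received = 0
    for chunk in response.iter_content(chunk_size=4):
        received += len(chunk)
        if stream_type == STREAM_FILE:
            progress.append((item, received))
        yield chunk


response = FakeResponse(b"0123456789")
body = b"".join(progress_hook(STREAM_FILE, "report.pdf", response))
```

Because the hook is a generator, it could also stop iterating early to skip a download, which is the interactive use the description mentions.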

STREAM_FILE = 1
abstract dcc_record_url(dcc_number, xml=True)[source]

Build a DCC record URL given the specified DCC number.

Parameters
dcc_numberDCCNumber

The DCC record.

xmlbool, optional

Whether to make the URL an XML request.

Returns
str

The URL.

fetch_file_contents(dcc_file)[source]

Fetch the remote contents of the specified file.

Parameters
dcc_fileDCCFile

The DCC file to fetch.

Yields
bytes

The next chunk of the file.

fetch_record_page(dcc_number)[source]

Fetch a DCC record page.

Parameters
dcc_numberDCCNumber

The DCC record.

Returns
requests.Response

The HTTP response.

protocol = 'https'
update_record_metadata(dcc_record)[source]

Update metadata for the DCC record specified by the provided number.

The version (if any) of the provided DCC number is ignored. Only the latest version of the record is updated.

Parameters
dcc_recordDCCRecord

The DCC record.

Returns
requests.Response

The HTTP response.

class dcc.sessions.DCCUnauthenticatedSession(host, *, stream_hook=None, **kwargs)[source]

Bases: dcc.sessions.DCCSession, requests.sessions.Session

An unauthenticated DCC HTTP fetcher.

Parameters
hoststr

The DCC host to use.

Other Parameters
stream_hookcallable, optional

Function taking a response type and a requests.Response object from a GET or POST request, yielding its body content. This can be used to implement download progress bars, interactive skipping of downloads, etc.

dcc_record_url(dcc_number, xml=True)[source]

Build a DCC record URL given the specified DCC number.

Parameters
dcc_numberDCCNumber

The DCC record.

xmlbool, optional

Whether to make the URL an XML request.

Returns
str

The URL.

dcc.sessions.default_session(authenticated=False)[source]

Create a DCC session using the default host and identity provider.

Parameters
authenticatedbool, optional

Whether to make the session an authenticated one. Defaults to False.

Returns
DCCSession

The default session.

dcc.util module

Utilities.

dcc.util.change_exc_msg(exc, new_msg)[source]

Change exception message.

dcc.util.human_file_size(length)[source]

Convert length in bytes to a human file size.

Parameters
lengthint

The file size in bytes.

Returns
int

The scaled file size.

str

The unit, e.g. “B” (bytes) or “GB” (gigabytes).
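
The conversion can be sketched as repeated division until the value fits the unit. The function below is an illustrative stand-in rather than the library's implementation: the decimal base of 1000, the unit labels and the float return value are all assumptions.

```python
def human_file_size_sketch(length):
    """Scale a byte count to a (size, unit) pair using base 1000 (assumed)."""
    units = ("B", "kB", "MB", "GB", "TB")
    size = float(length)
    for unit in units[:-1]:
        if size < 1000:
            return size, unit
        size /= 1000
    # Anything larger than the second-to-last unit is reported in the last.
    return size, units[-1]


small = human_file_size_sketch(512)
medium = human_file_size_sketch(1_500_000)
```

Returning the number and unit separately, as the documented function does, leaves the choice of rounding and formatting to the caller.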

dcc.util.opened_file(fobj, mode)

Get an open file regardless of whether a string or an already open file is passed.

Parameters
fobjstr, pathlib.Path, or file-like

The path or file object to ensure is open. If fobj is an already open file object, its mode is checked and the object is returned as-is. If fobj is a path, it is opened with the specified mode and yielded, then closed once the wrapped context exits. Note that passed open file objects are not closed.

modestr

The mode to ensure fobj is opened with.

Yields
io.FileIO

The open file with the specified mode.

Raises
ValueError

If fobj is neither a string nor an open file, or if fobj is open but with a different mode.
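
The behaviour described above can be approximated with contextlib.contextmanager. The sketch below is a stand-in written for illustration, not the library's code, and the exact checks (an open object is detected through its mode and closed attributes) are assumptions.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def opened_file_sketch(fobj, mode):
    """Yield an open file for a path or an already open file object."""
    if isinstance(fobj, (str, Path)):
        # Paths are opened here and closed when the context exits.
        with open(fobj, mode) as handle:
            yield handle
    elif hasattr(fobj, "mode") and not getattr(fobj, "closed", True):
        if fobj.mode != mode:
            raise ValueError(f"file is open with mode {fobj.mode!r}, not {mode!r}")
        # Already open objects are yielded as-is and left open afterwards.
        yield fobj
    else:
        raise ValueError("expected a path or an open file object")


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "record.txt"
    with opened_file_sketch(path, "w") as handle:
        handle.write("abc")
    with open(path, "r") as already_open:
        with opened_file_sketch(already_open, "r") as handle:
            content = handle.read()
        still_open = not already_open.closed
```

Leaving caller-supplied file objects open keeps ownership clear: whoever opened the file is responsible for closing it.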

dcc.util.remove_none(container)[source]

Remove None values from the specified container.

Adapted from https://stackoverflow.com/a/20558778/2251982.
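
A recursive filter along these lines could be written as below. This is an illustrative stand-in only: whether the library's function recurses into nested containers, which container types it handles, and whether it returns a new object are all assumptions here.

```python
def remove_none_sketch(container):
    """Return a copy of container with None values removed, recursively."""
    if isinstance(container, dict):
        return {
            key: remove_none_sketch(value)
            for key, value in container.items()
            if value is not None
        }
    if isinstance(container, (list, tuple, set)):
        return type(container)(
            remove_none_sketch(value) for value in container if value is not None
        )
    # Scalars and other objects are returned unchanged.
    return container


cleaned = remove_none_sketch(
    {"title": "A note", "note": None, "authors": ["A. Author", None]}
)
```

Stripping None values in this way keeps serialised metadata compact, since unset optional fields are simply omitted.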