API documentation
dcc.env module
Environment settings.
dcc.exceptions module
DCC exceptions.
- exception dcc.exceptions.FileSkippedException(dcc_file, msg=None)
Bases: Exception
Exception for when a file to be downloaded is skipped.
- exception dcc.exceptions.NoVersionError(*args, **kwargs)
Bases: Exception
Exception for when a DCC number has no version specified.
- exception dcc.exceptions.NotLoggedInError(*args, **kwargs)
Bases: Exception
Error due to the user not being logged in.
- exception dcc.exceptions.TooLargeFileSkippedException(dcc_file, size, allowed)
Bases: dcc.exceptions.FileSkippedException
Exception for when a file to be downloaded is too large.
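Because TooLargeFileSkippedException derives from FileSkippedException, a handler for the base class also catches the size-related case. The sketch below illustrates this with locally defined stand-in classes that mirror the documented names and constructor signatures; the stored attributes, the message wording, and the describe_skip helper are assumptions made for illustration, not the library's implementation.

```python
class FileSkippedException(Exception):
    """Stand-in mirroring the documented signature (dcc_file, msg=None)."""

    def __init__(self, dcc_file, msg=None):
        # Assumed default message; the real library may word this differently.
        if msg is None:
            msg = f"{dcc_file} was skipped"
        super().__init__(msg)
        self.dcc_file = dcc_file


class TooLargeFileSkippedException(FileSkippedException):
    """Stand-in mirroring the documented signature (dcc_file, size, allowed)."""

    def __init__(self, dcc_file, size, allowed):
        super().__init__(
            dcc_file,
            msg=f"{dcc_file} is too large ({size} > {allowed})",
        )
        self.size = size
        self.allowed = allowed


def describe_skip(exc):
    """Return a short label for a skipped-file exception (hypothetical helper)."""
    if isinstance(exc, TooLargeFileSkippedException):
        return "too large"
    return "skipped"


caught = []
for error in (
    FileSkippedException("notes.pdf"),
    TooLargeFileSkippedException("data.tar", size=2048, allowed=1024),
):
    try:
        raise error
    except FileSkippedException as exc:  # Catches the subclass as well.
        caught.append(describe_skip(exc))
```

Catching the base class is convenient when every skipped file should be handled the same way; catch the subclass first when oversized files need separate treatment.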
dcc.parsers module
Record parsing.
- class dcc.parsers.DCCParser(content)
Bases: object
A parser for DCC documents.
- Parameters
- content : str
The response body.
- dcc_numbers()
Potential DCC numbers contained within the text of the document.
- Returns
- set
Potential DCC numbers.
An HTML navigator for the document content.
- Returns
- bs4.BeautifulSoup
The HTML navigator.
- class dcc.parsers.DCCXMLRecordParser(content)
Bases: dcc.parsers.DCCParser
A parser for DCC XML record documents.
- property abstract
- property attached_files
- property authors
- property dcc_number_pieces
- property docid
- property journal_reference
- property keywords
- property note
- property other_version_numbers
- property publication_info
- property referencing_ids
- property revision_dates
- property title
- class dcc.parsers.DCCXMLUpdateParser(content)
Bases: dcc.parsers.DCCParser
A parser for DCC XMLUpdate responses.
dcc.records module
Record objects.
- class dcc.records.DCCArchive(archive_dir)
Bases: object
A local collection of DCC documents.
This acts as an offline store of previously downloaded DCC documents.
- Parameters
- archive_dir : str or pathlib.Path
The archive directory on the local file system in which to store retrieved records and files.
- archive_revision_metadata(record, *, overwrite=False)
Serialise revision metadata in the local archive.
- Parameters
- record : DCCRecord
The record to archive.
- overwrite : bool, optional
If True, overwrite any existing revision in the local archive; otherwise do nothing. Defaults to False.
- document_dir(dcc_number)
The directory in the local archive of the document corresponding to the specified DCC number.
This directory contains subdirectories corresponding to revisions (versions) of the document, and may not yet exist.
- Parameters
- dcc_number : DCCNumber
The DCC number. If a version is specified, it is ignored.
- Returns
- pathlib.Path
The directory in the local archive corresponding to the document.
- property documents
The documents in the local archive.
These are DCC numbers corresponding to the documents in the local archive, without version suffixes.
- Yields
- DCCNumber
A DCC number in the local archive.
- fetch_record(dcc_number, *, ignore_version=False, overwrite=False, fetch_files=False, ignore_too_large=False, session)
Fetch a DCC record, either from the local archive or from the remote DCC host, adding it to the local archive if necessary.
- Parameters
- dcc_number : DCCNumber or str
The DCC record to fetch.
- ignore_version : bool, optional
Whether to ignore the version in dcc_number when determining if the document exists in the archive already. Defaults to False.
- overwrite : bool, optional
Whether to overwrite existing records and files in the archive with those fetched remotely. Defaults to False.
- fetch_files : bool, optional
Whether to also fetch the files attached to the record. Defaults to False.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
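The archive-first behaviour described for fetch_record can be pictured as a cache lookup with a remote fallback. The sketch below is not the library's implementation: fetch_cached, the dict standing in for the local archive, and the fake_remote callable standing in for the DCC host are all hypothetical names used for illustration.

```python
def fetch_cached(key, archive, fetch_remote, *, overwrite=False):
    """Return archive[key], fetching and storing it first when needed.

    `archive` is a plain dict standing in for the local archive and
    `fetch_remote` is a hypothetical callable standing in for the DCC host.
    """
    if overwrite or key not in archive:
        # Either nothing is stored yet or the caller asked for a refresh.
        archive[key] = fetch_remote(key)
    return archive[key]


remote_calls = []


def fake_remote(key):
    """Pretend remote host that records each request it receives."""
    remote_calls.append(key)
    return f"record for {key} (fetch #{len(remote_calls)})"


archive = {}
first = fetch_cached("T1234567-v4", archive, fake_remote)
second = fetch_cached("T1234567-v4", archive, fake_remote)  # Served locally.
third = fetch_cached("T1234567-v4", archive, fake_remote, overwrite=True)
```

Only the first and third calls reach the stand-in remote host; the second is answered from the dict, which is the role the overwrite flag plays in the description above.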
- fetch_record_file(record, number, *, ignore_too_large=False, overwrite=False, session)
Fetch the file at position number in the specified DCC record. If the file does not exist in the local archive, it is fetched from the DCC and archived.
- Parameters
- record : DCCRecord
The record to fetch files for.
- number : int
The file number to fetch, as listed in the record metadata, starting from position 1.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- overwrite : bool, optional
Whether to overwrite existing local files with those fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
- DCCFile
The fetched file.
- fetch_record_files(record, *, ignore_too_large=False, overwrite=False, session)
Fetch the files in the specified DCC record. If any file does not exist in the local archive, it is fetched from the DCC and archived.
- Parameters
- record : DCCRecord
The record to fetch files for.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- overwrite : bool, optional
Whether to overwrite existing local files with those fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
- list
The fetched files.
- latest_revision(dcc_number)
The latest revision in the local archive of the document corresponding to the specified DCC number.
- Parameters
- dcc_number : DCCNumber or str
The DCC number. If a version is specified, it is ignored.
- Returns
- DCCRecord
The latest revision in the local archive of dcc_number.
- Raises
- FileNotFoundError
If no revisions of dcc_number exist in the local archive.
- property latest_revisions
Latest revisions of the documents in the local archive.
- Yields
- DCCRecord
The latest revision of a document in the archive.
- property records
Records in the local archive, including revisions.
- Yields
- DCCRecord
A record in the archive.
- revision_dir(dcc_number)
The directory in the local archive of the revision corresponding to the specified versioned DCC number.
This directory is used to store data for a particular version of a DCC record, and may not yet exist.
- Parameters
- dcc_number : DCCNumber
The DCC number. Must contain a version.
- Returns
- pathlib.Path
The directory in the local archive corresponding to the document revision.
- Raises
- NoVersionError
If dcc_number does not contain a version.
- revision_meta_path(dcc_number)
The path to the meta file in the local archive of the revision corresponding to the specified DCC number.
The meta file may not yet exist.
- Parameters
- dcc_number : DCCNumber
The DCC number. Must contain a version.
- Returns
- pathlib.Path
The path to the meta file in the local archive corresponding to the document revision.
- Raises
- NoVersionError
If dcc_number does not contain a version.
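The relationship between document_dir, revision_dir and latest_revision can be sketched with plain pathlib. This is an illustration under stated assumptions, not the library's code: the on-disk layout (one directory per document, with "v<N>" subdirectories per revision) is assumed rather than documented, NoVersionError is a local stand-in, and latest_revision_dir is a hypothetical helper that returns a directory rather than a DCCRecord.

```python
import tempfile
from pathlib import Path


class NoVersionError(Exception):
    """Stand-in for dcc.exceptions.NoVersionError."""


def document_dir(archive_dir, number):
    # Assumed layout: one directory per document, named after the
    # unversioned DCC number.
    return Path(archive_dir) / number


def revision_dir(archive_dir, number, version=None):
    # Assumed layout: one "v<N>" subdirectory per revision.
    if version is None:
        raise NoVersionError(f"{number} has no version specified")
    return document_dir(archive_dir, number) / f"v{version}"


def latest_revision_dir(archive_dir, number):
    """Return the highest-numbered revision directory that exists."""
    doc_dir = document_dir(archive_dir, number)
    versions = []
    if doc_dir.is_dir():
        for child in doc_dir.iterdir():
            name = child.name
            if child.is_dir() and name.startswith("v") and name[1:].isdigit():
                versions.append(int(name[1:]))
    if not versions:
        raise FileNotFoundError(f"no revisions of {number} in {archive_dir}")
    return doc_dir / f"v{max(versions)}"


with tempfile.TemporaryDirectory() as tmp:
    for version in (1, 2, 10):
        revision_dir(tmp, "T1234567", version).mkdir(parents=True)
    # Versions compare numerically, so v10 ranks above v2.
    latest_name = latest_revision_dir(tmp, "T1234567").name
```

Comparing versions as integers rather than as strings matters once a document passes nine revisions, since "v10" sorts before "v2" lexicographically.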
- class dcc.records.DCCAuthor(name: str, uid: Optional[int] = None)
Bases: object
A DCC author.
- class dcc.records.DCCFile(title: str, filename: str, url: str)
Bases: object
A DCC file.
- discover(directory)
Update local file path if the local file exists in directory.
- Parameters
- directory : str or pathlib.Path
The directory to search.
- exists()
Whether the file exists at the local path.
- Returns
- bool
True if the file exists at the local path, False otherwise.
- fetch(directory, *, overwrite=False, session)
Fetch the remote file and store in the local archive.
- Parameters
- directory : str or pathlib.Path
The directory to use to store the file.
- overwrite : bool, optional
Whether to overwrite any existing file in the archive with that fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- local_path: pathlib.Path = None
- write(path)
Write file to the file system.
- Parameters
- path : str, pathlib.Path, or file-like
The path or file object to write to. If an open file object is given, it will be written to and left open. If a path string is given, it will be opened, written to, then closed.
- class dcc.records.DCCJournalRef(journal: str, volume: int, page: str, citation: str, url: Optional[str] = None)
Bases: object
A DCC record journal reference.
- class dcc.records.DCCNumber(category, numeric=None, version=None)
Bases: object
A DCC number including category and numeric identifier.
You must either provide a string containing the DCC number, or the separate category and numeric parts, with optional version, e.g.:
>>> from dcc.records import DCCNumber
>>> DCCNumber("T1234567")
DCCNumber(category='T', numeric='1234567', version=None)
>>> DCCNumber("T", "1234567")
DCCNumber(category='T', numeric='1234567', version=None)
>>> DCCNumber("T", "1234567", 4)
DCCNumber(category='T', numeric='1234567', version=4)
- Parameters
- category, numeric, version : str, optional
The parts that make up the DCC number.
- document_type_letters = {'A': 'Acquisitions', 'C': 'Contractual or procurement', 'D': 'Drawings', 'E': 'Engineering documents', 'F': 'Forms and Templates', 'G': 'Presentations (eg Graphics)', 'L': 'Letters and Memos', 'M': 'Management or Policy', 'P': 'Publications', 'Q': 'Quality Assurance documents', 'R': 'Operations Change Requests', 'S': 'Serial numbers', 'T': 'Techical notes', 'X': 'Safety Incident Reports'}
- format(version=True)
String representation of the DCC number, with optional version number.
- Parameters
- version : bool, optional
Include the version in the string. Defaults to True.
- Returns
- str
The string representation.
- property version_suffix
The string version suffix for the version number.
- Returns
- str
The version suffix to the DCC numeral, e.g. “-v2”.
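To make the interplay of format and version_suffix concrete, here is a minimal sketch of a number type that parses and formats strings such as "T1234567-v4". It is a hypothetical stand-in, not DCCNumber itself: the class name SketchNumber, the parse classmethod, the regular expression, and the choice of an empty suffix when no version is set are all assumptions; only the "-v2" suffix shape and the category/numeric/version split come from the documentation above.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed pattern: one category letter, digits, optional "-v<version>".
_PATTERN = re.compile(r"^([A-Za-z])(\d+)(?:-v(\d+))?$")


@dataclass(frozen=True)
class SketchNumber:
    """Hypothetical stand-in for a DCC number."""

    category: str
    numeric: str
    version: Optional[int] = None

    @classmethod
    def parse(cls, text):
        """Build a number from a string such as "T1234567" or "T1234567-v4"."""
        match = _PATTERN.match(text.strip())
        if match is None:
            raise ValueError(f"not a recognised DCC number: {text!r}")
        category, numeric, version = match.groups()
        return cls(
            category.upper(),
            numeric,
            int(version) if version is not None else None,
        )

    @property
    def version_suffix(self):
        # Assumption: an unversioned number has an empty suffix.
        return "" if self.version is None else f"-v{self.version}"

    def format(self, version=True):
        """Return the string form, optionally without the version suffix."""
        suffix = self.version_suffix if version else ""
        return f"{self.category}{self.numeric}{suffix}"


number = SketchNumber.parse("T1234567-v4")
with_version = number.format()
without_version = number.format(version=False)
```

Keeping the suffix in one property means the versioned and unversioned string forms cannot drift apart.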
- class dcc.records.DCCRecord(dcc_number: dcc.records.DCCNumber, title: Optional[str] = None, authors: Optional[List[dcc.records.DCCAuthor]] = None, abstract: Optional[str] = None, keywords: Optional[List[str]] = None, note: Optional[str] = None, publication_info: Optional[str] = None, journal_reference: Optional[dcc.records.DCCJournalRef] = None, other_versions: Optional[List[int]] = None, creation_date: Optional[datetime.datetime] = None, contents_revision_date: Optional[datetime.datetime] = None, metadata_revision_date: Optional[datetime.datetime] = None, files: Optional[List[dcc.records.DCCFile]] = None, referenced_by: Optional[List[dcc.records.DCCNumber]] = None, related_to: Optional[List[dcc.records.DCCNumber]] = None)
Bases: object
A DCC record.
- property author_names
The names of the authors associated with this record.
- Returns
- list
The author names.
- authors: List[dcc.records.DCCAuthor] = None
- contents_revision_date: datetime.datetime = None
- creation_date: datetime.datetime = None
- dcc_number: dcc.records.DCCNumber
- discover_files(directory)
Discover existing files in directory corresponding to this record.
- Parameters
- directory : str or pathlib.Path
The directory to search.
- classmethod fetch(dcc_number, *, session)
Fetch record from the remote DCC host.
- Parameters
- dcc_number : DCCNumber or str
The DCC record to fetch.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
- DCCRecord
The fetched record.
- fetch_file(number, directory, *, ignore_too_large=False, overwrite=False, session)
Fetch file attached to this record.
- Parameters
- number : int
The file number to fetch.
- directory : str or pathlib.Path
The directory in which to store the fetched file.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- overwrite : bool, optional
Whether to overwrite the existing local file with that fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
- DCCFile
The fetched file.
- fetch_files(directory, *, ignore_too_large=False, overwrite=False, session)
Fetch files attached to this record.
- Parameters
- directory : str or pathlib.Path
The directory in which to store the fetched files.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- overwrite : bool, optional
Whether to overwrite existing local files with those fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
- list
The fetched files.
- files: List[dcc.records.DCCFile] = None
- is_latest_version()
Check if the current record is the latest version.
Note: this only checks that the current record instance represents the latest known local record. The remote record is not fetched.
- Returns
- bool
True if the current version is the latest; False otherwise.
- journal_reference: dcc.records.DCCJournalRef = None
- property latest_version_number
The latest version number for this record.
- Returns
- int
The latest version number.
- metadata_revision_date: datetime.datetime = None
- classmethod read(path)
Read record from the file system.
- Parameters
- path : str or pathlib.Path
The path for the record’s meta file.
- Returns
- DCCRecord
The record.
- refenced_by_titles()
The titles of the records referencing this record.
- Returns
- list
The titles.
- referenced_by: List[dcc.records.DCCNumber] = None
The titles of the records related to this record.
- Returns
- list
The titles.
- update(*, session)
Update the remote record metadata.
- Parameters
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- write(path)
Write record to the file system.
- Parameters
- path : str, pathlib.Path, or file-like
The path or file object to write to. If an open file object is given, it will be written to and left open. If a path string is given, it will be opened, written to, then closed.
dcc.sessions module
Communication with the DCC.
- class dcc.sessions.DCCAuthenticatedSession(host, *, stream_hook=None, **kwargs)
Bases: dcc.sessions.DCCSession, ciecplib.sessions.Session
A SAML/ECP-authenticated DCC HTTP fetcher.
- Parameters
- host : str
The DCC host to use.
- idp : str
The identity provider host to use.
- Other Parameters
- stream_hook : callable, optional
Function taking a response type and a requests.Response object from a GET or POST request, yielding its body content. This can be used to implement download progress bars, interactive skipping of downloads, etc.
- class dcc.sessions.DCCSession(host, *, stream_hook=None, **kwargs)
Bases: object
A DCC HTTP fetcher.
- Parameters
- host : str
The DCC host to use.
- stream_hook : callable, optional
Function taking a stream type, the item being streamed, and a requests.Response object from a streamed GET or POST request, yielding its body content. This can be used to implement download progress bars, interactive skipping of downloads, etc.
- STREAM_FILE = 1
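A stream hook of the shape described above (stream type, streamed item, response; yielding body content) might look like the following sketch. It is illustrative only: FakeResponse is a stand-in exposing just the headers and iter_content parts of requests.Response so the example runs without a network, and make_progress_hook, the report callback, and the chunk size are hypothetical. How the library actually invokes the hook is not shown in this documentation.

```python
STREAM_FILE = 1  # Mirrors the documented DCCSession.STREAM_FILE value.


class FakeResponse:
    """Stand-in exposing the subset of requests.Response used here."""

    def __init__(self, body):
        self._body = body
        self.headers = {"Content-Length": str(len(body))}

    def iter_content(self, chunk_size=1):
        # Yield the body in fixed-size chunks, like a streamed response.
        for start in range(0, len(self._body), chunk_size):
            yield self._body[start:start + chunk_size]


def make_progress_hook(report, chunk_size=4):
    """Build a hook that reports progress while passing chunks through."""

    def hook(stream_type, item, response):
        total = int(response.headers.get("Content-Length", 0))
        received = 0
        for chunk in response.iter_content(chunk_size=chunk_size):
            received += len(chunk)
            if stream_type == STREAM_FILE:
                report(item, received, total)
            yield chunk

    return hook


progress = []
hook = make_progress_hook(
    lambda item, done, total: progress.append((done, total))
)
body = b"".join(hook(STREAM_FILE, "example.pdf", FakeResponse(b"0123456789")))
```

Because the hook is a generator that yields every chunk unchanged, it can observe a download without altering the bytes that are written.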
- abstract dcc_record_url(dcc_number, xml=True)
Build a DCC record URL given the specified DCC number.
- Parameters
- dcc_number : DCCNumber
The DCC record.
- xml : bool, optional
Whether to make the URL an XML request.
- Returns
- str
The URL.
- fetch_record_page(dcc_number)
Fetch a DCC record page.
- Parameters
- dcc_number : DCCNumber
The DCC record.
- Returns
- requests.Response
The HTTP response.
- protocol = 'https'
- update_record_metadata(dcc_record)
Update metadata for the DCC record specified by the provided number.
The version (if any) of the provided DCC number is ignored. Only the latest version of the record is updated.
- Parameters
- dcc_number : DCCNumber
The DCC record.
- Returns
- requests.Response
The HTTP response.
- class dcc.sessions.DCCUnauthenticatedSession(host, *, stream_hook=None, **kwargs)
Bases: dcc.sessions.DCCSession, requests.sessions.Session
An unauthenticated DCC HTTP fetcher.
- Parameters
- host : str
The DCC host to use.
- Other Parameters
- stream_hook : callable, optional
Function taking a response type and a requests.Response object from a GET or POST request, yielding its body content. This can be used to implement download progress bars, interactive skipping of downloads, etc.
- dcc.sessions.default_session(authenticated=False)
Create a DCC session using the default host and identity provider.
- Parameters
- authenticated : bool, optional
Whether to make the session an authenticated one. Defaults to False.
- Returns
- DCCAuthenticatedSession
The default session.
dcc.util module
Utilities.
- dcc.util.opened_file(fobj, mode)
Get an open file regardless of whether a string or an already open file is passed.
- Parameters
- fobj : str, pathlib.Path, or file-like
The path or file object to ensure is open. If fobj is an already open file object, its mode is checked and it is otherwise returned as-is. If fobj is a string, it is opened with the specified mode and yielded, then closed once the wrapped context exits. Note that passed open file objects are not closed.
- mode : str
The mode to ensure fobj is opened with.
- Yields
- io.FileIO
The open file with the specified mode.
- Raises
- ValueError
If fobj is neither a string nor an open file, or if fobj is open but with a different mode.
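The path-or-file behaviour described for opened_file can be sketched with contextlib.contextmanager. This is a minimal re-implementation for illustration, not the library's source: the name opened_file_sketch is hypothetical, and the way an "open file" is detected (a mode attribute plus a false closed flag) is an assumption, so in-memory objects such as io.StringIO are rejected here.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def opened_file_sketch(fobj, mode):
    """Yield an open file for a path or an already open file object."""
    if isinstance(fobj, (str, Path)):
        # Paths are opened here, so they are also closed here.
        with open(fobj, mode) as handle:
            yield handle
    elif hasattr(fobj, "mode") and not getattr(fobj, "closed", True):
        if fobj.mode != mode:
            raise ValueError(
                f"file is open with mode {fobj.mode!r}, expected {mode!r}"
            )
        # The caller opened this file, so the caller closes it.
        yield fobj
    else:
        raise ValueError("fobj must be a path or an open file object")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.txt"

    with opened_file_sketch(target, "w") as handle:
        handle.write("first")
    path_handle_closed = handle.closed

    with open(target, "r") as already_open:
        with opened_file_sketch(already_open, "r") as handle:
            contents = handle.read()
        caller_handle_left_open = not already_open.closed
```

The ownership rule is the design point: whoever opens the file closes it, which lets methods such as write(path) accept either a path or a caller-managed handle.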
- dcc.util.remove_none(container)
Remove None values from the specified container.
Adapted from https://stackoverflow.com/a/20558778/2251982.
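One plausible reading of "remove None values from the specified container" is a recursive filter over nested dicts and sequences. The sketch below takes that reading as an assumption: remove_none_sketch is a hypothetical name, and whether the library's function recurses, returns a copy, or handles the same container types is not stated in this documentation.

```python
def remove_none_sketch(container):
    """Return a copy of container with None values dropped at every level."""
    if isinstance(container, dict):
        return {
            key: remove_none_sketch(value)
            for key, value in container.items()
            if value is not None
        }
    if isinstance(container, (list, tuple, set)):
        # Rebuild the same container type from the filtered items.
        return type(container)(
            remove_none_sketch(value)
            for value in container
            if value is not None
        )
    # Scalars and other objects are returned unchanged.
    return container


cleaned = remove_none_sketch(
    {"title": "Example", "note": None, "files": [1, None, {"url": None, "n": 2}]}
)
```

A filter of this kind is useful before serialising record metadata, so that unset optional fields are omitted rather than written out as nulls.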