API documentation¶
dcc.env module¶
Environment settings.
dcc.exceptions module¶
DCC exceptions.
- exception dcc.exceptions.FileSkippedException(dcc_file, msg=None)[source]¶
Bases:
Exception
Exception for when a file to be downloaded is skipped.
- exception dcc.exceptions.NoVersionError(*args, **kwargs)[source]¶
Bases:
Exception
Exception for when a DCC number has no version specified.
- exception dcc.exceptions.NotLoggedInError(*args, **kwargs)[source]¶
Bases:
Exception
Error due to user not being logged in.
- exception dcc.exceptions.TooLargeFileSkippedException(dcc_file, size, allowed)[source]¶
Bases:
dcc.exceptions.FileSkippedException
Exception for when a file to be downloaded is too large.
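Because TooLargeFileSkippedException derives from FileSkippedException, a handler for the base class also catches the size-limit case. The following is a minimal sketch of that pattern using stand-in classes that copy only the documented signatures. The attribute names, the message text, and the download helper are assumptions for illustration, not the library's implementation.

```python
class FileSkippedException(Exception):
    """Stand-in mirroring the documented signature (dcc_file, msg=None)."""

    def __init__(self, dcc_file, msg=None):
        self.dcc_file = dcc_file
        super().__init__(msg or f"skipped {dcc_file}")


class TooLargeFileSkippedException(FileSkippedException):
    """Stand-in mirroring the documented signature (dcc_file, size, allowed)."""

    def __init__(self, dcc_file, size, allowed):
        self.size = size
        self.allowed = allowed
        super().__init__(dcc_file, msg=f"{dcc_file}: size {size} exceeds allowed {allowed}")


def download(name, size, allowed):
    """Hypothetical downloader that refuses files above the allowed size."""
    if size > allowed:
        raise TooLargeFileSkippedException(name, size, allowed)
    return name


# Catching the base class handles every kind of skipped file.
skipped = []
for name, size in [("small.pdf", 10), ("huge.zip", 10_000)]:
    try:
        download(name, size, allowed=1_000)
    except FileSkippedException as exc:
        skipped.append(exc.dcc_file)
```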
dcc.parsers module¶
Record parsing.
- class dcc.parsers.DCCParser(content)[source]¶
Bases:
object
A parser for DCC documents.
- Parameters
- content : str
The response body.
- dcc_numbers()[source]¶
Potential DCC numbers contained within the text of the document.
- Returns
set
Potential DCC numbers.
An HTML navigator for the document content.
- Returns
bs4.BeautifulSoup
The HTML navigator.
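To illustrate what "potential DCC numbers" in dcc_numbers() could mean, here is a minimal sketch that scans text with a regular expression and returns a set of matches. The pattern (a document type letter, six or seven digits, and an optional "-vN" suffix) is an assumption made for this example, and the helper name is hypothetical; the parser's actual matching rules may differ.

```python
import re

# Assumed pattern: one document type letter, 6-7 digits, optional "-vN" suffix.
DCC_PATTERN = re.compile(r"\b[ACDEFGLMPQRSTX]\d{6,7}(?:-v\d+)?\b")


def potential_dcc_numbers(text):
    """Return the set of substrings of text that look like DCC numbers."""
    return {match.group(0) for match in DCC_PATTERN.finditer(text)}


sample = "See LIGO-T1234567-v2 and P150914, then T1234567-v2 again."
found = potential_dcc_numbers(sample)
```

A set is returned, so repeated mentions of the same number collapse to one entry.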
- class dcc.parsers.DCCXMLRecordParser(content)[source]¶
Bases:
dcc.parsers.DCCParser
A parser for DCC XML record documents.
- property abstract¶
- property attached_files¶
- property authors¶
- property dcc_number_pieces¶
- property docid¶
- property journal_reference¶
- property keywords¶
- property note¶
- property other_version_numbers¶
- property publication_info¶
- property referencing_ids¶
- property revision_dates¶
- property title¶
- class dcc.parsers.DCCXMLUpdateParser(content)[source]¶
Bases:
dcc.parsers.DCCParser
A parser for DCC XMLUpdate responses.
dcc.records module¶
Record objects.
- class dcc.records.DCCArchive(archive_dir)[source]¶
Bases:
object
A local collection of DCC documents.
This acts as an offline store of previously downloaded DCC documents.
- Parameters
- archive_dir : str or pathlib.Path
The archive directory on the local file system in which to store retrieved records and files.
- archive_revision_metadata(record, *, overwrite=False)[source]¶
Serialise revision metadata in the local archive.
- Parameters
- record : DCCRecord
The record to archive.
- overwrite : bool, optional
If True, overwrite any existing revision in the local archive; otherwise do nothing. Defaults to False.
- document_dir(dcc_number)[source]¶
The directory in the local archive of the document corresponding to the specified DCC number.
This directory contains subdirectories corresponding to revisions (versions) of the document, and may not yet exist.
- Parameters
- dcc_number : DCCNumber
The DCC number. If a version is specified, it is ignored.
- Returns
pathlib.Path
The directory in the local archive corresponding to the document.
- property documents¶
The documents in the local archive.
These are DCC numbers corresponding to the documents in the local archive, without version suffixes.
- Yields
DCCNumber
A DCC number in the local archive.
- fetch_record(dcc_number, *, ignore_version=False, overwrite=False, fetch_files=False, ignore_too_large=False, session)[source]¶
Fetch a DCC record, either from the local archive or from the remote DCC host, adding it to the local archive if necessary.
- Parameters
- dcc_number : DCCNumber or str
The DCC record to fetch.
- ignore_version : bool, optional
Whether to ignore the version in dcc_number when determining if the document already exists in the archive. Defaults to False.
- overwrite : bool, optional
Whether to overwrite existing records and files in the archive with those fetched remotely. Defaults to False.
- fetch_files : bool, optional
Whether to also fetch the files attached to the record. Defaults to False.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- fetch_record_file(record, number, *, ignore_too_large=False, overwrite=False, session)[source]¶
Fetch the file at position number in the specified DCC record. If the file does not exist in the local archive, it is fetched and archived from the DCC.
- Parameters
- record : DCCRecord
The record to fetch files for.
- number : int
The file number to fetch, as listed in the record metadata, starting from position 1.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- overwrite : bool, optional
Whether to overwrite existing local files with those fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
DCCFile
The fetched file.
- fetch_record_files(record, *, ignore_too_large=False, overwrite=False, session)[source]¶
Fetch the files in the specified DCC record. If any file does not exist in the local archive, it is fetched and archived from the DCC.
- Parameters
- record : DCCRecord
The record to fetch files for.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- overwrite : bool, optional
Whether to overwrite existing local files with those fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
- list
The fetched files.
- latest_revision(dcc_number)[source]¶
The latest revision in the local archive of the document corresponding to the specified DCC number.
- Parameters
- dcc_number : DCCNumber or str
The DCC number. If a version is specified, it is ignored.
- Returns
DCCRecord
The latest revision in the local archive of dcc_number.
- Raises
FileNotFoundError
If no revisions of dcc_number exist in the local archive.
- property latest_revisions¶
Latest revisions of the documents in the local archive.
- Yields
DCCRecord
The latest revision of a document in the archive.
- property records¶
Records in the local archive, including revisions.
- Yields
DCCRecord
A record in the archive.
- revision_dir(dcc_number)[source]¶
The directory in the local archive of the revision corresponding to the specified versioned DCC number.
This directory is used to store data for a particular version of a DCC record, and may not yet exist.
- Parameters
- dcc_number : DCCNumber
The DCC number. Must contain a version.
- Returns
pathlib.Path
The directory in the local archive corresponding to the document revision.
- Raises
NoVersionError
If dcc_number does not contain a version.
- revision_meta_path(dcc_number)[source]¶
The path to the meta file in the local archive of the revision corresponding to the specified DCC number.
The meta file may not yet exist.
- Parameters
- dcc_number : DCCNumber
The DCC number. Must contain a version.
- Returns
pathlib.Path
The path to the meta file in the local archive corresponding to the document revision.
- Raises
NoVersionError
If dcc_number does not contain a version.
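The relationship between document_dir, revision_dir and revision_meta_path described above can be pictured as nested paths: one directory per document, one subdirectory per revision, and a meta file inside each revision directory. The sketch below is a hedged illustration with pathlib. The directory naming scheme and the "meta.toml" file name are assumptions invented for this example, and the free functions are hypothetical stand-ins for the DCCArchive methods.

```python
from pathlib import Path


class NoVersionError(Exception):
    """Stand-in for dcc.exceptions.NoVersionError."""


def document_dir(archive_dir, category, numeric):
    """Directory holding every revision of one document (assumed naming)."""
    return Path(archive_dir) / f"{category}{numeric}"


def revision_dir(archive_dir, category, numeric, version=None):
    """Directory for one revision; a version is required, as documented."""
    if version is None:
        raise NoVersionError("a version is required to locate a revision")
    name = f"{category}{numeric}-v{version}"
    return document_dir(archive_dir, category, numeric) / name


def revision_meta_path(archive_dir, category, numeric, version=None):
    """Path of the revision's meta file; the file name is an assumption."""
    return revision_dir(archive_dir, category, numeric, version) / "meta.toml"


# None of these paths needs to exist yet; the functions only build them.
rev = revision_dir("archive", "T", "1234567", 2)
meta = revision_meta_path("archive", "T", "1234567", 2)

try:
    revision_dir("archive", "T", "1234567")
    missing_version_raises = False
except NoVersionError:
    missing_version_raises = True
```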
- class dcc.records.DCCAuthor(name: str, uid: Optional[int] = None)[source]¶
Bases:
object
A DCC author.
- class dcc.records.DCCFile(title: str, filename: str, url: str)[source]¶
Bases:
object
A DCC file.
- discover(directory)[source]¶
Update local file path if the local file exists in directory.
- Parameters
- directory : str or pathlib.Path
The directory to search.
- exists()[source]¶
Whether the file exists at the local path.
- Returns
bool
True if the file exists at the local path, False otherwise.
- fetch(directory, *, overwrite=False, session)[source]¶
Fetch the remote file and store in the local archive.
- Parameters
- directory : str or pathlib.Path
The directory to use to store the file.
- overwrite : bool, optional
Whether to overwrite any existing file in the archive with that fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- local_path: pathlib.Path = None¶
- write(path)[source]¶
Write file to the file system.
- Parameters
- path : str, pathlib.Path, or file-like
The path or file object to write to. If an open file object is given, it will be written to and left open. If a path string is given, it will be opened, written to, then closed.
- class dcc.records.DCCJournalRef(journal: str, volume: int, page: str, citation: str, url: Optional[str] = None)[source]¶
Bases:
object
A DCC record journal reference.
- class dcc.records.DCCNumber(category, numeric=None, version=None)[source]¶
Bases:
object
A DCC number including category and numeric identifier.
You must either provide a string containing the DCC number, or the separate category and numeric parts, with optional version, e.g.:
>>> from dcc.records import DCCNumber
>>> DCCNumber("T1234567")
DCCNumber(category='T', numeric='1234567', version=None)
>>> DCCNumber("T", "1234567")
DCCNumber(category='T', numeric='1234567', version=None)
>>> DCCNumber("T", "1234567", 4)
DCCNumber(category='T', numeric='1234567', version=4)
- Parameters
- category, numeric, version : str, optional
The parts that make up the DCC number.
- document_type_letters = {'A': 'Acquisitions', 'C': 'Contractual or procurement', 'D': 'Drawings', 'E': 'Engineering documents', 'F': 'Forms and Templates', 'G': 'Presentations (eg Graphics)', 'L': 'Letters and Memos', 'M': 'Management or Policy', 'P': 'Publications', 'Q': 'Quality Assurance documents', 'R': 'Operations Change Requests', 'S': 'Serial numbers', 'T': 'Techical notes', 'X': 'Safety Incident Reports'}¶
- format(version=True)[source]¶
String representation of the DCC number, with optional version number.
- Parameters
- version : bool, optional
Include the version in the string. Defaults to True.
- Returns
- str
The string representation.
- property version_suffix¶
The string version suffix for the version number.
- Returns
- str
The version suffix to the DCC numeral, e.g. “-v2”.
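The interplay of format() and version_suffix can be sketched with a small stand-in dataclass. This is not the DCCNumber class itself. The class name is hypothetical, and returning an empty suffix when no version is set is an assumption; only the "-v2" suffix form and the version flag of format() come from the descriptions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchNumber:
    """Hypothetical stand-in for a DCC number made of separate parts."""

    category: str
    numeric: str
    version: Optional[int] = None

    @property
    def version_suffix(self):
        # Assumption: no version means no suffix at all.
        return "" if self.version is None else f"-v{self.version}"

    def format(self, version=True):
        """String form, with the version suffix unless version is False."""
        base = f"{self.category}{self.numeric}"
        return base + self.version_suffix if version else base


number = SketchNumber("T", "1234567", 4)
with_version = number.format()
without_version = number.format(version=False)
```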
- class dcc.records.DCCRecord(dcc_number: dcc.records.DCCNumber, title: Optional[str] = None, authors: Optional[List[dcc.records.DCCAuthor]] = None, abstract: Optional[str] = None, keywords: Optional[List[str]] = None, note: Optional[str] = None, publication_info: Optional[str] = None, journal_reference: Optional[dcc.records.DCCJournalRef] = None, other_versions: Optional[List[int]] = None, creation_date: Optional[datetime.datetime] = None, contents_revision_date: Optional[datetime.datetime] = None, metadata_revision_date: Optional[datetime.datetime] = None, files: Optional[List[dcc.records.DCCFile]] = None, referenced_by: Optional[List[dcc.records.DCCNumber]] = None, related_to: Optional[List[dcc.records.DCCNumber]] = None)[source]¶
Bases:
object
A DCC record.
- property author_names¶
The names of the authors associated with this record.
- Returns
list
The author names.
- authors: List[dcc.records.DCCAuthor] = None¶
- contents_revision_date: datetime.datetime = None¶
- creation_date: datetime.datetime = None¶
- dcc_number: dcc.records.DCCNumber¶
- discover_files(directory)[source]¶
Discover existing files in directory corresponding to this record.
- Parameters
- directory : str or pathlib.Path
The directory to search.
- classmethod fetch(dcc_number, *, session)[source]¶
Fetch record from the remote DCC host.
- Parameters
- dcc_number : DCCNumber or str
The DCC record to fetch.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
DCCRecord
The fetched record.
- fetch_file(number, directory, *, ignore_too_large=False, overwrite=False, session)[source]¶
Fetch file attached to this record.
- Parameters
- number : int
The file number to fetch.
- directory : str or pathlib.Path
The directory in which to store the fetched file.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- overwrite : bool, optional
Whether to overwrite the existing local file with that fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
DCCFile
The fetched file.
- fetch_files(directory, *, ignore_too_large=False, overwrite=False, session)[source]¶
Fetch files attached to this record.
- Parameters
- directory : str or pathlib.Path
The directory in which to store the fetched files.
- ignore_too_large : bool, optional
If False, when a file is too large, raise a TooLargeFileSkippedException. If True, the file is simply ignored.
- overwrite : bool, optional
Whether to overwrite existing local files with those fetched remotely. Defaults to False.
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- Returns
- list
The fetched files.
- files: List[dcc.records.DCCFile] = None¶
- is_latest_version()[source]¶
Check if the current record is the latest version.
Note: this only checks that the current record instance represents the latest version known locally. The remote record is not fetched.
- Returns
bool
True if the current version is the latest; False otherwise.
- journal_reference: dcc.records.DCCJournalRef = None¶
- property latest_version_number¶
The latest version number for this record.
- Returns
int
The latest version number.
- metadata_revision_date: datetime.datetime = None¶
- classmethod read(path)[source]¶
Read record from the file system.
- Parameters
- path : str or pathlib.Path
The path for the record’s meta file.
- Returns
DCCRecord
The record.
- refenced_by_titles()[source]¶
The titles of the records referencing this record.
- Returns
list
The titles.
- referenced_by: List[dcc.records.DCCNumber] = None¶
The titles of the records related to this record.
- Returns
list
The titles.
- update(*, session)[source]¶
Update the remote record metadata.
- Parameters
- session : DCCSession, optional
The DCC session to use. Defaults to None, which triggers use of the default session settings.
- write(path)[source]¶
Write record to the file system.
- Parameters
- path : str, pathlib.Path, or file-like
The path or file object to write to. If an open file object is given, it will be written to and left open. If a path string is given, it will be opened, written to, then closed.
dcc.sessions module¶
Communication with the DCC.
- class dcc.sessions.DCCAuthenticatedSession(host, *, stream_hook=None, **kwargs)[source]¶
Bases:
dcc.sessions.DCCSession, ciecplib.sessions.Session
A SAML/ECP-authenticated DCC HTTP fetcher.
- Parameters
- host : str
The DCC host to use.
- idp : str
The identity provider host to use.
- Other Parameters
- stream_hook : callable, optional
Function taking a response type and a requests.Response object from a GET or POST request, yielding its body content. This can be used to implement download progress bars, interactive skipping of downloads, etc.
- class dcc.sessions.DCCSession(host, *, stream_hook=None, **kwargs)[source]¶
Bases:
object
A DCC HTTP fetcher.
- Parameters
- host : str
The DCC host to use.
- stream_hook : callable, optional
Function taking a stream type, the item being streamed, and a requests.Response object from a streamed GET or POST request, yielding its body content. This can be used to implement download progress bars, interactive skipping of downloads, etc.
- STREAM_FILE = 1¶
- abstract dcc_record_url(dcc_number, xml=True)[source]¶
Build a DCC record URL given the specified DCC number.
- Parameters
- dcc_number : DCCNumber
The DCC record.
- xml : bool, optional
Whether to make the URL an XML request.
- Returns
- str
The URL.
- fetch_record_page(dcc_number)[source]¶
Fetch a DCC record page.
- Parameters
- dcc_number : DCCNumber
The DCC record.
- Returns
requests.Response
The HTTP response.
- protocol = 'https'¶
- update_record_metadata(dcc_record)[source]¶
Update metadata for the DCC record specified by the provided number.
The version (if any) of the provided DCC number is ignored. Only the latest version of the record is updated.
- Parameters
- dcc_number : DCCNumber
The DCC record.
- Returns
requests.Response
The HTTP response.
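The stream_hook parameter of DCCSession is described as a callable that receives a stream type, the item being streamed, and a response, and yields the body content. A minimal sketch of such a hook that reports progress is given below. It assumes the three arguments are passed positionally in that order. make_progress_hook, the report callback, and FakeResponse are hypothetical names for this example; FakeResponse only imitates the iter_content(chunk_size=...) method of requests.Response so the sketch runs without a network.

```python
STREAM_FILE = 1  # same value as the documented DCCSession.STREAM_FILE


def make_progress_hook(report, chunk_size=4):
    """Build a hook that yields body chunks and reports bytes received."""

    def hook(stream_type, item, response):
        received = 0
        for chunk in response.iter_content(chunk_size=chunk_size):
            received += len(chunk)
            report(stream_type, item, received)
            yield chunk

    return hook


class FakeResponse:
    """Test double exposing iter_content like requests.Response."""

    def __init__(self, body):
        self._body = body

    def iter_content(self, chunk_size=1):
        for start in range(0, len(self._body), chunk_size):
            yield self._body[start:start + chunk_size]


events = []
hook = make_progress_hook(lambda kind, item, count: events.append((kind, item, count)))
body = b"".join(hook(STREAM_FILE, "paper.pdf", FakeResponse(b"0123456789")))
```

Since the hook is a generator, the same shape could stop early to skip a download by returning before the response is exhausted.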
- class dcc.sessions.DCCUnauthenticatedSession(host, *, stream_hook=None, **kwargs)[source]¶
Bases:
dcc.sessions.DCCSession, requests.sessions.Session
An unauthenticated DCC HTTP fetcher.
- Parameters
- host : str
The DCC host to use.
- Other Parameters
- stream_hook : callable, optional
Function taking a response type and a requests.Response object from a GET or POST request, yielding its body content. This can be used to implement download progress bars, interactive skipping of downloads, etc.
- dcc.sessions.default_session(authenticated=False)[source]¶
Create a DCC session using the default host and identity provider.
- Parameters
- authenticated : bool, optional
Whether to make the session an authenticated one. Defaults to False.
- Returns
DCCAuthenticatedSession
The default session.
dcc.util module¶
Utilities.
- dcc.util.opened_file(fobj, mode)¶
Get an open file regardless of whether a string or an already open file is passed.
- Parameters
- fobj : str, pathlib.Path, or file-like
The path or file object to ensure is open. If fobj is an already open file object, its mode is checked against the requested mode and it is otherwise returned as-is. If fobj is a string, it is opened with the specified mode and yielded, then closed once the wrapped context exits. Note that passed open file objects are not closed.
- mode : str
The mode to ensure fobj is opened with.
- Yields
io.FileIO
The open file with the specified mode.
- Raises
- ValueError
If fobj is neither a string nor an open file, or if fobj is open but with a different mode.
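The behaviour described for opened_file (open and close paths, pass through already open files after a mode check, raise ValueError otherwise) can be sketched as a context manager. The following is an illustrative reimplementation under those stated rules, named opened_file_sketch to make clear it is not the library function. Treating file-like objects without a mode attribute as acceptable is an assumption.

```python
import contextlib
import io
import os
import tempfile
from pathlib import Path


@contextlib.contextmanager
def opened_file_sketch(fobj, mode):
    """Yield an open file for a path or an already open file object."""
    if isinstance(fobj, (str, os.PathLike)):
        # Paths are opened here, so they are also closed here.
        with open(fobj, mode) as handle:
            yield handle
    elif isinstance(fobj, io.IOBase):
        # Assumption: objects without a mode attribute are accepted.
        if getattr(fobj, "mode", mode) != mode:
            raise ValueError(f"file is open with a mode other than {mode!r}")
        yield fobj  # The caller keeps ownership; the file is left open.
    else:
        raise ValueError("expected a path or an open file object")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "note.txt"
    with opened_file_sketch(target, "w") as handle:
        handle.write("hello")
    closed_after_path = handle.closed

    with open(target, "r") as already_open:
        with opened_file_sketch(already_open, "r") as same:
            text = same.read()
        left_open = not already_open.closed
        try:
            with opened_file_sketch(already_open, "w"):
                pass
            mode_mismatch_raises = False
        except ValueError:
            mode_mismatch_raises = True
```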
- dcc.util.remove_none(container)[source]¶
Remove None values from the specified container.
Adapted from https://stackoverflow.com/a/20558778/2251982.
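As an illustration of removing None values from a container, here is a minimal recursive sketch. It returns a new container and descends into dicts, lists, tuples and sets. Both of those choices are assumptions for this example, and the function name is hypothetical; the library's remove_none may handle different container types or modify its argument in place.

```python
def remove_none_sketch(container):
    """Return a copy of container with None values removed at every depth."""
    if isinstance(container, dict):
        return {
            key: remove_none_sketch(value)
            for key, value in container.items()
            if value is not None
        }
    if isinstance(container, (list, tuple, set)):
        # Rebuild the same container type from the filtered items.
        return type(container)(
            remove_none_sketch(value) for value in container if value is not None
        )
    return container


record = {"title": "Note", "abstract": None, "files": [1, None, {"url": None, "name": "a"}]}
cleaned = remove_none_sketch(record)
```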