iDQ offline¶
What does this task do?¶
This task instantiates and runs an end-to-end offline (batch) iDQ job (see the iDQ docs). When this job completes, it summarizes the results and presents them exactly as the iDQ online tasks do. We refer readers to these references for more detail.
Key differences between the online and offline iDQ reports are:
the causal nature of training and evaluation: iDQ online jobs are manifestly causal, whereas offline jobs may or may not be, depending on the configuration.
cross-validation: iDQ online jobs automatically cross-validate classifier performance because they train only on historical data (i.e., training and evaluation sets are distinct). This may not be true for offline jobs, again depending on the configuration.
the number of classifiers included in the report: iDQ online jobs have strict latency requirements and therefore typically run only a single classifier. Offline jobs may run several.
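The distinction between causal and non-causal training can be illustrated with a small sketch. This is not iDQ code: the function name `causal_split`, the `(timestamp, label)` sample layout, and the values are all hypothetical, chosen only to show what "training only on historical data" means for a time-ordered data set.

```python
def causal_split(samples, t_eval):
    """Split (timestamp, label) pairs so training uses only data before t_eval.

    Hypothetical helper for illustration; iDQ's own scheduling of training
    and evaluation is set by its configuration, not by this function.
    """
    train = [s for s in samples if s[0] < t_eval]
    evaluate = [s for s in samples if s[0] >= t_eval]
    return train, evaluate


# Made-up samples: (timestamp in seconds, label)
samples = [(0, 1), (10, 0), (20, 1), (30, 0)]
train, evaluate = causal_split(samples, t_eval=20)

# In a causal split, no sample appears in both sets, and every training
# timestamp precedes every evaluation timestamp.
print(train)
print(evaluate)
```

An offline configuration that trains and evaluates on overlapping times would not satisfy this property, which is why its performance estimates should not be assumed to be cross-validated.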
What are its return states?¶
human_input_needed
error
How was it reviewed?¶
This has not been reviewed!
How should results be interpreted?¶
We refer analysts to the detailed description of iDQ DQR reports available for iDQ online jobs. Because offline jobs may run more classifiers than online jobs, they may also provide an opportunity to compare the performance of multiple algorithms.
What INI options, config files are required?¶
config (string)
a path to an iDQ config file
lookback (float, optional)
how much historical time should be included in the iDQ batch job (and resulting summary)
delay (float, optional)
the number of seconds to wait before querying data. This allows us to reasonably guarantee that data is discoverable through standard tools (gw_data_find).
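As a rough illustration of how these options might be read, the sketch below parses an INI fragment with Python's standard `configparser`. Only the option names `config`, `lookback`, and `delay` come from the list above; the section name `idq offline`, the placeholder path, the numeric values, and the helper `read_idq_offline_options` are assumptions made for this example and may not match what the DQR actually expects.

```python
import configparser

# Hypothetical INI fragment. The section name and all values are assumptions;
# only the option names (config, lookback, delay) are taken from the text.
EXAMPLE_INI = """
[idq offline]
config = /path/to/idq.ini
lookback = 3600.0
delay = 30.0
"""


def read_idq_offline_options(text, section="idq offline"):
    """Return the task options as a dict (hypothetical helper).

    config is required, so a missing value raises KeyError.
    lookback and delay are optional and come back as None when absent.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    opts = parser[section]
    return {
        "config": opts["config"],
        "lookback": opts.getfloat("lookback", fallback=None),
        "delay": opts.getfloat("delay", fallback=None),
    }


options = read_idq_offline_options(EXAMPLE_INI)
print(options)
```

Returning `None` for the optional values leaves it to the caller to decide on defaults, since the text above does not state what they are.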
Are there any derived tasks that are based on this one?¶
The following tasks reference standard iDQ configs stored within the DQR source code and will therefore ignore config if it is supplied.
H1 iDQ offline
L1 iDQ offline
H1 iDQ offline-kw
L1 iDQ offline-kw