.. _running-stream:

Running the Streaming Pipeline
####################################################################################################

This is the *how-to* guide for new users who want to run their own instance of the pipeline.
The syntax for `iDQ`'s executables is fairly standard (and fairly limited), so most of the work in setting up `iDQ` lies in correctly specifying the config file.
We address both points in this tutorial.

While we note what would be needed to configure `iDQ` for several different sources of features (triggers), we only provide a full example for synthetic trigger streams (:class:`idq.io.MockClassifierData`).

.. _running-stream_config:

Configuration file
====================================================================================================

A complete description of `iDQ`'s configuration (INI) files can be found here: :ref:`configuration`.
However, we provide enough information below to get you started.

The INI file has several required sections, which we'll introduce in turn.
An example is provided alongside the source code (`~etc/idq.ini`), but we repeat a simplified version below.

In addition to the INI file, analysts must manage the list of channels to be used in the analysis.
These are typically determined via safety studies (i.e., hardware injections), but this simplified INI uses synthetic data generated on the fly.
As such, you will also have to manage a config file for your synthetic data in addition to the channel list.

**idq.ini**

::

    #-------------------------------------------------
    # high-level shared parameters
    [general]
    tag = test
    instrument = Fake1
    rootdir = .

    classifiers = ovl

    [samples]
    target_channel = target_channel
    target_bounds = 

    dirty_bounds =
    dirty_window = 0.

    #-------------------------------------------------
    # parameters for training jobs
    [train]
    workflow = block 
    log_level = 10
    random_rate = 0.1

    [train data discovery]
    flavor = MockClassifierData
    time = time
    ignore_segdb = False

    columns = ['time', 'snr', 'frequency']

    config = 

    [train stream]
    stride =
    delay = 

    [train reporting]
    flavor = PickleReporter

    #-------------------------------------------------
    # parameters for evaluation jobs
    [evaluate]
    workflow = 
    log_level =
    random_rate = 

    [evaluate data discovery]
    flavor = MockClassifierData
    time = time
    ignore_segdb = False

    columns = ['time', 'snr', 'frequency']

    config = 

    [evaluate stream]
    stride =
    delay = 

    [evaluate reporting]
    flavor = QuiverReporter

    #-------------------------------------------------
    # parameters for calibration jobs
    [calibrate]
    workflow = block
    log_level = 10

    [calibrate reporting]
    flavor = CalibrationMapReporter

    #-------------------------------------------------
    # parameters for timeseries jobs
    [timeseries]
    workflow = block
    log_level = 10
    srate = 128

    [timeseries data discovery]
    flavor = MockClassifierData
    time = time
    ignore_segdb = False

    columns = ['time', 'snr', 'frequency']

    config = 

    [timeseries stream]
    stride =
    delay =

    [timeseries reporting]
    flavor = GWFSeriesReporter

    #-------------------------------------------------
    # parameters for classifiers
    [ovl]
    flavor = OVL

    incremental = 100
    num_recalculate = 10
    metric = eff_fap
    minima = {'eff_fap': 3, 'poisson_signif':5, 'use_percentage':1e-3}

    time = time
    significance = significance
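
The simplified INI above leaves several options blank (for example, ``stride`` and ``delay`` in the stream sections and ``config`` in the data discovery sections), and these presumably need to be filled in for your own setup.
As a minimal sketch, the snippet below lists the blank options using only Python's standard ``configparser`` module.
It is not part of `iDQ`: the helper name ``find_blank_options`` is hypothetical, and we assume the file parses as a standard INI file.

.. code-block:: python

    import configparser


    def find_blank_options(ini_text):
        """Return (section, option) pairs whose value is empty.

        Hypothetical helper for illustration; not part of iDQ.
        """
        # Disable interpolation so values are read literally.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string(ini_text)
        return [
            (section, option)
            for section in parser.sections()
            for option, value in parser.items(section)
            if not value.strip()
        ]


    # A shortened excerpt of the INI above, used only to exercise the helper.
    EXAMPLE = """
    [general]
    tag = test
    rootdir = .

    [train stream]
    stride =
    delay =
    """.replace("\n    ", "\n")

    for section, option in find_blank_options(EXAMPLE):
        print(f"[{section}] {option} is blank")

To check a real file, pass its contents instead of ``EXAMPLE``, e.g. ``find_blank_options(open("idq.ini").read())``.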

**mcd.ini**

::

    WRITE AN EXAMPLE HERE WITH VERY FEW CHANNELS

**channels.txt**

::

    WRITE this

.. _running-stream-tasks:

Streaming Tasks
====================================================================================================

Describe how to manage (asynchronous) processes via ``idq-stream``.
Describe how that will manage

* ``idq-streaming_train``
* ``idq-streaming_evaluate``
* ``idq-streaming_calibrate``
* ``idq-streaming_timeseries``

Describe the input/output data streams for each.