.. _running-stream:

Running the Streaming Pipeline
####################################################################################################

This is the *how to* guide for new folks who want to run their own version of the pipeline.
The syntax for `iDQ`'s executables is fairly standard (and fairly limited), so most of the work in setting up `iDQ` lies in correctly specifying the config file.
We address both points in this tutorial.

While we note what would be needed to configure `iDQ` for several different sources of features (triggers), we only provide a full example for synthetic trigger streams (:class:`idq.io.MockClassifierData`).

.. _running-stream_config:

Configuration file
====================================================================================================

A complete description of `iDQ`'s configuration (INI) files can be found in :ref:`configuration`.
However, we provide enough information below to get you started.

The INI file has several required sections, which we'll introduce in turn.
An example is provided alongside the source code (`~etc/idq.ini`), but we repeat a simplified version below.

In addition to the INI file, analysts must manage the list of channels to be used in the analysis.
These are typically determined via safety studies (i.e., hardware injections), but this simplified INI uses synthetic data generated on the fly.
As such, you will also have to manage a config file for your synthetic data in addition to the channel list.

**idq.ini** ::

    #-------------------------------------------------
    # high-level shared parameters

    [general]
    tag = test
    instrument = Fake1
    rootdir = .
    classifiers = ovl

    [samples]
    target_channel = target_channel
    target_bounds =
    dirty_bounds =
    dirty_window = 0.

    #-------------------------------------------------
    # parameters for training jobs

    [train]
    workflow = block
    log_level = 10
    random_rate = 0.1

    [train data discovery]
    flavor = MockClassifierData
    time = time
    ignore_segdb = False
    columns = ['time', 'snr', 'frequency']
    config =

    [train stream]
    stride =
    delay =

    [train reporting]
    flavor = PickleReporter

    #-------------------------------------------------
    # parameters for evaluation jobs

    [evaluate]
    workflow =
    log_level =
    random_rate =

    [evaluate data discovery]
    flavor = MockClassifierData
    time = time
    ignore_segdb = False
    columns = ['time', 'snr', 'frequency']
    config =

    [evaluate stream]
    stride =
    delay =

    [evaluate reporting]
    flavor = QuiverReporter

    #-------------------------------------------------
    # parameters for calibration jobs

    [calibrate]
    workflow = block
    log_level = 10

    [calibrate reporting]
    flavor = CalibrationMapReporter

    #-------------------------------------------------
    # parameters for timeseries jobs

    [timeseries]
    workflow = block
    log_level = 10
    srate = 128

    [timeseries data discovery]
    flavor = MockClassifierData
    time = time
    ignore_segdb = False
    columns = ['time', 'snr', 'frequency']
    config =

    [timeseries stream]
    stride =
    delay =

    [timeseries reporting]
    flavor = GWFSeriesReporter

    #-------------------------------------------------
    # parameters for classifiers

    [ovl]
    flavor = OVL
    incremental = 100
    num_recalculate = 10
    metric = eff_fap
    minima = {'eff_fap': 3, 'poisson_signif':5, 'use_percentage':1e-3}
    time = time
    significance = significance

**mcd.ini** ::

    WRITE AN EXAMPLE HERE WITH VERY FEW CHANNELS

**channels.txt** ::

    WRITE this

.. _running-stream-tasks:

Streaming Tasks
====================================================================================================

Describe how to manage (asynchronous) processes via ``idq-stream``.
Describe how that will manage

* ``idq-streaming_train``
* ``idq-streaming_evaluate``
* ``idq-streaming_calibrate``
* ``idq-streaming_timeseries``

Describe the input/output data streams for each.
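
Before launching any of the streaming tasks, it can help to check which keys in the INI file are still blank, since the simplified ``idq.ini`` shown earlier leaves several values (e.g. ``stride``, ``delay``) for the analyst to fill in.
The following is a minimal sketch of such a check. It assumes only Python's standard ``configparser`` module, not `iDQ`'s own config handling; the helper ``find_blank_keys`` and the abbreviated ``EXAMPLE_INI`` excerpt are hypothetical and exist only for illustration.

```python
import configparser

# Abbreviated excerpt of the example idq.ini (illustration only; the
# section and key names are copied from the example configuration).
EXAMPLE_INI = """
[general]
tag = test
instrument = Fake1
rootdir = .
classifiers = ovl

[train stream]
stride =
delay =

[evaluate]
workflow =
log_level =
random_rate =
"""


def find_blank_keys(parser):
    """Return sorted (section, key) pairs whose value is empty.

    Hypothetical helper, not part of iDQ.
    """
    blanks = []
    for section in parser.sections():
        for key, value in parser.items(section):
            if not value.strip():
                blanks.append((section, key))
    return sorted(blanks)


# Disable interpolation so that values are read back exactly as written.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(EXAMPLE_INI)

blank_keys = find_blank_keys(parser)
for section, key in blank_keys:
    print(f"[{section}] {key} is not set")
```

To check a real file, replace ``parser.read_string(EXAMPLE_INI)`` with ``parser.read(path)``, where ``path`` points at your own INI file. Whether a blank value is actually an error depends on how `iDQ` interprets that key, so treat the output as a list of entries to review rather than a list of mistakes.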