Feature Backends

The workhorse that enables data discovery to find features is idq.io.triggers.DataLoader. A flavor of idq.io.triggers.DataLoader can be instantiated explicitly to acquire features, or configured through a configuration file as part of the streaming or batch workflows.

class idq.io.triggers.DataLoader(start, end, segs=None, columns=None, **kwargs)[source]

A data loader to retrieve features spanning multiple channels.

filter(segs=None, bounds=None, time='time')[source]

Updates segments and filters out data that do not span the segments. Also filters by bounds and updates the cache as needed. NOTE: this requires knowledge of the “time” key within the data.
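To make the filtering semantics concrete, here is an illustrative pure-Python sketch (not the idq implementation): rows survive only if their "time" value falls inside one of the requested segments and all of their column values fall inside the requested bounds. The row layout, column names, and the bounds format (column name mapped to a (min, max) pair) are assumptions for illustration.

```python
def in_segments(t, segs):
    """Return True if time t lies inside any [start, end) segment."""
    return any(start <= t < end for start, end in segs)

def filter_rows(rows, segs=None, bounds=None, time="time"):
    """Keep rows whose time lies in segs and whose columns lie in bounds."""
    out = []
    for row in rows:
        t = row[time]
        if segs is not None and not in_segments(t, segs):
            continue
        if bounds is not None:
            # bounds maps column name -> (min, max); keep rows inside all of them
            if not all(lo <= row[col] < hi for col, (lo, hi) in bounds.items()):
                continue
        out.append(row)
    return out

rows = [
    {"time": 5.0, "snr": 8.0},
    {"time": 15.0, "snr": 4.0},   # outside the segments below
    {"time": 25.0, "snr": 12.0},
]
kept = filter_rows(rows, segs=[(0, 10), (20, 30)], bounds={"snr": (5.0, 100.0)})
# kept contains the events at t=5.0 and t=25.0
```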

flush(max_stride=inf, time='time')[source]

Removes data down to a target span and number of samples.

is_cached(channel, bounds=None)[source]

Returns whether or not data is cached for the channel requested.

pop(channel, default=None)[source]

Remove and return all data associated with this channel.

query(channels=None, columns=None, segs=None, time=None, bounds=None, **kwargs)[source]

Submits a query for features and returns the result as a dictionary of EventTables, keyed by channel.
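The following hypothetical illustration shows the shape of the value query() returns: a dictionary keyed by channel name whose values are tables of events. In idq the values are EventTable objects; plain lists of dicts stand in for them here, and the channel names and columns are made up.

```python
# Stand-in for the dict-of-EventTables returned by query()
result = {
    "H1:GDS-CALIB_STRAIN": [
        {"time": 1187008882.4, "snr": 7.2},
        {"time": 1187008884.1, "snr": 11.6},
    ],
    "H1:PEM-EY_WIND": [
        {"time": 1187008883.0, "snr": 5.5},
    ],
}

# Iterate per channel, as downstream code typically does
for channel, events in sorted(result.items()):
    print(channel, len(events))
```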

random_times(time, target_channel, dirty_bounds, dirty_window, random_rate, segs=None)[source]

A convenience function to extract random times, implicitly loading needed data.
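As a sketch of the idea behind random time selection (assumptions, not the idq algorithm): draw candidate times at an average rate over the analysis segments and reject any that fall within the dirty window of a known glitch time. The function name, argument handling, and rejection rule here are illustrative only.

```python
import random

def sketch_random_times(glitch_times, segs, random_rate, dirty_window, seed=0):
    """Draw times at ~random_rate per second inside segs, avoiding
    +/- dirty_window around each glitch time."""
    rng = random.Random(seed)
    times = []
    for start, end in segs:
        n = int(round(random_rate * (end - start)))
        for _ in range(n):
            t = rng.uniform(start, end)
            if all(abs(t - g) > dirty_window for g in glitch_times):
                times.append(t)
    return sorted(times)

clean = sketch_random_times(glitch_times=[50.0], segs=[(0, 100)],
                            random_rate=0.1, dirty_window=5.0)
# every surviving time is inside the segment and away from the glitch
assert all(0 <= t <= 100 and abs(t - 50.0) > 5.0 for t in clean)
```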

target_times(time, target_channel, target_bounds, segs=None)[source]

A convenience function to extract target times, implicitly loading needed data.

Omicron-based

class idq.io.triggers.omicron.OmicronDataLoader(start, end, segs=None, columns=None, **kwargs)[source]

An extension meant to read Omicron triggers off disk.

Keyword arguments:

The following keyword arguments are required:

  • instrument: the instrument from which the features are derived

In addition, the following optional keyword arguments can be passed in:

  • skip_bad_files: allows skipping over problematic files, such as those with incorrect permissions

SNAX-based

class idq.io.triggers.snax.SNAXDataLoader(start, end, segs=None, columns=None, **kwargs)[source]

An extension meant to read SNAX features off disk.

We assume the following directory structure:

${rootdir}/${gpsMOD1e5}/${basename}-${start}-${dur}.h5

Keyword arguments:

The following keyword arguments are required:

  • rootdir: base directory where features are located

  • basename: name of files/directories containing GstLAL-based features
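The documented directory layout can be resolved to a concrete file path as sketched below. This assumes “gpsMOD1e5” means the GPS start time truncated to its leading digits (start // 100000); the rootdir and basename values are hypothetical.

```python
import os

def snax_path(rootdir, basename, start, dur):
    """Build ${rootdir}/${gpsMOD1e5}/${basename}-${start}-${dur}.h5,
    assuming gpsMOD1e5 = start // 100000."""
    gps_mod_1e5 = int(start) // 100000
    filename = "{}-{}-{}.h5".format(basename, int(start), int(dur))
    return os.path.join(rootdir, str(gps_mod_1e5), filename)

path = snax_path("/data/features", "H1-SNAX_FEATURES", 1187008882, 20)
# path == "/data/features/11870/H1-SNAX_FEATURES-1187008882-20.h5"
```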

class idq.io.triggers.snax.SNAXKafkaDataLoader(*args, **kwargs)[source]

An extension meant to load streaming SNAX features from Kafka.

Keeps a running current timestamp and can poll for new data, filling its own ClassifierData objects for use when triggers are retrieved.

NOTE: when called, this will cache all triggers regardless of the bounds. This is done to avoid issues with re-querying data from rolling buffers, which is not guaranteed to return consistent results. Instead, we record everything we query and filter afterwards.

Keyword arguments:

The following keyword arguments are required:

  • group: the Kafka consumer group to subscribe to

  • port: the Kafka port to subscribe to

  • topic: the Kafka topic to subscribe to

  • poll_timeout: how long to wait for a message before timing out

  • retry_cadence: how long to wait between retries

  • sample_rate: the sampling rate of incoming features
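Putting the required keyword arguments together, a configuration might look like the sketch below. All values are hypothetical; actual groups, ports, topics, and rates depend on the deployment.

```python
# Hypothetical values for the required SNAXKafkaDataLoader kwargs
kafka_kwargs = {
    "group": "idq-features",       # Kafka consumer group to subscribe to
    "port": 9092,                  # Kafka port to subscribe to
    "topic": "snax.features.H1",   # Kafka topic to subscribe to
    "poll_timeout": 0.2,           # seconds to wait for a message
    "retry_cadence": 5.0,          # seconds to wait between retries
    "sample_rate": 16,             # sampling rate of incoming features (Hz)
}
```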

Kleine-Welle-based

class idq.io.triggers.kw.KWDataLoader(start, end, segs=None, columns=None, **kwargs)[source]

An extension of ClassifierData specifically for KleineWelle triggers. Expects triggers to be in multi-channel files.

Note: if no specific channel(s) are requested, all discoverable channels will be returned.

Keyword arguments:

The following keyword arguments are required:

  • instrument: the instrument from which the features are derived