Feature Backends

The workhorse that enables data discovery to find features is idq.io.triggers.DataLoader. A flavor of idq.io.triggers.DataLoader can be instantiated explicitly to acquire features, or configured through a configuration file as part of the streaming or batch workflows.

class idq.io.triggers.DataLoader(start, end, segs=None, columns=None, **kwargs)[source]

A data loader to retrieve features spanning multiple channels.

filter(segs=None, bounds=None, time='time')[source]

Updates segments and filters out data that do not span the segments. Also filters by bounds and updates the cache as needed. NOTE: this requires knowledge of the “time” key within the data.
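To make the filtering semantics concrete, here is an illustrative pure-Python sketch (not the idq implementation): rows survive only if their "time" value falls inside one of the requested segments and all of their column values fall inside the requested bounds. The row layout, column names, and the bounds format (column name mapped to a (min, max) pair) are assumptions for illustration.

```python
def in_segments(t, segs):
    """Return True if time t lies inside any [start, end) segment."""
    return any(start <= t < end for start, end in segs)

def filter_rows(rows, segs=None, bounds=None, time="time"):
    """Keep rows whose time lies in segs and whose columns lie in bounds."""
    out = []
    for row in rows:
        t = row[time]
        if segs is not None and not in_segments(t, segs):
            continue
        if bounds is not None:
            # bounds maps column name -> (min, max); keep rows inside all of them
            if not all(lo <= row[col] < hi for col, (lo, hi) in bounds.items()):
                continue
        out.append(row)
    return out

rows = [
    {"time": 5.0, "snr": 8.0},
    {"time": 15.0, "snr": 4.0},   # outside the segments below
    {"time": 25.0, "snr": 12.0},
]
kept = filter_rows(rows, segs=[(0, 10), (20, 30)], bounds={"snr": (5.0, 100.0)})
# kept contains the events at t=5.0 and t=25.0
```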

flush(max_stride=inf, time='time')[source]

Removes data down to a target span and number of samples.

is_cached(channel, bounds=None)[source]

Returns whether or not data is cached for the channel requested.

pop(channel, default=None)[source]

Remove and return all data associated with this channel.

query(channels=None, columns=None, segs=None, time=None, bounds=None, **kwargs)[source]

Submits a query for features and returns the result as a dictionary of EventTables, keyed by channel.
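The following hypothetical illustration shows the shape of the value query() returns: a dictionary keyed by channel name whose values are tables of events. In idq the values are EventTable objects; plain lists of dicts stand in for them here, and the channel names and columns are made up.

```python
# Stand-in for the dict-of-EventTables returned by query()
result = {
    "H1:GDS-CALIB_STRAIN": [
        {"time": 1187008882.4, "snr": 7.2},
        {"time": 1187008884.1, "snr": 11.6},
    ],
    "H1:PEM-EY_WIND": [
        {"time": 1187008883.0, "snr": 5.5},
    ],
}

# Iterate per channel, as downstream code typically does
for channel, events in sorted(result.items()):
    print(channel, len(events))
```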

random_times(time, target_channel, dirty_bounds, dirty_window, random_rate, segs=None)[source]

A convenience function to extract random times, implicitly loading needed data.
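As a sketch of the idea behind random time selection (assumptions, not the idq algorithm): draw candidate times at an average rate over the analysis segments and reject any that fall within the dirty window of a known glitch time. The function name, argument handling, and rejection rule here are illustrative only.

```python
import random

def sketch_random_times(glitch_times, segs, random_rate, dirty_window, seed=0):
    """Draw times at ~random_rate per second inside segs, avoiding
    +/- dirty_window around each glitch time."""
    rng = random.Random(seed)
    times = []
    for start, end in segs:
        n = int(round(random_rate * (end - start)))
        for _ in range(n):
            t = rng.uniform(start, end)
            if all(abs(t - g) > dirty_window for g in glitch_times):
                times.append(t)
    return sorted(times)

clean = sketch_random_times(glitch_times=[50.0], segs=[(0, 100)],
                            random_rate=0.1, dirty_window=5.0)
# every surviving time is inside the segment and away from the glitch
assert all(0 <= t <= 100 and abs(t - 50.0) > 5.0 for t in clean)
```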

target_times(time, target_channel, target_bounds, segs=None)[source]

A convenience function to extract target times, implicitly loading needed data.

Omicron-based

class idq.io.triggers.omicron.OmicronDataLoader(start, end, segs=None, columns=None, **kwargs)[source]

An extension meant to read Omicron triggers off disk.

Keyword arguments:

The following keyword arguments are required:

  • instrument: the instrument from which the features are derived

In addition, the following optional keyword arguments can be passed in:

  • skip_bad_files: allows skipping over problematic files, such as those with incorrect permissions

SNAX-based

class idq.io.triggers.snax.SNAXDataLoader(start, end, segs=None, columns=None, **kwargs)[source]

An extension meant to read SNAX features off disk.

We assume the following directory structure:

${rootdir}/${gpsMOD1e5}/${basename}-${start}-${dur}.h5

Keyword arguments:

The following keyword arguments are required:

  • rootdir: base directory where features are located

  • basename: name of files/directories containing GstLAL-based features
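The documented directory layout can be resolved to a concrete file path as sketched below. This assumes “gpsMOD1e5” means the GPS start time truncated to its leading digits (start // 100000); the rootdir and basename values are hypothetical.

```python
import os

def snax_path(rootdir, basename, start, dur):
    """Build ${rootdir}/${gpsMOD1e5}/${basename}-${start}-${dur}.h5,
    assuming gpsMOD1e5 = start // 100000."""
    gps_mod_1e5 = int(start) // 100000
    filename = "{}-{}-{}.h5".format(basename, int(start), int(dur))
    return os.path.join(rootdir, str(gps_mod_1e5), filename)

path = snax_path("/data/features", "H1-SNAX_FEATURES", 1187008882, 20)
# path == "/data/features/11870/H1-SNAX_FEATURES-1187008882-20.h5"
```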

class idq.io.triggers.snax.SNAXKafkaDataLoader(*args, **kwargs)[source]

An extension meant to load streaming SNAX features from Kafka.

Keeps a running current timestamp and can poll for new data, filling its own ClassifierData objects for use when triggers are retrieved.

NOTE: when called, this will cache all triggers regardless of the bounds. This is done to avoid issues with re-querying data from rolling buffers, which is not guaranteed to return consistent results. Instead, we record everything we query and filter afterwards.

Keyword arguments:

The following keyword arguments are required:

  • group: the Kafka consumer group to subscribe to

  • port: the Kafka port to subscribe to

  • topic: the Kafka topic to subscribe to

  • poll_timeout: how long to wait for a message before timing out

  • retry_cadence: how long to wait between retries

  • sample_rate: the sampling rate of incoming features
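Putting the required keyword arguments together, a configuration might look like the sketch below. All values are hypothetical; actual groups, ports, topics, and rates depend on the deployment.

```python
# Hypothetical values for the required SNAXKafkaDataLoader kwargs
kafka_kwargs = {
    "group": "idq-features",       # Kafka consumer group to subscribe to
    "port": 9092,                  # Kafka port to subscribe to
    "topic": "snax.features.H1",   # Kafka topic to subscribe to
    "poll_timeout": 0.2,           # seconds to wait for a message
    "retry_cadence": 5.0,          # seconds to wait between retries
    "sample_rate": 16,             # sampling rate of incoming features (Hz)
}
```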

Kleine-Welle-based

class idq.io.triggers.kw.KWDataLoader(start, end, segs=None, columns=None, **kwargs)[source]

An extension of ClassifierData specifically for KleineWelle triggers. Expects triggers to be in multi-channel files.

Note: if no specific channel(s) are requested, all discoverable channels will be returned.

Keyword arguments:

The following keyword arguments are required:

  • instrument: the instrument from which the features are derived