iDQ documentation
iDQ, or inferential Data Quality, is a statistical inference framework for data quality. It specifically focuses on the problem of non-Gaussian noise transients within gravitational-wave detectors, but the underlying formalism has broader applicability.
iDQ works with vectorized representations of the detector’s auxiliary state and searches for correlations between that state and noise transients in \(h(t)\). The end goal is a calibrated estimate, as a function of time, of the probability that \(h(t)\) contains a noise artifact, conditioned on the auxiliary state. This is primarily done through supervised learning, and iDQ supports a variety of supervised learning techniques. These concepts also extend well beyond 1-dimensional data (i.e., timeseries) and could be applied to any streaming classification problem.
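To make the idea of learning a glitch probability from auxiliary feature vectors concrete, here is a minimal sketch using plain logistic regression on synthetic data. It is an illustration only and does not use the iDQ API: the function names (`train_logistic`, `glitch_probability`, `make_synthetic_data`), the two-channel feature layout, and the synthetic glitch rate are all assumptions made for this example, and logistic regression stands in for whichever supervised learner is actually configured.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(features, labels, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_dim = len(features[0])
    n = len(features)
    weights = [0.0] * n_dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # derivative of the log-loss w.r.t. the logit
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def glitch_probability(weights, bias, x):
    """Model estimate of p(glitch | auxiliary feature vector x)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def make_synthetic_data(n, seed=0):
    """Hypothetical data: channel 0 correlates with glitches, channel 1 does not."""
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(n):
        is_glitch = 1 if rng.random() < 0.3 else 0
        x0 = rng.gauss(2.0 if is_glitch else 0.0, 1.0)  # informative channel
        x1 = rng.gauss(0.0, 1.0)                        # uninformative channel
        features.append([x0, x1])
        labels.append(is_glitch)
    return features, labels


features, labels = make_synthetic_data(500)
weights, bias = train_logistic(features, labels)

# Evaluate the model for a quiet and a loud auxiliary state.
p_quiet = glitch_probability(weights, bias, [0.0, 0.0])
p_loud = glitch_probability(weights, bias, [3.0, 0.0])
```

In this toy setup the learned weight on the informative channel should dominate, so `p_loud` comes out larger than `p_quiet`. The raw model output is not guaranteed to be a calibrated probability, which is why a separate calibration step matters.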
iDQ supports 2-class classification schemes as well as automatic retraining and calibration to deal with detector non-stationarity. In this way, the algorithm automatically re-learns which correlations are important as they change over time and returns meaningful probabilistic statements that can be interpreted immediately without further processing.
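One way to picture calibration under non-stationarity is a rolling histogram that maps a classifier score to the empirical glitch fraction among recent labelled samples. The sketch below is a hypothetical stand-in, not iDQ's calibration scheme: the `RollingCalibrator` class, its bin count, window length, and Laplace smoothing are assumptions chosen for illustration, and scores are assumed to lie in [0, 1].

```python
from collections import deque


class RollingCalibrator:
    """Map scores in [0, 1] to glitch probabilities using recent samples only."""

    def __init__(self, n_bins=10, window=1000):
        self.n_bins = n_bins
        # Only the most recent `window` samples are kept, so old detector
        # behaviour is forgotten automatically.
        self.samples = deque(maxlen=window)

    def _bin(self, score):
        # Clamp so that out-of-range scores fall into the edge bins.
        index = int(score * self.n_bins)
        return max(0, min(index, self.n_bins - 1))

    def update(self, score, is_glitch):
        """Record a labelled sample: the classifier score and the true class."""
        self.samples.append((self._bin(score), int(is_glitch)))

    def probability(self, score):
        """Smoothed fraction of glitches among recent samples in the same bin."""
        target = self._bin(score)
        n_total = sum(1 for b, _ in self.samples if b == target)
        n_glitch = sum(label for b, label in self.samples if b == target)
        # Laplace smoothing keeps the estimate defined for empty bins.
        return (n_glitch + 1.0) / (n_total + 2.0)


calibrator = RollingCalibrator(n_bins=10, window=100)

# Phase 1: a score of 0.9 corresponds to a glitch 80% of the time.
for i in range(100):
    calibrator.update(0.9, is_glitch=(i % 5 != 0))
p_before = calibrator.probability(0.9)

# Phase 2: the detector changes and the same score is a glitch only 20% of the time.
for i in range(100):
    calibrator.update(0.9, is_glitch=(i % 5 == 0))
p_after = calibrator.probability(0.9)
```

Because the window holds only the latest 100 samples, `p_after` reflects the new detector behaviour and drops well below `p_before`. A production system would also retrain the underlying classifier, which this sketch leaves out.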