How to add a new Data Quality Product

There are two ways to get a new data quality product incorporated into the DQR.

We provide tutorials for both below. For the sake of clarity, we will assume your new product is called your awesome product throughout.

Adding a new check that will be run within the DQR’s DAG

If your new data quality product does not have specialized dependencies, it may be straightforward to include it within the DQR’s DAG. This has the advantage that you, the developer, do not need to manage or monitor persistent processes to ensure your results are included in the DQR. To do this, follow these steps:

Write an executable

Your follow-up should be encapsulated in a single executable. You can write this in any language you wish, but it must be in the DQR’s PATH and therefore discoverable when submitted to Condor as part of the DAG. You can find several examples within the DQR repository itself, such as omegascan.
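
For example, once your executable is installed you can confirm it is discoverable with a quick check from Python (the executable name below is just the placeholder used throughout this tutorial):

>> import shutil
>>
>> exe = shutil.which('your_awesome_product') ### hypothetical executable name; substitute whatever you called yours
>> assert exe is not None, 'your_awesome_product is not on the PATH'
>> print(exe)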

We also require your executable to always post a report to GraceDb. This means you must properly catch all errors and post an associated error report. Specifications for the content of the reports can be found here: Technical Design.

The DQR provides convenient formatting libraries in Python that comply with the required format. Here’s an example, which would be contained in an executable called your_awesome_product, showing how they can be used, including how to catch and report errors:

>> from ligo.gracedb.rest import GraceDb
>> import sys
>> import json
>> from dqr import json as dqrjson
>>
>> import yourAwesomeLibrary ### import what you need to compute your data quality product
>>
>> __process_name__ = 'your awesome product'
>> __author__ = 'your name (your.name@ligo.org)'
>>
>> graceid = sys.argv[1] ### the GraceDb ID number for an interesting event
>> option1 = sys.argv[2] ### some option associated with your library
>>
>> try:
>>     ### your code returns a pass/fail/human_input_needed state
>>     ### if it needs to post images/files to GraceDb,
>>     ### that should be done within the delegated call
>>     state = yourAwesomeLibrary.do_some_thing(graceid, option1)
>>
>>     ### format the report for the DQR
>>     report = dqrjson.format_report(
>>         "warn", ### we always warn to get human input for omegascans!
>>         __process_name__,
>>         __author__,
>>         summary = 'Summarize your product in a human readable way!',
>>         links = [dqrjson.format_link('https://docs.ligo.org/detchar/data-quality-report/tasks/your-awesome-product-dqr.html', 'docs')],
>>     )
>>
>> except Exception as e:
>>     import traceback
>>
>>     ### format an error report for the DQR
>>     report = dqrjson.format_failure(
>>         __process_name__,
>>         __author__,
>>         traceback_string=traceback.format_exc(),
>>     )
>>
>> finally:
>>     ### actually upload the report to the DQR
>>     ### do this with the GraceDb REST interface
>>     reportpath = '%s-%s.json'%(__process_name__.replace(' ',''), graceid)
>>     with open(reportpath, 'w') as file_obj:
>>         json.dump(report, file_obj)
>>
>>     gdbconn = GraceDb() ### connect to the default GraceDb server; pass a service URL to target a different one
>>     gdbconn.writeLog(
>>         graceid,
>>         __process_name__+' report',
>>         filename=reportpath,
>>         tagname=[__process_name__],
>>     )

Please note: the convention for naming JSON reports is to remove all spaces and append the GraceID, as is done above. If you do not follow this format, the DQR will not be able to discover your JSON report in GraceDb. The look-up is actually based on the task name as defined in dqr.condor (see below), and it is the responsibility of developers to make sure the strings used in conditionals defined within dqr.condor match the naming convention of their JSON reports.
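
As a rough sketch of that convention (the actual look-up is implemented inside the DQR; the helper below is purely illustrative), the expected filename can be derived from the task string and the GraceID like this:

>> def expected_report_name(task, graceid):
>>     """hypothetical helper: strip spaces from the task string and append the GraceID"""
>>     return '%s-%s.json'%(task.replace(' ', ''), graceid)
>>
>> print(expected_report_name('your awesome product', 'G123456'))
>> yourawesomeproduct-G123456.json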

Additionally: please always include a link to your product’s documentation, as is done above. You should write basic documentation for your task within the Sphinx docs for this repository. Instructions for how to do this can be found in the detailed guide on contributing to the repo (How to contribute to the DQR). When adding a new task, you should create a file specifically for that task and include the following sections:

  • What does this task do?

  • What are its return states?

  • How was it reviewed?

  • How should results be interpreted?

  • What INI options, config files are required?

  • Are there any derived tasks that are based on this one?

Several examples of this can be found in the existing docs. In addition to adding a new file for your new task, you will need to update ~/doc/source/tasks/tasks.rst to include your new file within the table of contents.

Modify the DAG generation

In order for the DQR to schedule your follow-up task, it must know how to generate an associated Condor SUB file for it. This is managed within dqr.condor and relies on knowledge of your product’s name. Again for concreteness, we assume your product is called your awesome product.

Write a sub-generation helper function

You need to tell the DQR how to write a Condor SUB file for your new product. This is done via a function with a standard signature. You can specify as many options as you like through the dqr.ini config file, but for now we will assume there is a single option, as in the executable above.

Below, you’ll find an example function for your awesome product. You may need to modify this a bit to fit your specific needs. We’ve specifically called out only the required args and kwargs; this example grabs the one option as part of **kwargs, but that can of course be changed as needed:

>> def sub_your_awesome_product(
>>         graceid,
>>         gps,
>>         output_dir,
>>         output_url,
>>         gracedb_url=__default_gracedb_url__, ### not strictly required because it can be absorbed by **kwargs, but without **kwargs this is needed
>>         verbose=False,                       ### not strictly required because it can be absorbed by **kwargs, but without **kwargs this is needed
>>         email_upon_error=None,
>>         **kwargs,
>>     ):
>>    """
>>    write subfile for "your awesome product"
>>    return path/to/file.sub
>>    """
>>    option1 = kwargs.get('option1', 'default')                                                         ### retrieve the one option for your_awesome_product
>>
>>    condor_classads = dict()                                                                           ### set up classads
>>    condor_classads['executable'] = which('your_awesome_product')                                      ### find the full path to your executable
>>    condor_classads['log']    = os.path.join(output_dir, 'condor-your_awesome_product-%s.log'%graceid) ### set up Condor's output
>>    condor_classads['output'] = os.path.join(output_dir, 'condor-your_awesome_product-%s.out'%graceid)
>>    condor_classads['error']  = os.path.join(output_dir, 'condor-your_awesome_product-%s.err'%graceid)
>>    condor_classads['arguments'] = '%s %s'%(graceid, option1)                                          ### set up the arguments for your script
>>
>>    if email_upon_error:
>>        condor_classads.update(condor_notification(email_upon_error))                                  ### add a notification for failures directly from
>>                                                                                                       ### Condor if requested
>>
>>    path = os.path.join(output_dir, 'your_awesome_product.sub')                                        ### actually write the file
>>    with open(path, 'w') as file_obj:
>>        file_obj.write('\n'.join(' = '.join(item) for item in condor_classads.items())+'\nqueue 1')
>>    return path

If you look around dqr.condor, you’ll find more complicated examples that accept condor_classads as kwargs. Mimicking that structure will allow you to inherit Condor job specifications from the DEFAULT section of dqr.ini, but is not strictly necessary.
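
For illustration only, and assuming the inherited classads arrive through a condor_classads keyword argument (check dqr.condor for the exact name used there), the pattern looks roughly like this:

>> def sub_your_awesome_product(graceid, gps, output_dir, output_url, **kwargs):
>>    """illustrative only: inherit classads passed through **kwargs before adding job-specific ones"""
>>    condor_classads = dict(kwargs.get('condor_classads', dict())) ### start from classads inherited from dqr.ini's DEFAULT section, if any
>>    condor_classads['arguments'] = '%s %s'%(graceid, kwargs.get('option1', 'default'))
>>    ### ... then fill in executable/log/output/error and write the SUB file exactly as in the example above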

As a concrete example, this:

>> path = sub_your_awesome_product('G123456', 8675309.0, '.', '.')
>> print(path)
>> ./your_awesome_product.sub

should write the following into the SUB file:

executable = /full/path/to/your_awesome_product
log = ./condor-your_awesome_product-G123456.log
output = ./condor-your_awesome_product-G123456.out
error = ./condor-your_awesome_product-G123456.err
arguments = G123456 default
queue 1

Modify the routing function

The DQR listener looks up which function to use when building a SUB file within dqr.condor.sub(). Therefore, you must add a conditional statement associated with the helper function you just wrote and your product’s name. As an example, you should modify dqr.condor.sub() to look something like this:

>> def sub(task):
>>     """..."""
>>     if task=="some name":
>>         return sub_some_name
>>
>>     elif task=="another name":
>>         return sub_another_name
>>
>>     elif task=="your awesome product": ### add this conditional in to recognize your new product!
>>         return sub_your_awesome_product
>>
>>     else:
>>         raise ValueError('task=%s not understood'%task)

Now, when dqr.condor.sub() is called, it will return your helper function.
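
A quick way to sanity-check that wiring from an interactive Python session is to call the routing function directly (the printed output below is hypothetical):

>> from dqr import condor
>> print(condor.sub("your awesome product"))
>> <function sub_your_awesome_product at 0x...>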

Please note: the DQR expects the JSON report generated by a task to be named based on the task string used in this routing function. For example, the task called “another name” should create a file for GraceId=G1234 called “anothername-G1234.json”.

Add a section to your dqr.ini

Now that you’ve told the DQR how to run your script, you need to configure it so that it actually will run your script. This is done with sections in dqr.ini. Within your copy of dqr.ini, create a new section that looks like the following:

[your awesome product]
# tell the DQR to include this as part of the DAG it generates
include_in_dag = True

# which latency tier to include this check in
tier = 0

# which high-level question this product addresses
question = A high level question

# configure which states your product is allowed to return
allow = human_input_needed pass

# configure toggles as a space-delimited list if desired
toggles = H1 L1 V1

# options specific to your product
option1 = "the option for your script"

While tier is only required to be an integer and question is only required to be a string, it is likely that you’ll want to re-use values that are already present in other sections. Please check and see whether any of those fit your product and only specify a new tier or question if absolutely necessary. In either case, you’ll need to tell your reviewers which tier and question you’ve chosen.

Please note: the section name must exactly match the string used within dqr.condor, otherwise your task will not be discoverable.
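
If you want to verify the match by hand, the section can be read back with Python’s standard configparser; this is only a sketch and assumes dqr.ini sits in the current working directory:

>> from configparser import ConfigParser
>>
>> config = ConfigParser()
>> config.read('dqr.ini')
>>
>> task = 'your awesome product' ### must match the string used in dqr.condor.sub()
>> assert config.has_section(task), 'section not found: '+task
>> print(config.get(task, 'tier'), config.get(task, 'question'))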

Test your changes

You now need to test your changes. We provide a more complete tutorial on how to test your code and the DQR in general (Testing a Technical Solution), but as a starting point you should try to install the code via:

>> pip install . --prefix=path/to/install/opt

If you’re running on a cluster and want to use default system packages instead of installing your own copies of the dependencies, you can instead do:

>> pip install . --prefix=path/to/install/opt --no-deps

See How to Install and Run the DQR in production for more details. Your review committee will almost certainly require you to demonstrate functionality using the tools outlined in Testing a Technical Solution, which goes beyond demonstrating that the code still installs.

Initiate a merge request to get your changes reviewed and included

Once you’ve tested your code, you should merge it into the production repository. The DQR uses a fork-and-pull development model, and the full procedure is described in more detail here (How to contribute to the DQR).

When creating the merge request, you should assign it to one of your reviewers. They will look over the changes, try to reproduce your tests, and iterate with you until the code is ready to be deployed.

Adding a new check that will not be run within the DQR’s DAG

We strongly encourage incorporating your new product within the DQR’s DAG (Adding a new check that will be run within the DQR’s DAG). However, this may not be practical in several situations, such as:

  • the latency requirement for your product is faster than the DQR’s scope. This is the case for extremely low-latency checks performed by gwcelery, which will be managed outside the DQR’s DAG but still report their results to GraceDb in a DQR-compatible format.

  • the data needed for your product is not available at CIT, such as Virgo auxiliary channel information.

In this case, you will need to manage your own follow-up scheduler in addition to changing a few things in the DQR repo.

Write an executable

You will still need to write an executable, just like before (Write an executable). The requirements are the same as before: your executable must always post a properly formatted report to GraceDb but can be written in any language you wish.

Set up and manage your own follow-up scheduler

Once you’ve written your executable, you’ll need to manage a follow-up scheduler. You can find instructions for interacting with GraceDb’s REST interface and LVAlert, as well as a tutorial, within the GraceDb docs.
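
For orientation only, the sketch below shows one way a simple external scheduler could poll GraceDb with the REST client and launch the executable written above; the query string, cadence, and option value are all placeholders, and a production setup would normally listen to LVAlert rather than poll:

>> import time
>> import subprocess
>> from ligo.gracedb.rest import GraceDb
>>
>> client = GraceDb() ### default GraceDb server
>> processed = set()
>>
>> while True:
>>     for event in client.events('far < 1.0e-7'): ### placeholder query; select the events your product should follow up
>>         graceid = event['graceid']
>>         if graceid not in processed:
>>             ### run the executable written above; it uploads its own DQR-formatted report
>>             subprocess.check_call(['your_awesome_product', graceid, 'option-value']) ### graceid and option1, as expected above
>>             processed.add(graceid)
>>     time.sleep(60) ### polling cadence is an arbitrary choice here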

Add a section to your dqr.ini

You will also need to add a section to dqr.ini, similar to what is described above (Add a section to your dqr.ini). However, because your product is managed outside the DQR, you will only need to specify the tier and question options.

Within your copy of dqr.ini, create a new section that looks like the following:

[your awesome product]
# tell the DQR to not include this as part of the DAG it generates
include_in_dag = False

# which latency tier to include this check in
tier = 0

# which high-level question this product addresses
question = A high level question

# which states your task is allowed to return
allow = pass fail human_input_needed

# toggles (if desired)
toggles = H1

Test your changes

You now need to test your changes. We provide a more complete tutorial on how to test your code and the DQR in general (Testing a Technical Solution). Some of the tools described therein will be useful for testing your follow-up scheduler along with the DQR’s scheduler, and you are encouraged to use them.

Your reviewers will set the exact testing requirements for your product to be incorporated in the production DQR.

Initiate a merge request to get your changes reviewed and included

Once you’ve tested your code, you should merge it into the production repository. The DQR uses a fork-and-pull development model, and the full procedure is described in more detail here (How to contribute to the DQR). Please be sure to include:

  • the name of your new product.

  • the permissions awarded to the product (pass, fail, and/or human_input_needed).

  • how the product will be managed (within DQR’s DAG or external to it).

    • if it will be run within the DQR’s DAG, please clearly enumerate all dependencies (e.g., gwpy >= 0.12.0)

    • if it will run outside the DQR’s DAG, please specify where it will be run and how those processes will be monitored

  • how you’ve tested the new code.

  • how you’ve tested the new product for efficacy.

When you open the merge request, be sure to assign it to one of your reviewers. If git.ligo.org will not allow you to assign the merge request, please tag @reed.essick (or one of the other maintainers) and they’ll make sure your request is processed. You should also specify that your new product will be run outside of the DQR’s DAG and provide references to where your own follow-up scheduler lives and how it will be run. While not conceptually more difficult, this will require more monitoring and oversight than incorporating your product directly within the DQR’s DAG.