Using bilby_pipe
Command line interface
The primary user interface for this code is the command line tool bilby_pipe; for an overview of this and other executables, see the executable reference.
Basics
The primary user interface for this code is the command line tool bilby_pipe, which is available after following the installation instructions. To see the help for this tool, run
$ bilby_pipe --help
(the complete output is given in the reference)
To run bilby_pipe, you first need to define an ini file; examples for different types of ini files can be found below.
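To give a sense of what an ini file contains, here is a minimal sketch. The option names used here (label, outdir, detectors, duration, sampler, prior-file) are taken as assumptions from the output of bilby_pipe --help rather than from this section, and the values are purely illustrative; consult that output and the examples below for the authoritative set:
label = my-run
outdir = outdir
detectors = [H1, L1]
duration = 4
sampler = dynesty
prior-file = my-run.prior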
Once you have an ini file (for the purpose of clarity, let's say my-run.ini), you initialize your job with
$ bilby_pipe my-run.ini
This will produce a directory structure as follows:
my-run.ini
outdir/
-> data/
-> final_result/
-> log_data_analysis/
-> log_data_generation/
-> log_results_page/
-> result/
-> results_page/
-> submit/
Most of these folders will initially be empty, but they will be populated as the job progresses. The data directory will contain all the data to be analysed, while result will contain the *result.hdf5 result files generated by bilby along with any plots. The final_result directory will contain the final result file, which is created by merging the individual result files. Note that the location of the log and results_page folders can be modified.
The final folder, submit, contains all of the DAG submission scripts. To submit your job, run condor_submit_dag, giving as the first argument the file prepended with dag under outdir/submit (instructions to do this are printed to the terminal after initialization).
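For example, with the label my_label used in the bash-script examples below, the command would look something like the following (the exact file name here is an assumption; the precise path is printed to the terminal after initialization):
$ condor_submit_dag outdir/submit/dag_my_label.submit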
Alternatively, you can initialise and submit your jobs with
$ bilby_pipe my-run.ini --submit
Running all or part of the job directly
In some cases, you may need to run all or part of the job directly (e.g., not through a scheduler). This can be done by using the file prepended with bash in the submit/ directory. This file is a simple bash script that runs all commands in sequence. One simple way to run part of the job is to open the bash file, copy the commands you require to another script, and then run that. For convenience, we also add if statements to the bash script to enable you to run parts of the analysis by providing a pattern as a command line argument. For example, to run the data generation step, you can call the bash script with generation in the arguments, e.g.:
$ bash outdir/submit/bash_my_label.sh generation
If you want to run the analysis step and n-parallel=1, then you would use
$ bash outdir/submit/bash_my_label.sh analysis
Note, if n-parallel > 1 this will run all the parallel jobs. To run just one, run (replacing par0 with the analysis you want to run):
$ bash outdir/submit/bash_my_label.sh par0
Finally, to merge the analyses, run
$ bash outdir/submit/bash_my_label.sh merge
Internally, the bash script simply matches the given argument against the job name. This works in simple cases, but in complicated cases it will likely fail or require inspection of the bash file itself. Moreover, if you use any of the special keywords (generation, analysis, par, or merge) in your label, the ability to filter to single jobs will be lost.
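To illustrate the idea, the following is a sketch of the pattern-matching logic only, not the literal contents of the generated script; the job names and echoed placeholders are assumptions:
# Sketch: each command is wrapped in a check of the first argument against the job name.
if [[ -z "$1" || "my_label_data0_generation" == *"$1"* ]]; then
    echo "the data generation command would run here"
fi
if [[ -z "$1" || "my_label_data0_analysis_H1L1_par0" == *"$1"* ]]; then
    echo "the first parallel analysis command would run here"
fi
Because the check is a substring match, passing generation, par0, or merge selects only the jobs whose names contain that pattern, and passing nothing runs everything.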
Using the slurm batch scheduler
By default, bilby_pipe runs under an HTCondor environment (the default for the IGWN grid). It can also be used on a slurm-based cluster. Here we give a brief description of the steps required to run under slurm; for a full list of available options, see the output of bilby_pipe --help.
To use slurm, add scheduler=slurm to your ini file. Typically, slurm needs you to configure the correct environment; you can do this by passing scheduler-env=my-environment. This will add the following line to your submit scripts:
$ source activate my-environment
(Note: for conda users, this is equivalent to conda activate my-environment).
If the cluster you are using does not provide network access on the compute nodes, the data generation step may fail if an attempt is made to remotely access the data. (If you are creating simulated data, or have local copies of the data, this is, of course, not a problem.) To resolve this issue, you can set local-generation=True in your ini file. The generation steps will then be run on the head node when you invoke bilby_pipe, after which you simply submit the job.
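For example, combining the two options discussed above, the relevant ini lines would be:
scheduler = slurm
local-generation = True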
Slurm modules can be loaded using scheduler-modules, a space-separated list of modules to load. Additional commands to sbatch can be given using the scheduler-args option.
Putting all this together, adding these lines to your ini file
scheduler = slurm
scheduler-args = arg1=val1 arg2=val2
scheduler-modules = git python
scheduler-env = my-environment
scheduler-analysis-time = 1-00:00:00 # Limit job to 1 day
will produce a slurm submit file which contains
#SBATCH --arg1=val1
#SBATCH --arg2=val2
module load git python
and individual bash scripts containing
module load git python
source activate my-environment
Summary webpage
bilby_pipe allows the user to visualise the posterior samples through a ‘summary’ webpage. This is implemented using PESummary.
To generate a summary webpage, the create-summary option must be passed in the configuration file. Additionally, you can specify a web directory where you would like the output from PESummary to be stored; by default this is placed in outdir/results_page. If you are working on an LDG cluster, then the web directory should be in your public_html. Below is an example of the additional lines to put in your configuration file to generate ‘summary’ webpages:
create-summary = True
email = albert.einstein@ligo.org
webdir = /home/albert.einstein/public_html/project
If you have already generated a webpage in the past using PESummary, then you are able to pass the existing-dir option to add further results files to a single webpage. This includes all histograms for each results file as well as comparison plots. Below is an example of the additional lines in the configuration file that will add to an existing webpage:
create-summary = True
email = albert.einstein@ligo.org
existing-dir = /home/albert.einstein/public_html/project
Main function
Functionally, the main command line tool is calling the function bilby_pipe.main.main(), which is transcribed here:
def main():
    """ Top-level interface for bilby_pipe """
    from bilby_pipe.job_creation.dag import Dag
    args, unknown_args = parse_args(sys.argv[1:], create_parser())
    inputs = MainInput(args, unknown_args)
    # Create a Directed Acyclic Graph (DAG) of the workflow
    Dag(inputs)
As you can see, there are three steps. First, the command line arguments are parsed: the args object stores the user inputs and any defaults (see Command line interface), while unknown_args is a list of any unknown arguments. The logic of handling the user input (in the form of the args object) is handled by the MainInput() object. Following this, the logic of generating a DAG given that user input is handled by the Dag() object.
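As a usage sketch (an assumption based only on the transcribed main() above, which parses sys.argv[1:]), the same entry point can be driven from Python rather than the shell:
import sys

from bilby_pipe.main import main

# Emulate the command line call "bilby_pipe my-run.ini"; main() reads sys.argv[1:].
# The ini file name is the illustrative my-run.ini used earlier in this page.
sys.argv = ["bilby_pipe", "my-run.ini"]
main()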