================
Using bilby_pipe
================

Command line interface
----------------------

The primary user interface for this code is the command line tool
:code:`bilby_pipe`. For an overview of this and other executables,
see the `executable reference <executables.rst>`_.

Basics
------

The primary user interface for this code is the command line tool
:code:`bilby_pipe`, which is available after following the `installation
instructions <installation.txt>`_. To see the help for this tool, run

.. code-block:: console

   $ bilby_pipe --help

(the complete output is given in the `reference <executables/main.html>`_).

To run :code:`bilby_pipe`, you first need to define `an ini file
<ini_file.txt>`_; examples for different types of ini files can be found below.
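
For illustration, a minimal ini file might look like the following (the values
are purely illustrative; see the `ini file <ini_file.txt>`_ documentation for
the full and authoritative set of options):

.. code-block:: ini

   label = my-run
   outdir = outdir
   detectors = [H1, L1]
   trigger-time = 1126259462.4
   prior-file = my-prior.prior
   sampler = dynesty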

Once you have an ini file (for clarity, let's say
:code:`my-run.ini`), you initialize your job with

.. code-block:: console

   $ bilby_pipe my-run.ini

This will produce a directory structure as follows:

.. code-block:: console

   my-run.ini
   outdir/
     -> data/
     -> final_result/
     -> log_data_analysis/
     -> log_data_generation/
     -> log_results_page/
     -> result/
     -> results_page/
     -> submit/

Most of these folders will initially be empty, but they will be populated as
the job progresses. The :code:`data` directory will contain all the data to be
analysed, while :code:`result` will contain the :code:`*result.hdf5` result
files generated by :code:`bilby`, along with any plots. The
:code:`final_result` directory will contain the final result file, which is
created by merging the individual result files.
Note that the location of the :code:`log` and :code:`results_page` folders can
be modified.

The final folder, :code:`submit`, contains all of the DAG submission
scripts. To submit your job, run :code:`condor_submit_dag`, giving as its
first argument the file prefixed with :code:`dag` under :code:`outdir/submit`
(instructions for doing this are printed to the terminal after initialization).
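
For example, if your ini file sets :code:`label = my_label`, the command will
look something like this (the exact file name is printed to the terminal, so
treat this path as indicative rather than exact):

.. code-block:: console

   $ condor_submit_dag outdir/submit/dag_my_label.submit
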
Alternatively, you can initialise and submit your jobs with

.. code-block:: console

   $ bilby_pipe my-run.ini --submit

Running all or part of the job directly
---------------------------------------

In some cases, you may need to run all or part of the job directly (i.e., not
through a scheduler). This can be done using the file prefixed with
:code:`bash` in the :code:`submit/` directory. This file is a simple bash script
that runs all commands in sequence. One simple way to run part of the job is to
open the bash file, copy the commands you require to another script, and then
run that. For convenience, the bash script also contains if statements that
enable you to run parts of the analysis by providing a pattern as a command-line
argument. For example, to run the data generation step, call the bash script
with :code:`generation` in the arguments, e.g.:

.. code-block:: console

   $ bash outdir/submit/bash_my_label.sh generation

If you want to run the analysis step with :code:`n-parallel=1`, then you would use

.. code-block:: console

   $ bash outdir/submit/bash_my_label.sh analysis

Note that if :code:`n-parallel > 1`, this will run all of the parallel jobs. To
run just one, run the following (replacing :code:`par0` with the analysis you
want to run):

.. code-block:: console

   $ bash outdir/submit/bash_my_label.sh par0

Finally, to merge the analyses, run

.. code-block:: console

   $ bash outdir/submit/bash_my_label.sh merge

Internally, the bash script simply matches the given argument against the job
name. This works in simple cases, but in complicated cases it will likely fail
or require inspection of the bash file itself. Moreover, if you use any of the
special keywords (generation, analysis, par, or merge) in your label, the
ability to filter down to single jobs is lost.

Using the slurm batch scheduler
-------------------------------

By default, :code:`bilby_pipe` runs under an HTCondor environment (the default
for the IGWN grid). It can also be used on a slurm-based cluster. Here we
give a brief description of the steps required to run under slurm; for a full
list of available options, see the output of :code:`bilby_pipe --help`.

To use slurm, add :code:`scheduler=slurm` to your ini file. Typically, slurm
requires you to configure the correct environment; you can do this by
passing it in via :code:`scheduler-env=my-environment`. This will add the
following line to your submit scripts:

.. code-block:: bash

   source activate my-environment

(Note: for conda users, this is equivalent to :code:`conda activate
my-environment`).

If the cluster you are using does not provide network access on the compute
nodes, the data generation step may fail if it attempts to access the data
remotely. (If you are creating simulated data, or have local copies of the
data, this is, of course, not a problem.) To resolve this issue, you can
set :code:`local-generation=True` in your ini file. The generation steps will
then be run on the head node when you invoke :code:`bilby_pipe`, after which
you simply submit the job.
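
For example, adding this line to your ini file enables local generation:

.. code-block:: ini

   local-generation = True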

Slurm modules can be loaded using :code:`scheduler-modules`, a space-separated
list of modules to load. Additional arguments to :code:`sbatch` can be given
using the :code:`scheduler-args` option.

Putting all this together, adding these lines to your ini file

.. code-block:: ini

   scheduler = slurm
   scheduler-args = arg1=val1 arg2=val2
   scheduler-modules = git python
   scheduler-env = my-environment
   scheduler-analysis-time = 1-00:00:00   # Limit job to 1 day

will produce :code:`slurm` submit files which contain

.. code-block:: bash

   #SBATCH --arg1=val1
   #SBATCH --arg2=val2

   module load git python

and individual bash scripts containing

.. code-block:: bash

   module load git python

   source activate my-environment


Summary webpage
---------------

:code:`bilby_pipe` allows the user to visualise the posterior samples through
a 'summary' webpage. This is implemented using `PESummary
<https://docs.ligo.org/charlie.hoy/pesummary/>`_. 

To generate a summary webpage, the :code:`create-summary` option must be set
in the configuration file. Additionally, you can specify a web directory where
you would like the output from :code:`PESummary` to be stored; by default this
is placed in :code:`outdir/results_page`. If you are working on an LDG cluster,
the web directory should be inside your :code:`public_html` directory. Below is
an example of the additional lines to put in your configuration file to
generate 'summary' webpages:

.. code-block:: ini

    create-summary = True
    email = albert.einstein@ligo.org
    webdir = /home/albert.einstein/public_html/project

If you have already generated a webpage in the past using :code:`PESummary`,
you can pass the :code:`existing-dir` option to add further result files to a
single webpage. This includes all histograms for each result file as well as
comparison plots. Below is an example of the additional lines in the
configuration file that will add to an existing webpage:

.. code-block:: ini

    create-summary = True
    email = albert.einstein@ligo.org
    existing-dir = /home/albert.einstein/public_html/project

Main function
-------------

Functionally, the main command line tool calls the function
:code:`bilby_pipe.main.main()`, which is transcribed here:

.. code-block:: python

   def main():
       """ Top-level interface for bilby_pipe """
       from bilby_pipe.job_creation.dag import Dag
       args, unknown_args = parse_args(sys.argv[1:], create_parser())
       inputs = MainInput(args, unknown_args)
       # Create a Directed Acyclic Graph (DAG) of the workflow
       Dag(inputs)

As you can see, there are three steps. First, the command line arguments are
parsed; the :code:`args` object stores the user inputs and any defaults (see
`Command line interface`_), while :code:`unknown_args` is a list of any unknown
arguments.

The logic of handling the user input (in the form of the :code:`args` object)
is handled by the :code:`MainInput` object. Following this, the logic of
generating a DAG from that user input is handled by the :code:`Dag` object.
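
For illustration, the same three steps can be driven from a Python session.
This is a minimal sketch, assuming :code:`parse_args` and :code:`create_parser`
are importable from :code:`bilby_pipe.main` (as they are used in the
transcription above):

.. code-block:: python

   # Assumed import locations, based on the names used in main() above
   from bilby_pipe.main import MainInput, create_parser, parse_args
   from bilby_pipe.job_creation.dag import Dag

   # Parse the ini file exactly as the command line tool would
   args, unknown_args = parse_args(["my-run.ini"], create_parser())

   # Validate and store the user input
   inputs = MainInput(args, unknown_args)

   # Create the Directed Acyclic Graph (DAG) of the workflow
   Dag(inputs)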