gwpopulation_pipe.utils.MinimumEffectiveSamplesLikelihood

class MinimumEffectiveSamplesLikelihood(posteriors, hyper_prior, ln_evidences=None, max_samples=1e+100, selection_function=<function HyperparameterLikelihood.<lambda>>, conversion_function=<function HyperparameterLikelihood.<lambda>>, cupy=False, maximum_uncertainty=inf)[source]

Bases: HyperparameterLikelihood

__init__(posteriors, hyper_prior, ln_evidences=None, max_samples=1e+100, selection_function=<function HyperparameterLikelihood.<lambda>>, conversion_function=<function HyperparameterLikelihood.<lambda>>, cupy=False, maximum_uncertainty=inf)
Parameters:
posteriors: list

A list of pandas DataFrames, each containing a set of posterior samples. Each set may have a different size. These can contain a prior column containing the original prior values.

hyper_prior: `bilby.hyper.model.Model`

The population model, this can alternatively be a function.

ln_evidences: list, optional

Log evidences for single runs to ensure proper normalisation of the hyperparameter likelihood. If not provided, the original evidences will be set to 0. This produces a Bayes factor between the sampling prior and the hyperparameterised model.

selection_function: func

Function which evaluates your population selection function.

conversion_function: func

Function which converts a dictionary of sampled parameters to a dictionary of parameters of the population model.

max_samples: int, optional

Maximum number of samples to use from each set.

cupy: bool

DEPRECATED: if you want to use cupy, you should manually set the backend using gwpopulation.set_backend.

maximum_uncertainty: float

The maximum allowed uncertainty in the natural log likelihood. If the uncertainty is larger than this value a log likelihood of -inf will be returned. Default = inf
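
The core computation behind this likelihood can be illustrated with a pure-Python toy sketch (all names and numbers below are hypothetical; the real class expects pandas DataFrames and a bilby population model): each event's samples, drawn under some sampling prior, are importance-reweighted to the population model to form a Monte Carlo estimate of the per-event Bayes factor.

```python
import math
import random

random.seed(1)

# Hypothetical population model: Gaussian in one parameter
# (mu, sigma stand in for the hyperparameters a sampler would vary).
def hyper_prior_pdf(x, mu=1.0, sigma=0.5):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Toy "single-event posteriors": lists of samples drawn under a flat
# sampling prior on [-2, 2] (pdf = 0.25); stand-ins for DataFrames.
sampling_prior = 0.25
posteriors = [[random.uniform(-2, 2) for _ in range(2000)] for _ in range(3)]

# Per-event ln Bayes factor via importance reweighting:
# BF_i = mean_k[ p(x_k | Lambda) / pi(x_k) ]
ln_bfs = []
for samples in posteriors:
    weights = [hyper_prior_pdf(x) / sampling_prior for x in samples]
    ln_bfs.append(math.log(sum(weights) / len(weights)))
```

This is only a sketch of the reweighting idea; the actual class additionally tracks the effective sample size of each sum and vetoes hyperparameter samples whose Monte Carlo uncertainty exceeds maximum_uncertainty.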

__call__(*args, **kwargs)

Call self as a function.

Methods

__init__(posteriors, hyper_prior[, ...])

generate_extra_statistics(sample)

Given an input sample, add extra statistics.

generate_rate_posterior_sample()

Generate a sample from the posterior distribution for rate assuming a \(1 / R\) prior.

ln_likelihood_and_variance()

Compute the ln likelihood estimator and its variance.

log_likelihood()

log_likelihood_ratio()

Difference between log likelihood and noise log likelihood

noise_log_likelihood()

per_event_bayes_factors_and_n_effective()

per_event_bayes_factors_and_n_effective_and_variances()

Called by _compute_per_event_ln_bayes_factors to compute the per-event BFs, the effective number of samples for each event's computed BF, and the associated uncertainty (variance) in the ln BF.

posterior_predictive_resample(samples[, ...])

Resample the original single event posteriors to use the PPD from each of the other events as the prior.

resample_posteriors(posteriors[, max_samples])

Convert list of pandas DataFrame object to dict of arrays.

Attributes

marginalized_parameters

maximum_uncertainty

The maximum allowed uncertainty in the estimate of the log-likelihood.

meta_data

generate_extra_statistics(sample)

Given an input sample, add extra statistics.

Adds:

  • ln_bf_idx: \(\ln \frac{{\cal L}(d_{i} | \Lambda)}{{\cal L}(d_{i} | \varnothing)}\) for each of the events in the data

  • selection: \(P_{\rm det}\)

  • var_idx, selection_variance: the uncertainty in each Monte Carlo integral

  • total_variance: the total variance in the likelihood

Note

The quantity selection_variance is the variance in \(P_{\rm det}\), not the total variance from the contribution of the selection function to the likelihood.

Parameters:
sample: dict

Input sample to compute the extra things for.

Returns:
sample: dict

The input dict, modified in place.

generate_rate_posterior_sample()

Generate a sample from the posterior distribution for rate assuming a \(1 / R\) prior.

The likelihood evaluated is analytically marginalized over rate. However, the rate-dependent likelihood can be trivially recovered, giving

\[p(R) = \Gamma(n=N, \text{scale}=\mathcal{V})\]

Here \(\Gamma\) is the Gamma distribution, \(N\) is the number of events being analyzed and \(\mathcal{V}\) is the total observed 4-volume.

Note

This function only uses the numpy backend. It can be used with the other backends, since it returns a float, but it does not support, e.g., automatic differentiation with JAX.

Returns:
rate: float

A sample from the posterior distribution for rate.
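
The Gamma draw described above can be sketched with the standard library alone (the event count and scale below are made-up toy values; random.gammavariate takes a shape and a scale):

```python
import random

random.seed(0)

n_events = 10     # N: number of events analysed (toy value)
scale = 0.4       # scale of the Gamma distribution, set by the total
                  # observed 4-volume (toy value)

# One posterior sample for the rate, R ~ Gamma(shape=N, scale)
rate = random.gammavariate(n_events, scale)
```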

ln_likelihood_and_variance()

Compute the ln likelihood estimator and its variance.
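
One plausible way such a variance estimate can be assembled is sketched below. This decomposition is an assumption for illustration, not necessarily the exact expression used internally: independent per-event Monte Carlo variances add, while the selection-function term enters the ln likelihood once per event and so its variance is scaled by the square of the number of events.

```python
# Hypothetical per-event ln-BF variances and ln-selection variance
# (toy numbers), e.g. as produced by
# per_event_bayes_factors_and_n_effective_and_variances.
per_event_variances = [0.01, 0.02, 0.015]
ln_selection_variance = 0.005  # variance of the ln selection integral (toy)

n_events = len(per_event_variances)

# Assumed decomposition: independent Monte Carlo integrals, with the
# selection term appearing n_events times in the ln likelihood.
total_variance = sum(per_event_variances) + n_events**2 * ln_selection_variance
```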

log_likelihood()
Returns:
float
log_likelihood_ratio()

Difference between log likelihood and noise log likelihood

Returns:
float
property maximum_uncertainty

The maximum allowed uncertainty in the estimate of the log-likelihood. If the uncertainty is larger than this value a log likelihood of -inf is returned.

noise_log_likelihood()
Returns:
float
per_event_bayes_factors_and_n_effective_and_variances()[source]

Called by _compute_per_event_ln_bayes_factors to compute the per-event BFs, the effective number of samples for each event's computed BF, and the associated uncertainty (variance) in the ln BF. Computes the same quantities as the superclass method _compute_per_event_ln_bayes_factors, but additionally provides the effective sample size.

Returns:
per_event_bfs: array-like

The BF per event, computed by reweighting single-event likelihood samples into the hyper_prior model.

n_effectives: array-like

The effective sample size for each Monte Carlo sum computation of the BFs. The BF is computed for each event, so this array has length n_events.

variance: array-like

The variances (uncertainties) in the ln BF per event.
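
For a single event, the effective sample size and the first-order ln-BF variance can be sketched from the importance weights alone (a pure-Python toy with hypothetical weights; the real method operates on the resampled posterior arrays):

```python
# Toy importance weights for one event (hypothetical values).
weights = [0.8, 1.2, 0.5, 2.0, 1.0]
n = len(weights)

total = sum(weights)
bf = total / n  # Monte Carlo estimate of the per-event Bayes factor

# Kish effective sample size of the weighted sum.
n_effective = total**2 / sum(w**2 for w in weights)

# First-order variance of ln(bf): Var[w] / (n * mean(w)^2),
# which is algebraically equal to 1 / n_effective - 1 / n.
variance = sum(w**2 for w in weights) / total**2 - 1 / n
```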

posterior_predictive_resample(samples, return_weights=False)

Resample the original single event posteriors to use the PPD from each of the other events as the prior.

Parameters:
samples: pd.DataFrame, dict, list

The samples to do the weighting over, typically the posterior from some run.

return_weights: bool, optional

Whether to return the per-sample weights, default = False

Returns:
new_samples: dict

Dictionary containing the weighted posterior samples for each of the events.

weights: array-like

Weights to apply to the samples, only if return_weights == True.
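
The reweighting step can be sketched as importance resampling of one event's original samples under a new prior (a pure-Python toy; the sample values and densities below are hypothetical stand-ins for the original sampling prior and the posterior-predictive prior):

```python
import random

random.seed(2)

# Toy single-event samples with their densities under the old and new priors.
samples = [0.5, 1.0, 1.5, 2.0]
old_prior = [0.25, 0.25, 0.25, 0.25]  # density each sample was drawn under
new_prior = [0.10, 0.40, 0.30, 0.20]  # posterior-predictive density (toy)

weights = [new / old for new, old in zip(new_prior, old_prior)]

# Draw a resampled set with probability proportional to the weights.
new_samples = random.choices(samples, weights=weights, k=4)
```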

resample_posteriors(posteriors, max_samples=1e+300)

Convert list of pandas DataFrame object to dict of arrays.

Parameters:
posteriors: list

List of pandas DataFrame objects.

max_samples: int, optional

Maximum number of samples to take from each posterior; the default is the length of the shortest posterior chain.

Returns:
data: dict

Dictionary containing arrays of shape (n_posteriors, max_samples). There is a key for each key shared by all posteriors.
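
The conversion can be sketched with plain dicts standing in for DataFrames (the column names below are hypothetical): only keys shared by every posterior are kept, and each column is truncated to the shortest chain.

```python
# Toy stand-ins for pandas DataFrames: dicts of column -> list.
posteriors = [
    {"mass": [1.1, 1.2, 1.3, 1.4], "prior": [0.25] * 4},
    {"mass": [2.0, 2.1], "prior": [0.25] * 2},
]

max_samples = min(len(p["mass"]) for p in posteriors)  # shortest chain
shared_keys = set.intersection(*(set(p) for p in posteriors))

# dict of (n_posteriors, max_samples) arrays, one per shared key
data = {
    key: [p[key][:max_samples] for p in posteriors]
    for key in sorted(shared_keys)
}
```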