gwpopulation_pipe.utils.MinimumEffectiveSamplesLikelihood

class MinimumEffectiveSamplesLikelihood(posteriors, hyper_prior, ln_evidences=None, max_samples=1e+100, selection_function=<function HyperparameterLikelihood.<lambda>>, conversion_function=<function HyperparameterLikelihood.<lambda>>, cupy=False, maximum_uncertainty=inf)[source]

Bases: HyperparameterLikelihood

__init__(posteriors, hyper_prior, ln_evidences=None, max_samples=1e+100, selection_function=<function HyperparameterLikelihood.<lambda>>, conversion_function=<function HyperparameterLikelihood.<lambda>>, cupy=False, maximum_uncertainty=inf)
Parameters:
posteriors: list

A list of pandas DataFrames, each containing a set of posterior samples. Each set may have a different size. These can contain a prior column containing the original prior values.

hyper_prior: `bilby.hyper.model.Model`

The population model, this can alternatively be a function.

ln_evidences: list, optional

Log evidences for single runs to ensure proper normalisation of the hyperparameter likelihood. If not provided, the original evidences will be set to 0. This produces a Bayes factor between the sampling prior and the hyperparameterised model.

selection_function: func

Function which evaluates your population selection function.

conversion_function: func

Function which converts a dictionary of sampled parameters to a dictionary of parameters of the population model.

max_samples: int, optional

Maximum number of samples to use from each set.

cupy: bool

DEPRECATED: if you want to use cupy, you should manually set the backend using gwpopulation.set_backend.

maximum_uncertainty: float

The maximum allowed uncertainty in the natural log likelihood. If the uncertainty is larger than this value a log likelihood of -inf will be returned. Default = inf
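
The core computation behind this likelihood can be illustrated with a pure-Python toy sketch (all names and numbers below are hypothetical; the real class expects pandas DataFrames and a bilby population model): each event's samples, drawn under some sampling prior, are importance-reweighted to the population model to form a Monte Carlo estimate of the per-event Bayes factor.

```python
import math
import random

random.seed(1)

# Hypothetical population model: Gaussian in one parameter
# (mu, sigma stand in for the hyperparameters a sampler would vary).
def hyper_prior_pdf(x, mu=1.0, sigma=0.5):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Toy "single-event posteriors": lists of samples drawn under a flat
# sampling prior on [-2, 2] (pdf = 0.25); stand-ins for DataFrames.
sampling_prior = 0.25
posteriors = [[random.uniform(-2, 2) for _ in range(2000)] for _ in range(3)]

# Per-event ln Bayes factor via importance reweighting:
# BF_i = mean_k[ p(x_k | Lambda) / pi(x_k) ]
ln_bfs = []
for samples in posteriors:
    weights = [hyper_prior_pdf(x) / sampling_prior for x in samples]
    ln_bfs.append(math.log(sum(weights) / len(weights)))
```

This is only a sketch of the reweighting idea; the actual class additionally tracks the effective sample size of each sum and vetoes hyperparameter samples whose Monte Carlo uncertainty exceeds maximum_uncertainty.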

__call__(*args, **kwargs)

Call self as a function.

Methods

__init__(posteriors, hyper_prior[, ...])

generate_extra_statistics(sample)

Given an input sample, add extra statistics.

generate_rate_posterior_sample()

Generate a sample from the posterior distribution for rate assuming a \(1 / R\) prior.

ln_likelihood_and_variance()

Compute the ln likelihood estimator and its variance.

log_likelihood()

log_likelihood_ratio()

Difference between log likelihood and noise log likelihood

noise_log_likelihood()

per_event_bayes_factors_and_n_effective()

per_event_bayes_factors_and_n_effective_and_variances()

Called by _compute_per_event_ln_bayes_factors to compute the per-event BFs, the effective number of samples for each event's computed BF, and the associated uncertainty (variance) in the ln BF.

posterior_predictive_resample(samples[, ...])

Resample the original single event posteriors to use the PPD from each of the other events as the prior.

resample_posteriors(posteriors[, max_samples])

Convert list of pandas DataFrame object to dict of arrays.

Attributes

marginalized_parameters

maximum_uncertainty

The maximum allowed uncertainty in the estimate of the log-likelihood.

meta_data

generate_extra_statistics(sample)

Given an input sample, add extra statistics.

Adds:

  • ln_bf_idx: \(\ln \frac{{\cal L}(d_{i} | \Lambda)}{{\cal L}(d_{i} | \varnothing)}\) for each of the events in the data

  • selection: \(P_{\rm det}\)

  • var_idx, selection_variance: the uncertainty in each Monte Carlo integral

  • total_variance: the total variance in the likelihood

Note

The quantity selection_variance is the variance in \(P_{\rm det}\), not the total variance from the contribution of the selection function to the likelihood.

Parameters:
sample: dict

Input sample to compute the extra things for.

Returns:
sample: dict

The input dict, modified in place.

generate_rate_posterior_sample()

Generate a sample from the posterior distribution for rate assuming a \(1 / R\) prior.

The likelihood evaluated is analytically marginalized over rate. However, the rate-dependent likelihood can be trivially recovered, giving

\[p(R) = \Gamma(n=N, \text{scale}=\mathcal{V})\]

Here \(\Gamma\) is the Gamma distribution, \(N\) is the number of events being analyzed and \(\mathcal{V}\) is the total observed 4-volume.

Note

This function only uses the numpy backend. It can be used with the other backends, since it returns a float, but it does not support, e.g., automatic differentiation with JAX.

Returns:
rate: float

A sample from the posterior distribution for rate.
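
The Gamma draw described above can be sketched with the standard library alone (the event count and scale below are made-up toy values; random.gammavariate takes a shape and a scale):

```python
import random

random.seed(0)

n_events = 10     # N: number of events analysed (toy value)
scale = 0.4       # scale of the Gamma distribution, set by the total
                  # observed 4-volume (toy value)

# One posterior sample for the rate, R ~ Gamma(shape=N, scale)
rate = random.gammavariate(n_events, scale)
```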

ln_likelihood_and_variance()

Compute the ln likelihood estimator and its variance.
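
One plausible way such a variance estimate can be assembled is sketched below. This decomposition is an assumption for illustration, not necessarily the exact expression used internally: independent per-event Monte Carlo variances add, while the selection-function term enters the ln likelihood once per event and so its variance is scaled by the square of the number of events.

```python
# Hypothetical per-event ln-BF variances and ln-selection variance
# (toy numbers), e.g. as produced by
# per_event_bayes_factors_and_n_effective_and_variances.
per_event_variances = [0.01, 0.02, 0.015]
ln_selection_variance = 0.005  # variance of the ln selection integral (toy)

n_events = len(per_event_variances)

# Assumed decomposition: independent Monte Carlo integrals, with the
# selection term appearing n_events times in the ln likelihood.
total_variance = sum(per_event_variances) + n_events**2 * ln_selection_variance
```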

log_likelihood()
Returns:
float
log_likelihood_ratio()

Difference between log likelihood and noise log likelihood

Returns:
float
property maximum_uncertainty

The maximum allowed uncertainty in the estimate of the log-likelihood. If the uncertainty is larger than this value a log likelihood of -inf is returned.

noise_log_likelihood()
Returns:
float
per_event_bayes_factors_and_n_effective_and_variances()[source]

Called by _compute_per_event_ln_bayes_factors to compute the per-event BFs, the effective number of samples for each event's computed BF, and the associated uncertainty (variance) in the ln BF. Computes the same quantities as the superclass method _compute_per_event_ln_bayes_factors, but additionally provides the effective sample size.

Returns:
per_event_bfs: array-like

The BF per event, computed by reweighting single-event likelihood samples into the hyper_prior model.

n_effectives: array-like

The effective sample size for each Monte Carlo sum computation of the BFs. The BF is computed for each event, so this array has length n_events.

variance: array-like

The variances (uncertainties) in the ln BF per event.
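
For a single event, the effective sample size and the first-order ln-BF variance can be sketched from the importance weights alone (a pure-Python toy with hypothetical weights; the real method operates on the resampled posterior arrays):

```python
# Toy importance weights for one event (hypothetical values).
weights = [0.8, 1.2, 0.5, 2.0, 1.0]
n = len(weights)

total = sum(weights)
bf = total / n  # Monte Carlo estimate of the per-event Bayes factor

# Kish effective sample size of the weighted sum.
n_effective = total**2 / sum(w**2 for w in weights)

# First-order variance of ln(bf): Var[w] / (n * mean(w)^2),
# which is algebraically equal to 1 / n_effective - 1 / n.
variance = sum(w**2 for w in weights) / total**2 - 1 / n
```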

posterior_predictive_resample(samples, return_weights=False)

Resample the original single event posteriors to use the PPD from each of the other events as the prior.

Parameters:
samples: pd.DataFrame, dict, list

The samples to do the weighting over, typically the posterior from some run.

return_weights: bool, optional

Whether to return the per-sample weights, default = False

Returns:
new_samples: dict

Dictionary containing the weighted posterior samples for each of the events.

weights: array-like

Weights to apply to the samples, only if return_weights == True.
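
The reweighting step can be sketched as importance resampling of one event's original samples under a new prior (a pure-Python toy; the sample values and densities below are hypothetical stand-ins for the original sampling prior and the posterior-predictive prior):

```python
import random

random.seed(2)

# Toy single-event samples with their densities under the old and new priors.
samples = [0.5, 1.0, 1.5, 2.0]
old_prior = [0.25, 0.25, 0.25, 0.25]  # density each sample was drawn under
new_prior = [0.10, 0.40, 0.30, 0.20]  # posterior-predictive density (toy)

weights = [new / old for new, old in zip(new_prior, old_prior)]

# Draw a resampled set with probability proportional to the weights.
new_samples = random.choices(samples, weights=weights, k=4)
```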

resample_posteriors(posteriors, max_samples=1e+300)

Convert list of pandas DataFrame object to dict of arrays.

Parameters:
posteriors: list

List of pandas DataFrame objects.

max_samples: int, optional

Maximum number of samples to take from each posterior; the default is the length of the shortest posterior chain.

Returns:
data: dict

Dictionary containing arrays of shape (n_posteriors, max_samples). There is a key for each key shared by all posteriors.
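
The conversion can be sketched with plain dicts standing in for DataFrames (the column names below are hypothetical): only keys shared by every posterior are kept, and each column is truncated to the shortest chain.

```python
# Toy stand-ins for pandas DataFrames: dicts of column -> list.
posteriors = [
    {"mass": [1.1, 1.2, 1.3, 1.4], "prior": [0.25] * 4},
    {"mass": [2.0, 2.1], "prior": [0.25] * 2},
]

max_samples = min(len(p["mass"]) for p in posteriors)  # shortest chain
shared_keys = set.intersection(*(set(p) for p in posteriors))

# dict of (n_posteriors, max_samples) arrays, one per shared key
data = {
    key: [p[key][:max_samples] for p in posteriors]
    for key in sorted(shared_keys)
}
```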