
Given a set of event counts and patient numbers for multiple sites, where the event can occur at most once per patient, fit a Bayesian hierarchical Binomial model to the site-specific event counts.

Usage

fitBayesBinomialModel(
  data,
  n,
  r,
  model = NULL,
  inits = NULL,
  nChains = ifelse(is.null(inits), 2, length(inits)),
  ...
)

Arguments

data

The data.frame containing the participant and event counts

n

The column in data containing the participant counts. Uses tidy evaluation

r

The column in data containing the event counts. Uses tidy evaluation

model

The character string containing the JAGS model to be fitted. If NULL, obtained from getModelString("binomial").

inits

A list of JAGS inits lists suitable for use with this model. If NULL (the default), nChains random inits are generated by .createBinomialInit.

nChains

The number of chains to use. Default 2. If inits is not NULL, must equal length(inits) or be NULL.

...

Additional arguments passed to .createBinomialInits or .autorunJagsAndCaptureOutput.

Value

A tibble with four columns

p

Simulated values from the posterior distribution of the event probabilities.

shape1

Simulated values from the posterior distribution of the first shape parameter of the Beta distribution for p.

shape2

Simulated values from the posterior distribution of the second shape parameter of the Beta distribution for p.

q

Percentile in which each p value falls.

Details

The count of patients experiencing an event at a given site is assumed to follow a Binomial distribution, a standard statistical distribution that gives the probability of a given number of events out of a fixed number of possible events. This probability depends on the site-specific probability, p_i, that the event occurs for a single patient, and on the total number of patients at the site. The site-specific probabilities, p_i, are assumed to follow a Beta distribution, a continuous distribution on values between 0 and 1 with two parameters, a and b, that determine its shape and skewness. Uncertainty in the parameters of the Beta distribution is accounted for by specifying prior distributions on these parameters. These "hyperpriors" are specified as Gamma distributions.
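The data-generating structure described above can be sketched in base R. This is an illustration only, not the package's implementation: a and b are fixed at arbitrary values rather than drawn from their Gamma hyperpriors, and the site sizes in n are hypothetical.

```r
# Illustrative simulation of the hierarchical structure (not the package's code).
# Assumed for illustration only: a = 2, b = 8 and the site sizes in n.
set.seed(1)
a <- 2
b <- 8
n <- c(20, 35, 50, 80, 120)   # patients per site (hypothetical)

# Site-specific event probabilities: p_i ~ Beta(a, b)
p <- rbeta(length(n), shape1 = a, shape2 = b)

# Event counts: r_i ~ Binomial(n_i, p_i), at most one event per patient
r <- rbinom(length(n), size = n, prob = p)

data.frame(site = seq_along(n), n = n, p = round(p, 3), r = r)
```

Fitting the model reverses this process: given the observed n and r, it infers plausible values of p_i, a and b.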

The parameters of the hyperpriors are specified by a and b. The default settings specify Gamma(1, 10) distributions, which allow bell-shaped distributions for 0.05 < p_i < 0.95 but place low probability on very precise distributions.

The Bayesian Hierarchical Model then estimates the posterior distribution for the unknown parameters p_i, a and b, given the observed events. Under Bayes' theorem, this is proportional to the likelihood of the observed data given the parameters multiplied by the prior for the parameters. The prior can be informed by historical data and/or expert knowledge. As more data become available, the posterior is less influenced by the prior and more influenced by the data.
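The shrinking influence of the prior can be seen in a simplified single-site case. The sketch below holds a and b fixed, which the full model does not do; under that assumption the posterior for p is Beta(a + r, b + n - r) and has a closed-form mean. The function name posteriorMean and the values a = 2, b = 8 are hypothetical and chosen only for illustration.

```r
# Posterior mean of p for one site when a and b are held fixed (a simplification).
# With a Beta(a, b) prior and r events in n patients, the posterior is
# Beta(a + r, b + n - r), so its mean is (a + r) / (a + b + n).
posteriorMean <- function(r, n, a = 2, b = 8) {
  (a + r) / (a + b + n)
}

priorMean <- 2 / (2 + 8)                    # 0.2
small <- posteriorMean(r = 5, n = 10)       # 0.35: pulled towards the prior mean
large <- posteriorMean(r = 500, n = 1000)   # about 0.497: close to the observed 0.5

c(prior = priorMean, small = small, large = large)
```

Both data sets have an observed event rate of 0.5, but only the larger one moves the posterior mean most of the way from the prior mean to the observed rate.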

The exact posterior distribution cannot be computed directly, hence a Markov Chain Monte Carlo (MCMC) method is used to simulate values of p_i, a and b from the posterior distribution. Any number of Markov chains can be used, each with a minimum of 10,000 values simulated after 4,000 burn-in iterations and 1,000 adaptive iterations. The chains are checked for convergence and the simulation is extended if required to satisfy basic convergence tests. The samples from all chains are combined in the returned value.
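One widely used convergence check is the Gelman-Rubin potential scale reduction factor, which compares the variance between chains with the variance within them. The base R sketch below illustrates the idea on simulated chains. It is not taken from this package, the function name psrf is hypothetical, and the text above does not state which tests the fitting routine applies.

```r
# Gelman-Rubin potential scale reduction factor for a single parameter.
# `chains` is a matrix with one column per chain and one row per iteration.
psrf <- function(chains) {
  nIter <- nrow(chains)
  withinVar <- mean(apply(chains, 2, var))       # mean within-chain variance
  betweenVar <- nIter * var(colMeans(chains))    # between-chain variance
  pooledVar <- (nIter - 1) / nIter * withinVar + betweenVar / nIter
  sqrt(pooledVar / withinVar)
}

set.seed(123)
# Two chains sampling the same distribution: the factor is close to 1
mixed <- cbind(rnorm(5000), rnorm(5000))
# Two chains centred on different values: the factor is well above 1
stuck <- cbind(rnorm(5000, mean = 0), rnorm(5000, mean = 3))

c(mixed = psrf(mixed), stuck = psrf(stuck))
```

Values near 1 suggest the chains are exploring the same distribution; larger values indicate that more iterations are needed.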

Examples

results <- berrySummary %>% 
             fitBayesBinomialModel(
               n=Subjects,
               r=Events
             )
#> INFO [2024-11-01 09:30:47] Status of model fitting: OK
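Once a fit has been returned, the posterior draws can be summarised with ordinary R tools. The sketch below uses a simulated stand-in with the documented column names p, shape1 and shape2 so that it runs without JAGS. With a real fit, results would take the place of the hypothetical draws object. The numbers it produces are not results from any real analysis.

```r
# Stand-in for the returned tibble: simulated values, NOT output from a real fit.
set.seed(42)
draws <- data.frame(
  shape1 = rgamma(2000, shape = 4, rate = 2),
  shape2 = rgamma(2000, shape = 16, rate = 2)
)
draws$p <- rbeta(nrow(draws), draws$shape1, draws$shape2)

# Posterior mean and central 95% credible interval for the event probability
summaryP <- c(
  mean = mean(draws$p),
  quantile(draws$p, probs = c(0.025, 0.975))
)
summaryP
```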