epim is the only model fitting function in epidemia. It takes a model description, a dataframe, and additional arguments relating to the fitting algorithm, and translates this to data that is then passed to a precompiled Stan program which is used to fit the model. This allows model fitting to begin immediately as opposed to requiring compilation each time epim is called.

epim(
  rt,
  inf,
  obs,
  data,
  algorithm = c("sampling", "meanfield", "fullrank"),
  group_subset = NULL,
  prior_PD = FALSE,
  ...
)

Arguments

rt

An object of class epirt. This defines the model for the time-varying reproduction rates. See epirt for more details.

inf

An object of class epiinf. This defines the model for latent infections. See epiinf for more details.

obs

Either an epiobs object, or a list of such objects. Each element in the list defines a model for the specified observation vector. See epiobs for more details.

data

A dataframe with all data required for fitting the model. This includes all observation variables and covariates specified in the models for the reproduction number and ascertainment rates.

algorithm

One of "sampling", "meanfield" or "fullrank". This specifies which rstan function to use for fitting the model. For "sampling" this is sampling, otherwise this is vb.

group_subset

If specified, a character vector naming a subset of regions to include in the model.

prior_PD

Same as in stan_glm. If TRUE, samples all parameters from the joint prior distribution. This is useful for prior predictive checks. Defaults to FALSE.

...

Additional arguments to pass to the rstan function used to fit the model.

Value

An object of class epimodel.

Details

This is similar to the workflow for fitting Bayesian regression models with rstanarm. A key difference, however, is that the models fit by epidemia are much more complex, and are therefore inherently more difficult to specify. epidemia aims to simplify this process by modularizing the model definition into three distinct parts: transmission, infections and observations. These components of the model are defined with the functions epirt, epiinf and epiobs respectively.

epim has arguments rt, inf and obs which expect a description of the transmission model, infection model and observational models respectively. Together, these fully define the joint distribution of data and parameters. Each of these model components are described in terms of variables that are expected to live in a single dataframe, data. This dataframe must be compatible with the model components, in the sense that it holds all variables defined in these models.

In addition to taking a model description and a dataframe, epim has various additional arguments which specify how the model should be fit. If algorithm = "sampling" then the model will be fit using Stan’s adaptive Hamiltonian Monte Carlo sampler. This is done internally by calling sampling. If algorithm = "meanfield" or algorithm = "fullrank", then Stan’s variational Bayes algorithms are used instead, by calling vb. Any unnamed arguments in the call to epim are passed directly on to the rstan sampling function. epim returns a fitted model object of class epimodel, which contains posterior samples from the model along with other useful objects.

In general, the adaptive Hamiltonian Monte Carlo sampler should be used for final inference. Nonetheless, fitting these models using HMC is often computationally demanding, and variational Bayes can often be fruitful for quickly iterating models.

Examples

# \donttest{ library(EpiEstim) data("Flu1918") date <- as.Date("1918-01-01") + seq(0, along.with = c(NA, Flu1918$incidence)) data <- data.frame( city = "Baltimore", cases = c(NA, Flu1918$incidence), date = date, week = lubridate::week(date) ) rt <- epirt( formula = R(city, date) ~ rw(time = week, prior_scale = 0.1), prior_intercept = rstanarm::normal(log(2), 0.2), link = 'log' ) obs <- epiobs( formula = cases ~ 1, prior_intercept = rstanarm::normal(location=1, scale=0.01), link = "identity", i2o = rep(.25,4) ) args <- list( rt = rt, inf = epiinf(gen = Flu1918$si_distr), obs = obs, data = data, algorithm = "fullrank", iter = 1e4, seed = 12345 ) fm <- do.call(epim, args)
#> Chain 1: ------------------------------------------------------------ #> Chain 1: EXPERIMENTAL ALGORITHM: #> Chain 1: This procedure has not been thoroughly tested and may be unstable #> Chain 1: or buggy. The interface is subject to change. #> Chain 1: ------------------------------------------------------------ #> Chain 1: #> Chain 1: #> Chain 1: Rejecting initial value: #> Chain 1: Error evaluating the log probability at the initial value. #> Chain 1: Exception: neg_binomial_2_lpmf: Location parameter[1] is -7.96211e-06, but must be > 0! (in 'model_epidemia_base' at line 132) #> #> Chain 1: #> Chain 1: Gradient evaluation took 0.000289 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.89 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: Begin eta adaptation. #> Chain 1: Iteration: 1 / 250 [ 0%] (Adaptation) #> Chain 1: Iteration: 50 / 250 [ 20%] (Adaptation) #> Chain 1: Iteration: 100 / 250 [ 40%] (Adaptation) #> Chain 1: Iteration: 150 / 250 [ 60%] (Adaptation) #> Chain 1: Iteration: 200 / 250 [ 80%] (Adaptation) #> Chain 1: Iteration: 250 / 250 [100%] (Adaptation) #> Chain 1: Success! Found best value [eta = 0.1]. #> Chain 1: #> Chain 1: Begin stochastic gradient ascent. #> Chain 1: iter ELBO delta_ELBO_mean delta_ELBO_med notes #> Chain 1: 100 -10224.766 1.000 1.000 #> Chain 1: 200 -6242.872 0.819 1.000 #> Chain 1: 300 -4460.624 0.679 0.638 #> Chain 1: 400 -3018.208 0.629 0.638 #> Chain 1: 500 -1929.581 0.616 0.564 #> Chain 1: 600 -1509.661 0.560 0.564 #> Chain 1: 700 -1172.784 0.521 0.478 #> Chain 1: 800 -1042.063 0.471 0.478 #> Chain 1: 900 -943.309 0.431 0.400 #> Chain 1: 1000 -1118.005 0.403 0.400 #> Chain 1: 1100 -879.886 0.330 0.287 #> Chain 1: 1200 -838.614 0.271 0.278 #> Chain 1: 1300 -821.186 0.233 0.271 #> Chain 1: 1400 -770.641 0.192 0.156 #> Chain 1: 1500 -863.898 0.147 0.125 #> Chain 1: 1600 -810.893 0.125 0.108 #> Chain 1: 1700 -735.930 0.107 0.105 #> Chain 1: 1800 -680.682 0.102 0.102 #> Chain 1: 1900 -705.569 0.095 0.081 #> Chain 1: 2000 -699.094 0.081 0.066 #> Chain 1: 2100 -613.973 0.068 0.066 #> Chain 1: 2200 -660.284 0.070 0.070 #> Chain 1: 2300 -691.698 0.072 0.070 #> Chain 1: 2400 -650.282 0.072 0.070 #> Chain 1: 2500 -655.797 0.062 0.065 #> Chain 1: 2600 -661.337 0.056 0.064 #> Chain 1: 2700 -601.600 0.056 0.064 #> Chain 1: 2800 -590.878 0.050 0.045 #> Chain 1: 2900 -598.213 0.047 0.045 #> Chain 1: 3000 -654.266 0.055 0.064 #> Chain 1: 3100 -550.376 0.060 0.064 #> Chain 1: 3200 -561.904 0.055 0.045 #> Chain 1: 3300 -559.239 0.051 0.021 #> Chain 1: 3400 -595.660 0.051 0.021 #> Chain 1: 3500 -595.639 0.050 0.021 #> Chain 1: 3600 -570.589 0.053 0.044 #> Chain 1: 3700 -581.496 0.045 0.021 #> Chain 1: 3800 -590.731 0.045 0.021 #> Chain 1: 3900 -568.836 0.048 0.038 #> Chain 1: 4000 -557.317 0.041 0.021 #> Chain 1: 4100 -506.965 0.032 0.021 #> Chain 1: 4200 -519.702 0.033 0.025 #> Chain 1: 4300 -504.915 0.035 0.029 #> Chain 1: 4400 -531.954 0.034 0.029 #> Chain 1: 4500 -598.248 0.045 0.038 #> Chain 1: 4600 -496.190 0.061 0.038 #> Chain 1: 4700 -527.987 0.066 0.051 #> Chain 1: 4800 -521.152 0.065 0.051 #> Chain 1: 4900 -528.863 0.063 0.051 #> Chain 1: 5000 -526.002 0.061 0.051 #> Chain 1: 5100 -542.573 0.055 0.031 #> Chain 1: 5200 -528.794 0.055 0.031 #> Chain 1: 5300 -469.201 0.064 0.051 #> Chain 1: 5400 -489.398 0.063 0.041 #> Chain 1: 5500 -523.486 0.059 0.041 #> Chain 1: 5600 -480.915 0.047 0.041 #> Chain 1: 5700 -476.770 0.042 0.031 #> Chain 1: 5800 -522.249 0.049 0.041 #> Chain 1: 5900 -507.258 0.051 0.041 #> Chain 1: 6000 -518.603 0.053 0.041 #> Chain 1: 6100 -474.961 0.059 0.065 #> Chain 1: 6200 -494.737 0.060 0.065 #> Chain 1: 6300 -514.214 0.051 0.041 #> Chain 1: 6400 -510.252 0.048 0.040 #> Chain 1: 6500 -476.462 0.048 0.040 #> Chain 1: 6600 -517.731 0.048 0.040 #> Chain 1: 6700 -485.916 0.053 0.065 #> Chain 1: 6800 -489.885 0.045 0.040 #> Chain 1: 6900 -465.209 0.048 0.053 #> Chain 1: 7000 -472.794 0.047 0.053 #> Chain 1: 7100 -475.276 0.038 0.040 #> Chain 1: 7200 -505.793 0.040 0.053 #> Chain 1: 7300 -463.735 0.046 0.060 #> Chain 1: 7400 -468.833 0.046 0.060 #> Chain 1: 7500 -504.262 0.046 0.060 #> Chain 1: 7600 -496.296 0.040 0.053 #> Chain 1: 7700 -496.305 0.033 0.016 #> Chain 1: 7800 -453.705 0.042 0.053 #> Chain 1: 7900 -494.136 0.045 0.060 #> Chain 1: 8000 -500.360 0.044 0.060 #> Chain 1: 8100 -464.209 0.051 0.070 #> Chain 1: 8200 -469.179 0.046 0.070 #> Chain 1: 8300 -479.486 0.040 0.021 #> Chain 1: 8400 -457.231 0.043 0.049 #> Chain 1: 8500 -460.266 0.037 0.021 #> Chain 1: 8600 -473.518 0.038 0.028 #> Chain 1: 8700 -443.380 0.045 0.049 #> Chain 1: 8800 -458.211 0.039 0.032 #> Chain 1: 8900 -463.556 0.032 0.028 #> Chain 1: 9000 -468.748 0.032 0.028 #> Chain 1: 9100 -458.397 0.026 0.023 #> Chain 1: 9200 -450.545 0.027 0.023 #> Chain 1: 9300 -448.172 0.025 0.023 #> Chain 1: 9400 -457.934 0.022 0.021 #> Chain 1: 9500 -458.506 0.022 0.021 #> Chain 1: 9600 -444.452 0.022 0.021 #> Chain 1: 9700 -456.377 0.018 0.021 #> Chain 1: 9800 -456.189 0.015 0.017 #> Chain 1: 9900 -441.026 0.017 0.021 #> Chain 1: 10000 -470.592 0.022 0.023 #> Chain 1: Informational Message: The maximum number of iterations is reached! The algorithm may not have converged. #> Chain 1: This variational approximation is not guaranteed to be meaningful. #> Chain 1: #> Chain 1: Drawing a sample of size 1000 from the approximate posterior... #> Chain 1: COMPLETED.
#> Warning: Pareto k diagnostic value is 4.8. Resampling is disabled. Decreasing tol_rel_obj may help if variational algorithm has terminated prematurely. Otherwise consider using sampling instead.
plot_rt(fm)
# }