epim
is the only model fitting function in epidemia.
It takes a model description, a dataframe, and additional
arguments relating to the fitting algorithm, and translates this
to data that is then passed to a precompiled Stan program which is used to fit the model.
This allows model fitting to begin immediately as opposed to requiring compilation
each time epim
is called.
epim( rt, inf, obs, data, algorithm = c("sampling", "meanfield", "fullrank"), group_subset = NULL, prior_PD = FALSE, ... )
rt | An object of class |
---|---|
inf | An object of class |
obs | Either an |
data | A dataframe with all data required for fitting the model. This includes all observation variables and covariates specified in the models for the reproduction number and ascertainment rates. |
algorithm | One of |
group_subset | If specified, a character vector naming a subset of regions to include in the model. |
prior_PD | Same as in |
... | Additional arguments to pass to the rstan function used to fit the model. |
An object of class epimodel
.
This is similar to the workflow for fitting Bayesian regression models with rstanarm.
A key difference, however, is that the models fit by epidemia are much more complex,
and are therefore inherently more difficult to specify. epidemia aims to simplify this
process by modularizing the model definition into three distinct parts: transmission, infections and observations.
These components of the model are defined with the functions epirt
, epiinf
and epiobs
respectively.
epim
has arguments
rt
, inf
and obs
which expect a description of the
transmission model, infection model and observational models respectively.
Together, these fully define the joint distribution of data and parameters.
Each of these model components are described in terms of variables that are expected to live in a single dataframe, data
.
This dataframe must be compatible with the model components, in the sense that it holds all variables defined in these models.
In addition to taking a model description and a dataframe, epim
has various
additional arguments which specify how the model should be fit. If algorithm = "sampling"
then the model will be fit using Stan’s adaptive Hamiltonian Monte Carlo sampler.
This is done internally by calling sampling
. If
algorithm = "meanfield"
or algorithm = "fullrank"
, then
Stan’s variational Bayes algorithms are used instead, by calling vb
.
Any unnamed arguments in the call to epim
are passed directly on to the rstan sampling function.
epim
returns a fitted model object of class epimodel
, which contains posterior samples from the model along with other useful objects.
In general, the adaptive Hamiltonian Monte Carlo sampler should be used for final inference. Nonetheless, fitting these models using HMC is often computationally demanding, and variational Bayes can often be fruitful for quickly iterating models.
# \donttest{ library(EpiEstim) data("Flu1918") date <- as.Date("1918-01-01") + seq(0, along.with = c(NA, Flu1918$incidence)) data <- data.frame( city = "Baltimore", cases = c(NA, Flu1918$incidence), date = date, week = lubridate::week(date) ) rt <- epirt( formula = R(city, date) ~ rw(time = week, prior_scale = 0.1), prior_intercept = rstanarm::normal(log(2), 0.2), link = 'log' ) obs <- epiobs( formula = cases ~ 1, prior_intercept = rstanarm::normal(location=1, scale=0.01), link = "identity", i2o = rep(.25,4) ) args <- list( rt = rt, inf = epiinf(gen = Flu1918$si_distr), obs = obs, data = data, algorithm = "fullrank", iter = 1e4, seed = 12345 ) fm <- do.call(epim, args)#> Chain 1: ------------------------------------------------------------ #> Chain 1: EXPERIMENTAL ALGORITHM: #> Chain 1: This procedure has not been thoroughly tested and may be unstable #> Chain 1: or buggy. The interface is subject to change. #> Chain 1: ------------------------------------------------------------ #> Chain 1: #> Chain 1: #> Chain 1: Rejecting initial value: #> Chain 1: Error evaluating the log probability at the initial value. #> Chain 1: Exception: neg_binomial_2_lpmf: Location parameter[1] is -7.96211e-06, but must be > 0! (in 'model_epidemia_base' at line 154) #> #> Chain 1: #> Chain 1: Gradient evaluation took 0.000284 seconds #> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.84 seconds. #> Chain 1: Adjust your expectations accordingly! #> Chain 1: #> Chain 1: #> Chain 1: Begin eta adaptation. #> Chain 1: Iteration: 1 / 250 [ 0%] (Adaptation) #> Chain 1: Iteration: 50 / 250 [ 20%] (Adaptation) #> Chain 1: Iteration: 100 / 250 [ 40%] (Adaptation) #> Chain 1: Iteration: 150 / 250 [ 60%] (Adaptation) #> Chain 1: Iteration: 200 / 250 [ 80%] (Adaptation) #> Chain 1: Iteration: 250 / 250 [100%] (Adaptation) #> Chain 1: Success! Found best value [eta = 0.1]. #> Chain 1: #> Chain 1: Begin stochastic gradient ascent. #> Chain 1: iter ELBO delta_ELBO_mean delta_ELBO_med notes #> Chain 1: 100 -5860.749 1.000 1.000 #> Chain 1: 200 -4207.575 0.696 1.000 #> Chain 1: 300 -2650.502 0.660 0.587 #> Chain 1: 400 -1441.156 0.705 0.839 #> Chain 1: 500 -1216.215 0.601 0.587 #> Chain 1: 600 -1209.856 0.502 0.587 #> Chain 1: 700 -1127.632 0.440 0.393 #> Chain 1: 800 -910.625 0.415 0.393 #> Chain 1: 900 -974.331 0.376 0.238 #> Chain 1: 1000 -813.998 0.358 0.238 #> Chain 1: 1100 -906.963 0.269 0.197 #> Chain 1: 1200 -811.175 0.241 0.185 #> Chain 1: 1300 -704.472 0.197 0.151 #> Chain 1: 1400 -912.199 0.136 0.151 #> Chain 1: 1500 -638.591 0.161 0.151 #> Chain 1: 1600 -689.650 0.168 0.151 #> Chain 1: 1700 -624.428 0.171 0.151 #> Chain 1: 1800 -670.520 0.154 0.118 #> Chain 1: 1900 -677.843 0.148 0.118 #> Chain 1: 2000 -626.944 0.137 0.104 #> Chain 1: 2100 -546.190 0.141 0.118 #> Chain 1: 2200 -548.548 0.130 0.104 #> Chain 1: 2300 -586.539 0.121 0.081 #> Chain 1: 2400 -569.394 0.101 0.074 #> Chain 1: 2500 -586.787 0.062 0.069 #> Chain 1: 2600 -578.063 0.056 0.065 #> Chain 1: 2700 -566.092 0.047 0.030 #> Chain 1: 2800 -553.299 0.043 0.030 #> Chain 1: 2900 -587.195 0.047 0.030 #> Chain 1: 3000 -536.875 0.049 0.030 #> Chain 1: 3100 -603.497 0.045 0.030 #> Chain 1: 3200 -550.315 0.054 0.058 #> Chain 1: 3300 -538.754 0.050 0.030 #> Chain 1: 3400 -552.282 0.049 0.030 #> Chain 1: 3500 -530.377 0.051 0.041 #> Chain 1: 3600 -518.013 0.051 0.041 #> Chain 1: 3700 -519.449 0.050 0.041 #> Chain 1: 3800 -618.422 0.063 0.058 #> Chain 1: 3900 -533.789 0.073 0.094 #> Chain 1: 4000 -486.536 0.074 0.097 #> Chain 1: 4100 -526.909 0.070 0.077 #> Chain 1: 4200 -550.408 0.065 0.043 #> Chain 1: 4300 -517.502 0.069 0.064 #> Chain 1: 4400 -488.755 0.073 0.064 #> Chain 1: 4500 -580.766 0.084 0.077 #> Chain 1: 4600 -503.138 0.097 0.097 #> Chain 1: 4700 -535.972 0.103 0.097 #> Chain 1: 4800 -507.943 0.093 0.077 #> Chain 1: 4900 -485.150 0.081 0.064 #> Chain 1: 5000 -493.939 0.074 0.061 #> Chain 1: 5100 -469.627 0.071 0.059 #> Chain 1: 5200 -577.918 0.086 0.061 #> Chain 1: 5300 -469.447 0.102 0.061 #> Chain 1: 5400 -468.974 0.097 0.061 #> Chain 1: 5500 -500.700 0.087 0.061 #> Chain 1: 5600 -493.748 0.073 0.055 #> Chain 1: 5700 -492.404 0.067 0.052 #> Chain 1: 5800 -488.857 0.062 0.047 #> Chain 1: 5900 -471.381 0.061 0.037 #> Chain 1: 6000 -469.551 0.060 0.037 #> Chain 1: 6100 -442.325 0.061 0.037 #> Chain 1: 6200 -477.826 0.050 0.037 #> Chain 1: 6300 -485.489 0.028 0.016 #> Chain 1: 6400 -478.790 0.029 0.016 #> Chain 1: 6500 -461.772 0.027 0.016 #> Chain 1: 6600 -472.519 0.028 0.023 #> Chain 1: 6700 -455.661 0.031 0.037 #> Chain 1: 6800 -491.113 0.038 0.037 #> Chain 1: 6900 -441.741 0.045 0.037 #> Chain 1: 7000 -447.105 0.046 0.037 #> Chain 1: 7100 -458.586 0.042 0.037 #> Chain 1: 7200 -454.954 0.036 0.025 #> Chain 1: 7300 -457.807 0.035 0.025 #> Chain 1: 7400 -469.904 0.036 0.026 #> Chain 1: 7500 -436.675 0.040 0.026 #> Chain 1: 7600 -474.067 0.045 0.037 #> Chain 1: 7700 -440.314 0.049 0.072 #> Chain 1: 7800 -470.366 0.048 0.064 #> Chain 1: 7900 -458.155 0.040 0.027 #> Chain 1: 8000 -440.942 0.043 0.039 #> Chain 1: 8100 -467.038 0.046 0.056 #> Chain 1: 8200 -451.515 0.048 0.056 #> Chain 1: 8300 -434.668 0.052 0.056 #> Chain 1: 8400 -447.735 0.052 0.056 #> Chain 1: 8500 -435.657 0.047 0.039 #> Chain 1: 8600 -450.502 0.043 0.039 #> Chain 1: 8700 -438.568 0.038 0.034 #> Chain 1: 8800 -467.144 0.037 0.034 #> Chain 1: 8900 -482.755 0.038 0.034 #> Chain 1: 9000 -425.383 0.047 0.034 #> Chain 1: 9100 -447.279 0.047 0.034 #> Chain 1: 9200 -426.487 0.048 0.039 #> Chain 1: 9300 -436.099 0.047 0.033 #> Chain 1: 9400 -436.387 0.044 0.033 #> Chain 1: 9500 -469.269 0.048 0.049 #> Chain 1: 9600 -441.920 0.051 0.049 #> Chain 1: 9700 -440.928 0.048 0.049 #> Chain 1: 9800 -460.802 0.046 0.049 #> Chain 1: 9900 -454.914 0.045 0.049 #> Chain 1: 10000 -439.420 0.035 0.043 #> Chain 1: Informational Message: The maximum number of iterations is reached! The algorithm may not have converged. #> Chain 1: This variational approximation is not guaranteed to be meaningful. #> Chain 1: #> Chain 1: Drawing a sample of size 1000 from the approximate posterior... #> Chain 1: COMPLETED.#> Warning: Pareto k diagnostic value is 2.81. Resampling is disabled. Decreasing tol_rel_obj may help if variational algorithm has terminated prematurely. Otherwise consider using sampling instead.# }