Run LDATS on a dataset using cross-validation — fit_ldats

Top-level wrapper to subset a dataset and fit LDATS to the subsets.

fit_ldats_crossval(
  dataset,
  buffer = 2,
  k,
  lda_seed,
  cpts,
  nit,
  cpt_seed = NULL,
  return_full = FALSE,
  return_fits = FALSE,
  summarize_ll = TRUE
)

Arguments

dataset	MATSS-style dataset (list with `$abundance` and `$covariates`)
buffer	number of timesteps to withhold on either side of the test timestep, default 2
k	integer number of topics
lda_seed	integer seed to use to run LDA
cpts	integer number of changepoints
nit	integer number of iterations for the changepoint model. 100 is fast but will not find the global optimum; 1000 gets closer but takes longer.
cpt_seed	integer seed to use for the changepoint model. If NULL (default), a seed is drawn at random.
return_full	logical, whether to return the model objects or just the loglikelihoods. Returning the objects uses a lot of memory. Defaults to FALSE.
return_fits	logical. If TRUE returns list with fits. If FALSE (default) returns a dataframe of model info and the loglikelihood estimated as the sum (over all test steps) of the mean loglikelihood (over all iterations) for each step.
summarize_ll	logical. If TRUE, summary dataframe will have only one row of model info and the loglikelihood estimated as the sum (over all test steps) of the mean loglikelihood (over all iterations) for each step. If FALSE, summary dataframe will return all time steps. FALSE useful for diagnostics.
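
The effect of `buffer` can be illustrated with a minimal base R sketch. It assumes each subset tests a single timestep and drops `buffer` timesteps on either side of it from the training data, as the argument description suggests. `split_with_buffer` is a hypothetical helper written for this illustration, not a function from the package, and the package's own subsetting may differ in detail.

```r
# Hypothetical helper (not part of the package): given the number of
# timesteps and one test timestep, return the training timesteps that
# remain after withholding the test step and `buffer` steps on each side.
split_with_buffer <- function(n_timesteps, test_step, buffer = 2) {
  all_steps <- seq_len(n_timesteps)
  # Steps within `buffer` of the test step (including the test step itself)
  withheld <- all_steps[abs(all_steps - test_step) <= buffer]
  list(
    test = test_step,
    train = setdiff(all_steps, withheld)
  )
}

# With 10 timesteps, test step 5 and buffer 2, steps 3 to 7 are withheld.
idx <- split_with_buffer(n_timesteps = 10, test_step = 5, buffer = 2)
idx$train
# 1 2 8 9 10
```

Near the start or end of the series the withheld window is simply truncated, so fewer timesteps are dropped there.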

Value

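If return_fits = TRUE, a list of lists; each element is the result of running ldats_subset_one on one subset of the original dataset. If return_fits = FALSE, a dataframe of model info and loglikelihoods: with summarize_ll = TRUE, a single row holding the sum (over all test steps) of the mean loglikelihood (over all iterations) for each step; with summarize_ll = FALSE, the mean loglikelihood (over all iterations) for every test step.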
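
The summarized loglikelihood is a sum over test steps of per-step means over iterations. The base R sketch below works through that arithmetic on made-up numbers. The column names `test_step`, `iteration`, and `loglik` are assumptions chosen for the illustration; the columns of the dataframe actually returned are not documented here.

```r
# Made-up loglikelihoods: one row per (test step, iteration) pair.
# Column names are hypothetical, chosen only for this illustration.
ll <- data.frame(
  test_step = rep(1:3, each = 2),
  iteration = rep(1:2, times = 3),
  loglik    = c(-10, -12, -8, -10, -9, -11)
)

# Mean over iterations within each test step: the per-step view,
# comparable to what summarize_ll = FALSE is described as returning.
per_step <- aggregate(loglik ~ test_step, data = ll, FUN = mean)

# Sum of the per-step means: the single value described for
# summarize_ll = TRUE.
total_ll <- sum(per_step$loglik)
total_ll
# -30
```

Keeping the per-step means around is what makes the unsummarized form useful for diagnostics: an unusually low value at one test step stands out before it is folded into the total.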