Top-level wrapper to subset a dataset and fit LDATS to the subsets.

fit_ldats_crossval(
  dataset,
  buffer = 2,
  k,
  lda_seed,
  cpts,
  nit,
  cpt_seed = NULL,
  return_full = FALSE,
  return_fits = FALSE,
  summarize_ll = TRUE
)

Arguments

dataset

MATSS-style dataset (list with $abundance and $covariates)

buffer

number of timesteps to withhold on either side of the test timestep. Defaults to 2.
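As a rough illustration of what the buffer does, the sketch below lists the timesteps that would remain available for training when one timestep is held out for testing. `train_timesteps` is a hypothetical helper written for this example and is not part of the package. It assumes the test timestep and its buffer are simply dropped, and that the buffer is truncated at the ends of the series. The package's own subsetting may handle edges differently.

```r
# Hypothetical helper (not a package function): returns the timesteps left for
# training after withholding the test timestep and `buffer` steps on each side.
train_timesteps <- function(n_timesteps, test_timestep, buffer = 2) {
  all_steps <- seq_len(n_timesteps)
  # Withhold every step within `buffer` of the test timestep (inclusive).
  withheld <- all_steps[abs(all_steps - test_timestep) <= buffer]
  setdiff(all_steps, withheld)
}

# With 10 timesteps, test timestep 5 and buffer = 2, steps 3 to 7 are withheld.
train_timesteps(n_timesteps = 10, test_timestep = 5, buffer = 2)
# keeps timesteps 1, 2, 8, 9, 10
```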

k

integer number of topics

lda_seed

integer seed to use to run LDA

cpts

integer number of changepoints

nit

integer number of iterations for the changepoint model. 100 is fast but will not find the global optimum; 1000 gets closer but takes longer.

cpt_seed

integer seed to use for the changepoint model. If NULL (default), a seed is drawn at random.

return_full

logical. Whether to return the fitted model objects or just the loglikelihoods. Returning the objects uses a lot of memory. Defaults to FALSE.

return_fits

logical. If TRUE, returns a list of fits. If FALSE (default), returns a dataframe of model info and the loglikelihood, estimated as the sum (over all test steps) of the mean loglikelihood (over all iterations) for each step.

summarize_ll

logical. If TRUE, the summary dataframe has a single row containing the model info and the loglikelihood, estimated as the sum (over all test steps) of the mean loglikelihood (over all iterations) for each step. If FALSE, the summary dataframe keeps a row for every time step, which is useful for diagnostics.
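The summed loglikelihood described above can be worked through by hand. The sketch below uses made-up values and assumes the loglikelihoods are arranged as a matrix with one row per test timestep and one column per iteration. That layout is chosen for illustration only and is not necessarily how the package stores them.

```r
# Made-up loglikelihoods: 3 test timesteps (rows) x 4 iterations (columns).
ll <- matrix(c(-10, -12, -11, -11,
               -20, -22, -21, -21,
               -30, -32, -31, -31),
             nrow = 3, byrow = TRUE)

# Mean over iterations for each test timestep: one value per row.
per_step <- rowMeans(ll)   # -11, -21, -31

# Sum of the per-step means: the single summarized value.
total <- sum(per_step)     # -63
```

Under these assumptions, `per_step` corresponds to the per-timestep values reported when summarize_ll = FALSE, and `total` to the single value reported when summarize_ll = TRUE.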

Value

If return_fits = TRUE, a list of lists, where each element is the result of running ldats_subset_one on one subset of the original dataset. If return_fits = FALSE, a dataframe of model info and the mean loglikelihood (over all iterations) for each step, collapsed to a single summed row when summarize_ll = TRUE.