This is a modification of the function LDATS::LDA_set that allows the user to choose the seed for the LDA. RMD originally added this function in a branch of weecology/LDATS, but using it from there requires installing that branch version of LDATS. Because the function is necessary for this package, it has been ported over as part of cvlt, which means cvlt can depend on the CRAN version of LDATS.
From LDATS documentation:
For a given dataset consisting of counts of words across
multiple documents in a corpus, conduct multiple Latent Dirichlet
Allocation (LDA) models (using the Variational Expectation
Maximization (VEM) algorithm; Blei et al. 2003) to account for (1)
uncertainty in the number of latent topics and (2) the impact of initial
values in the estimation procedure.
LDA_set is a list wrapper of LDA
in the topicmodels package (Grun and Hornik 2011).
check_LDA_set_inputs checks that all of the inputs
are proper for LDA_set (that the table of observations is
conformable to a matrix of integers, the number of topics is an integer,
the number of seeds is an integer and the controls list is proper).
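The checks described above can be sketched in base R. This is only an illustration of the kind of validation involved, not the LDATS implementation: the function name check_inputs_sketch and the error messages are hypothetical, and the real check_LDA_set_inputs may test more (or different) conditions.

```r
# Hypothetical sketch of the input checks; not the LDATS source code.
check_inputs_sketch <- function(document_term_table, topics = 2, seed = 1,
                                control = list()) {
  # The table must be coercible to a numeric matrix holding whole numbers.
  m <- as.matrix(document_term_table)
  if (!is.numeric(m) || anyNA(m) || any(m %% 1 != 0)) {
    stop("document_term_table is not conformable to a matrix of integers")
  }

  # Helper: TRUE when x is a non-empty numeric vector of whole numbers.
  is_whole <- function(x) {
    is.numeric(x) && length(x) > 0 && !anyNA(x) && all(x %% 1 == 0)
  }
  if (!is_whole(topics)) stop("topics must be conformable to integers")
  if (!is_whole(seed)) stop("seed must be conformable to integers")
  if (!is.list(control)) stop("control must be a list")

  # Mirror the documented behaviour: NULL when every input is acceptable.
  invisible(NULL)
}

# A small count table passes; a fractional number of topics does not.
check_inputs_sketch(matrix(1:6, nrow = 2), topics = 2:3, seed = 1)
bad <- tryCatch(check_inputs_sketch(matrix(1:6, nrow = 2), topics = 2.5),
                error = function(e) conditionMessage(e))
```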
LDA_set_user_seeds(document_term_table, topics = 2, seed = 1, control = list())
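As a rough illustration of what a user-seeded set of LDA models might look like, the sketch below fits one VEM model per combination of topics and seed by calling topicmodels::LDA directly. The function fit_lda_grid_sketch, the naming of the list elements, and the choice to let the user seed override any seed in control are assumptions made for this example; the actual LDA_set_user_seeds may organise its work differently.

```r
# Hypothetical sketch; not the cvlt or LDATS implementation.
fit_lda_grid_sketch <- function(document_term_table, topics = 2, seed = 1,
                                control = list()) {
  # topicmodels::LDA expects integer counts with no all-zero rows.
  dtm <- as.matrix(document_term_table)
  storage.mode(dtm) <- "integer"

  # One model per (number of topics, seed) combination.
  grid <- expand.grid(k = topics, seed = seed)
  fits <- lapply(seq_len(nrow(grid)), function(i) {
    ctrl <- control
    ctrl$seed <- grid$seed[i]  # assumption: user seed overrides control$seed
    topicmodels::LDA(dtm, k = grid$k[i], method = "VEM", control = ctrl)
  })
  names(fits) <- paste0("k: ", grid$k, ", seed: ", grid$seed)
  fits
}

# Toy usage, run only when topicmodels is installed.
if (requireNamespace("topicmodels", quietly = TRUE)) {
  set.seed(1)
  toy_counts <- matrix(rpois(60, lambda = 5) + 1L, nrow = 10,
                       dimnames = list(NULL, paste0("term", 1:6)))
  toy_fits <- fit_lda_grid_sketch(toy_counts, topics = c(2, 3), seed = 1)
}
```

Fixing the seed in control is what makes a VEM fit reproducible from run to run, which is the reason for exposing it to the user rather than generating it internally.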
| Argument | Description |
|---|---|
| document_term_table | Table of observation count data (rows: documents, columns: terms. May be a class |
| topics | Vector of the number of topics to evaluate for each model. Must be conformable to |
| seed | Seed to use for each value of |
| control | A |
LDA_set: list (class: LDA_set) of LDA models
(class: LDA_VEM).
check_LDA_set_inputs: an error message is thrown if any input is
improper, otherwise NULL.
Blei, D. M., A. Y. Ng, and M. I. Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research 3:993-1022.
Grun B. and K. Hornik. 2011. topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software 40:13.