LDA_set_user_seeds.Rd
This is a modification of the function LDATS::LDA_set that lets the user choose the seed for the LDA. RMD originally added this function in a branch of weecology/LDATS, but using it from there requires installing that branch version of LDATS. Because the function is necessary for this package, it has been ported over as part of cvlt. This means cvlt can depend on the CRAN version of LDATS.
From the LDATS documentation:
For a given dataset consisting of counts of words across
multiple documents in a corpus, conduct multiple Latent Dirichlet
Allocation (LDA) models (using the Variational Expectation
Maximization (VEM) algorithm; Blei et al. 2003) to account for (1)
uncertainty in the number of latent topics and (2) the impact of initial
values in the estimation procedure.
LDA_set is a list wrapper of LDA in the topicmodels package (Grun and Hornik 2011).
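To illustrate what "a list wrapper of LDA" with a user-chosen seed might look like, here is a minimal sketch. It is not the LDATS or cvlt source: the helper name `fit_lda_list`, the naming scheme for the list elements, and the simulated counts are all assumptions made for illustration. The only external call is `topicmodels::LDA()` with `method = "VEM"` and a `seed` entry in `control`.

```r
# Hypothetical sketch, not the LDATS or cvlt implementation:
# fit one VEM LDA per requested number of topics, all with the same
# user-supplied seed, and return the fits in a named list.
fit_lda_list <- function(dtm, topics = 2, seed = 1) {
  fits <- lapply(topics, function(k) {
    topicmodels::LDA(dtm, k = k, method = "VEM",
                     control = list(seed = as.integer(seed)))
  })
  names(fits) <- paste0("k: ", topics, ", seed: ", seed)
  fits
}

# Only run the example if topicmodels is installed.
if (requireNamespace("topicmodels", quietly = TRUE)) {
  set.seed(42)
  # Simulated counts: 20 documents by 6 terms; adding 1 avoids empty rows.
  dtm <- matrix(rpois(120, lambda = 3) + 1L, nrow = 20,
                dimnames = list(NULL, paste0("term", 1:6)))
  fits <- fit_lda_list(dtm, topics = 2:3, seed = 1)
  names(fits)
}
```

Fixing the seed this way means that refitting with the same data, number of topics, and seed should start the VEM estimation from the same initial values.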
check_LDA_set_inputs checks that all of the inputs are proper for LDA_set (that the table of observations is conformable to a matrix of integers, the number of topics is an integer, the number of seeds is an integer and the controls list is proper).
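The kinds of checks listed above can be sketched in base R as follows. This is an illustration only, not the LDATS implementation of check_LDA_set_inputs: the helper names `check_inputs_sketch` and `is_whole`, the error messages, and the simplification of "the controls list is proper" to "is a list" are assumptions.

```r
# Hypothetical base-R sketch of the input checks described above.
check_inputs_sketch <- function(document_term_table, topics = 2, seed = 1,
                                control = list()) {
  # TRUE when x is numeric, has no missing values, and holds whole numbers.
  is_whole <- function(x) {
    is.numeric(x) && all(!is.na(x)) && all(x %% 1 == 0)
  }

  dtm <- as.matrix(document_term_table)
  if (!is_whole(dtm)) {
    stop("document_term_table is not conformable to a matrix of integers")
  }
  if (!is_whole(topics)) {
    stop("topics must be integer-valued")
  }
  if (!is_whole(seed)) {
    stop("seed must be integer-valued")
  }
  if (!is.list(control)) {
    stop("control must be a list")
  }
  invisible(NULL)
}

counts <- data.frame(a = c(3, 0, 5), b = c(1, 4, 2))
check_inputs_sketch(counts, topics = 2:3, seed = 1)  # passes silently
```

As in the description, the sketch throws an error for an improper input and otherwise returns NULL (invisibly here, which is a choice of this sketch).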
LDA_set_user_seeds(document_term_table, topics = 2, seed = 1, control = list())
Argument | Description |
---|---|
document_term_table | Table of observation count data (rows: documents, columns: terms). May be a class |
topics | Vector of the number of topics to evaluate for each model. Must be conformable to |
seed | Seed to use for each value of |
control | A |
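A possible call, using the argument names and default values from the usage line above, might look like the sketch below. It assumes that cvlt is installed and exports LDA_set_user_seeds, and it uses the `rodents` example data shipped with LDATS as input; both are choices made for this illustration rather than something this page specifies.

```r
# Hedged usage sketch: runs only if both packages are available.
if (requireNamespace("cvlt", quietly = TRUE) &&
    requireNamespace("LDATS", quietly = TRUE)) {
  # Example document-term counts from LDATS (assumed input for illustration).
  data(rodents, package = "LDATS")

  ldas <- cvlt::LDA_set_user_seeds(
    document_term_table = rodents$document_term_table,
    topics = 2,
    seed = 1
  )

  # Inspect what came back.
  class(ldas)
  length(ldas)
}
```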
LDA_set: list (class: LDA_set) of LDA models (class: LDA_VEM).
check_LDA_set_inputs: an error message is thrown if any input is improper, otherwise NULL.
Blei, D. M., A. Y. Ng, and M. I. Jordan. 2003. Latent Dirichlet Allocation. Journal of Machine Learning Research 3:993-1022.
Grun, B. and K. Hornik. 2011. topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software 40:13.