During topic learning, one needs to supply `W: int`, the size of the vocabulary.
I tried to work out the meaning of `W` by reading Algorithm 1 (the Gibbs sampling algorithm for BTM) in the paper "BTM: Topic Modeling over Short Texts", but `W` is not an input there. It looks data-dependent to me, so am I correct in assuming `W` means the number of unique terms in the cleaned and preprocessed corpus? If so, is there a reason `W` is not calculated automatically from the corpus `docs_pt`? I'm afraid I am missing something, hence my question.
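For context, here is how I would expect `W` to be derived myself. This is only a sketch of my assumption: it treats `docs_pt` as a plain-text file with one document per line and whitespace-separated tokens, which may not match the library's actual input format.

```python
def vocab_size(docs_path: str) -> int:
    """Count unique terms in a corpus file.

    Assumes one document per line, tokens separated by
    whitespace (my guess at the docs_pt format).
    """
    vocab = set()
    with open(docs_path) as f:
        for line in f:
            vocab.update(line.split())
    return len(vocab)
```

If this matches what `W` is supposed to be, it seems like the library could compute it internally instead of requiring it as a parameter.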
Thank you.