Follow-up to Dan's KDD paper on Mixture of Multinomials involving convergence diagnostics. (This one is for Dan.)
Co-clustering paper explaining a common framework underpinning LDA and the Author-Topic models.
LDA with credible intervals, or at least a multi-sample technique, for estimating P(w|z)
Label switching analysis on LDA
Length modeling
Surveying Gibbs samplers (possibly for MM or for LDA):
    collapsed v. non-collapsed
    block v. non-block sampler
    random scan
    non-collapsed: continuous variables – diagnose using traditional convergence diagnostics
Keyword identification for documents – build on your work for Data Mining group project involving LDA topics
Sampling N_d (# of tokens in doc. d) is equivalent to choosing a feature selector inside the inference loop
Comparison of Bayesian model selection and non-parametric prior
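
Sketch for the "traditional convergence diagnostics" item under the non-collapsed samplers: one candidate is the Gelman-Rubin potential scale reduction factor (R-hat), computed on a single continuous quantity tracked across several independent chains. This is only an illustration, not a chosen method. The function name `gelman_rubin` and the `chains` argument are hypothetical, and it assumes the tracked quantity is comparable across chains (i.e. label switching has already been dealt with).

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for one scalar quantity.

    chains: list of equal-length lists of post-burn-in draws, one list per
    independent chain. (Hypothetical interface, for illustration only.)
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2:
        raise ValueError("need at least 2 chains with at least 2 draws each")
    if any(len(c) != n for c in chains):
        raise ValueError("all chains must have the same length")

    # W: average of the within-chain sample variances
    w = mean(variance(c) for c in chains)
    if w == 0:
        raise ValueError("within-chain variance is zero; R-hat is undefined")

    # B: n times the sample variance of the per-chain means
    b = n * variance([mean(c) for c in chains])

    # Pooled estimate of the marginal posterior variance
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5


if __name__ == "__main__":
    # Toy draws, made up for illustration: two chains stuck in different regions
    stuck = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
    print(gelman_rubin(stuck))
```

Values near 1 suggest the chains agree on that quantity; values well above 1 suggest they have not mixed. A collapsed sampler has no continuous variables to feed in directly, so some derived scalar would have to be tracked instead.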