TODO
Paper Ideas
[Added 8 May 2008]
Follow-up to Dan's KDD paper on Mixture of Multinomials involving convergence diagnostics. (This one is for Dan.)
Co-clustering paper explaining a common framework underpinning the LDA and Author-Topic models.
LDA with credible intervals or at least a multi-sample technique for estimating P(w|z)
Label switching analysis on LDA
Length modeling
Surveying Gibbs samplers (possibly for MM or for LDA):
collapsed v. non-collapsed
block v. non-block sampler
random scan
non-collapsed: continuous variables – diagnose using traditional convergence diagnostics
Keyword identification for documents – build on your work for Data Mining group project involving LDA topics
Sampling N_d (# of tokens in doc. d) == choosing feature selector in the inference loop
Comparison of Bayesian model selection and non-parametric prior
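For the "traditional convergence diagnostics" mentioned under the non-collapsed sampler item, one candidate is the Gelman-Rubin potential scale reduction factor. The notes do not say which diagnostic is intended, so this is only an illustrative sketch. It assumes several independent chains of equal length, each recording one scalar continuous quantity (for example, a single component of a sampled multinomial parameter); the function name gelman_rubin is hypothetical.

```python
from math import sqrt
from statistics import fmean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for one scalar quantity.

    chains: list of m equal-length lists, one per independent chain,
    each holding n post-burn-in draws of the same scalar.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")

    chain_means = [fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = fmean([variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    # Undefined (division by zero) if every chain is constant.
    return sqrt(pooled / within)
```

Values close to 1 are consistent with the chains having mixed; values well above 1 suggest they have not. This applies only to continuous quantities, which is why the note pairs it with the non-collapsed sampler.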
ALFA
Redoing cost model in R
Code to compute cost (derived columns)
QBC: fix sampling, code review/Vote entropy
“Human Waits” Active learning (human only labels 1)
We want the submitted job to run the revision that was current at submission (could pass revision number, or could set aside binaries)
Write out the XML file without results at the beginning. On a termination signal (SIGTERM; SIGKILL cannot be caught), dump whatever results exist before terminating
Wrap the ant script in a shell or Python script so that it can trap the termination signal
Fast Maxent uses prior
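The wrapper idea above (write the XML skeleton first, then dump partial results when the job is told to stop) could be sketched in Python roughly as follows. SIGKILL itself cannot be caught by any process, so this sketch traps SIGTERM instead. The names write_results and run_wrapped, the XML layout, and the results dictionary are all hypothetical; how results actually reach the wrapper is left open.

```python
import signal
import subprocess
import sys
import xml.etree.ElementTree as ET


def write_results(path, results, complete):
    """Write whatever results exist so far to an XML file (hypothetical layout)."""
    root = ET.Element("run", complete=str(complete).lower())
    for name, value in results.items():
        ET.SubElement(root, "result", name=name).text = str(value)
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def run_wrapped(command, path, results):
    """Run command as a child process, keeping the XML file up to date."""
    # Write the file up front so it exists even if the job dies early.
    write_results(path, results, complete=False)
    child = subprocess.Popen(command)

    def on_term(signum, frame):
        # Forward the request to the child, then dump partial results.
        child.terminate()
        child.wait()
        write_results(path, results, complete=False)
        sys.exit(128 + signum)

    # Must be called from the main thread.
    signal.signal(signal.SIGTERM, on_term)
    returncode = child.wait()
    write_results(path, results, complete=(returncode == 0))
    return returncode
```

A call might look like run_wrapped(["ant", "run"], "results.xml", results), where the ant target name is an assumption. If the scheduler sends only SIGKILL, no wrapper can help, and the job would need to flush results periodically instead.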
Other
Fellowships
Current Work
Semi-supervised + Maxent
Research work done at BBN on speech N-best lists from Schwartz, Nguyen, and Austin
Research work done by Robert Moore on estimating probability distributions using n-best lists
Brainstorming