 += TODO =
 +=== Paper Ideas ===
 +[Added 8 May 2008]
 +* Follow-up to Dan's KDD paper on Mixture of Multinomials involving convergence diagnostics. (This one is for Dan.)
 +* Co-clustering paper explaining a common framework underpinning LDA and the Author-Topic models.
 +* LDA with credible intervals or at least a multi-sample technique for estimating P(w|z)
 +* Label switching analysis on LDA
 +* Length modeling
 +* Surveying Gibbs samplers (possibly for MM or for LDA):
 +** collapsed vs. non-collapsed
 +** block vs. non-block sampler
 +** random scan
 +** non-collapsed: continuous variables -- diagnose using traditional convergence diagnostics
 +* Keyword identification for documents -- build on your work for the Data Mining group project involving LDA topics
 +* Sampling N_d (# of tokens in doc. d) == choosing feature selector in the inference loop
 +** choosing dimensionality / cut-off for a fixed feature selector
 +* Comparison of Bayesian model selection and non-parametric prior
 +=== ALFA ===
 +* Re-read E & D. Specifically look for values we can use for candidate, batch, and other parameter settings
 +** Re-run all experiments using best values
 +* Redoing cost model in R
 +* Code to compute cost (derived columns)
 +* QBC: fix sampling, code review/Vote entropy
 +* "Human Waits" Active learning (human only labels 1)
 +** Batch but based on previous annotations
 +* We want the submitted job to run the revision that was current at submission (could pass revision number, or could set aside binaries)
 +* Write out the XML file without results at the beginning. Dump the results collected so far on SIGTERM before terminating (SIGKILL itself cannot be caught)
 +* Wrap the ant script in a shell/python script so that it can trap the termination signal
 +* Fast Maxent uses prior
 +* Phase in cutoffs
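For the QBC item above, a minimal sketch of vote entropy may be useful as a reference point during the code review. It assumes each committee member casts one hard vote per instance; the function name `vote_entropy` and the list-of-labels input are illustrative assumptions, not the existing ALFA code.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Vote entropy for one instance.

    `votes` is a list with one predicted label per committee member
    (hypothetical input format). Returns -sum_c (V(c)/K) * log(V(c)/K),
    where V(c) is the number of votes for label c and K is the committee size.
    """
    if not votes:
        raise ValueError("need at least one committee vote")
    k = len(votes)
    counts = Counter(votes)
    # Labels with zero votes contribute nothing, so only observed labels are summed.
    return -sum((c / k) * math.log(c / k) for c in counts.values())
```

Higher values mean more disagreement in the committee, so instances would be ranked by this score in descending order. This sketch does not normalize by the log of the number of labels, which some formulations do.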
 +=== Other ===
 +* Thesis!!!!
 +* Prepare slides for Bayesian reading group
 +* MI for POS-Tagging
 +* Add MI to LID pipeline
 += Fellowships =
 +* BYU Graduate Research Fellowship Award: http://www.byu.edu/gradstudies/index.php?action=resources.fellowship&fellowshipid=57
 +** Application due around January 2008
 += Current Work =
 +* LDA and Co-clustering
 +* Mutual-Information feature selector for POS-Tagging
 +* Incorporate Mutual-Information feature selector into LID.
 +== Semi-supervised + Maxent ==
 +* Research work done at BBN on speech N-best lists from Schwartz, Nguyen, and Austin
 +* Research work done by Robert Moore on estimating probability distributions using n-best lists
 += Brainstorming =
 +[[User:Rah67/Brainstorming|Brainstorming List]]
