= TODO =

=== Paper Ideas ===
[Added 8 May 2008]
* Follow-up to Dan's KDD paper on Mixture of Multinomials, involving convergence diagnostics. (This one is for Dan.)
* Co-clustering paper explaining a common framework underpinning LDA and the Author-Topic models.
* LDA with credible intervals, or at least a multi-sample technique for estimating P(w|z)
* Label-switching analysis on LDA
* Length modeling
* Surveying Gibbs samplers (possibly for MM or for LDA) -- see the collapsed-sampler sketch under Code Sketches below:
** collapsed vs. non-collapsed
** block vs. non-block sampler
** random scan
** non-collapsed: continuous variables -- diagnose using traditional convergence diagnostics
* Keyword identification for documents -- build on your work for the Data Mining group project involving LDA topics
* Sampling N_d (the number of tokens in document d) is equivalent to choosing a feature selector in the inference loop
** choosing the dimensionality / cut-off for a fixed feature selector
* Comparison of Bayesian model selection and a non-parametric prior

=== ALFA ===
* Re-read E & D. Specifically, look for values we can use for candidate, batch, and other parameter settings.
** Re-run all experiments using the best values
* Redo the cost model in R
* Code to compute cost (derived columns)
* QBC: fix sampling; code review / vote entropy
* "Human Waits" active learning (the human only labels 1)
** Batch, but based on previous annotations
* We want the submitted job to run the revision that was current at submission (could pass the revision number, or could set aside binaries)
* Write out the XML file without results at the beginning; dump whatever results are available on SIGKILL before terminating
* Wrap the ant script in a shell script/Python so that it can trap SIGKILL
* Fast Maxent uses a prior
* Phase in cutoffs

=== Other ===
* Thesis!!!!
* Prepare slides for the Bayesian reading group
* MI for POS-Tagging
* Add MI to the LID pipeline

= Fellowships =
* BYU Graduate Research Fellowship Award: http://www.byu.edu/gradstudies/index.php?action=resources.fellowship&fellowshipid=57
** Application due around January 2008

= Current Work =
* LDA and co-clustering
* Mutual-Information feature selector for POS-Tagging (see the MI sketch under Code Sketches below)
* Incorporate the Mutual-Information feature selector into LID.

== Semi-supervised + Maxent ==
* Research work done at BBN on speech N-best lists by Schwartz, Nguyen, and Austin
* Research work done by Robert Moore on estimating probability distributions using n-best lists

= Brainstorming =
[[User:Rah67/Brainstorming|Brainstorming List]]
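
= Code Sketches =

== Collapsed Gibbs sampler for LDA ==
A minimal sketch of what the "collapsed" option in the Gibbs-sampler survey item refers to, assuming a standard LDA setup with symmetric Dirichlet priors. All names here (docs, n_topics, alpha, beta, and the function itself) are illustrative assumptions, not part of any existing codebase; docs is assumed to be a list of lists of integer word ids.

<pre>
# Collapsed Gibbs sampling for LDA: theta and phi are integrated out,
# and only the per-token topic assignments z are resampled.
import random

def collapsed_gibbs_lda(docs, vocab_size, n_topics, alpha=0.1, beta=0.01, n_iters=200):
    # Count tables: document-topic, topic-word, and topic totals.
    ndk = [[0] * n_topics for _ in docs]
    nkw = [[0] * vocab_size for _ in range(n_topics)]
    nk = [0] * n_topics
    z = []  # topic assignment for every token

    # Random initialization of the topic assignments.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = random.randrange(n_topics)
            z[d].append(k)
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                ndk[d][k] -= 1
                nkw[k][w] -= 1
                nk[k] -= 1
                # Full conditional P(z_di = t | z_-di, w), up to a constant.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                # Draw a new topic proportional to the weights.
                r = random.random() * sum(weights)
                acc = 0.0
                for t, wt in enumerate(weights):
                    acc += wt
                    if r <= acc:
                        k = t
                        break
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][w] += 1
                nk[k] += 1

    # Point estimate of P(w|z); averaging this over several retained samples
    # would give the multi-sample estimate mentioned in the paper ideas.
    phi = [[(nkw[t][w] + beta) / (nk[t] + vocab_size * beta) for w in range(vocab_size)]
           for t in range(n_topics)]
    return phi, z
</pre>

A non-collapsed sampler would instead draw theta and phi explicitly as continuous variables, which is what makes the traditional convergence diagnostics mentioned above applicable.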
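
== Mutual-information feature scoring ==
For the MI feature-selector items (POS-Tagging and the LID pipeline), a minimal sketch of scoring binary feature indicators by mutual information with the label, I(F; Y), estimated from counts. The data layout (a list of (feature_set, label) pairs) and the function name are assumptions, not part of the existing pipeline.

<pre>
import math
from collections import Counter

def mi_scores(examples):
    """examples: list of (set_of_features, label) pairs -> {feature: I(F; Y)}."""
    n = float(len(examples))
    label_counts = Counter(label for _, label in examples)
    feat_counts = Counter()
    joint_counts = Counter()  # (feature, label) -> number of examples with both
    for feats, label in examples:
        for f in set(feats):
            feat_counts[f] += 1
            joint_counts[(f, label)] += 1

    scores = {}
    for f, nf in feat_counts.items():
        mi = 0.0
        for y, ny in label_counts.items():
            n11 = joint_counts[(f, y)]  # feature present, label == y
            n01 = ny - n11              # feature absent, label == y
            # I(F; Y) = sum over the (present/absent, label) cells of
            # p(f, y) * log( p(f, y) / (p(f) * p(y)) ).
            for n_fy, n_f in ((n11, nf), (n01, n - nf)):
                if n_fy > 0 and n_f > 0:
                    mi += (n_fy / n) * math.log((n_fy * n) / (n_f * ny))
        scores[f] = mi
    return scores

# Example: keep the 1000 highest-scoring features.
# scores = mi_scores(training_examples)
# keep = sorted(scores, key=scores.get, reverse=True)[:1000]
</pre>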