===Week 1: Text Classification with Naive Bayes===
* "A Comparison of Event Models for Naive Bayes Text Classification", by Andrew McCallum and Kamal Nigam.   In AAAI/ICML-98 Workshop on Learning for Text Categorization, pp. 41-48. Technical Report WS-98-05. AAAI Press. 1998.  [http://www.kamalnigam.com/papers/multinomial-aaaiws98.pdf PDF].

* (optional) "Naive Bayes Text Classification: A Statistical Natural Language Processing Project", by Chris Monson [[media:nlp:Chris_Monson.pdf]].

===Week 2: Semi-Supervised Learning with Naive Bayes and Expectation Maximization===
* "Learning to Classify Text from Labeled and Unlabeled Documents", by Kamal Nigam, Andrew McCallum, Sebastian Thrun and Tom Mitchell. [http://www.kamalnigam.com/papers/emcat-aaai98.pdf PDF] (8 pages)

* (optional) "Text Classification from Labeled and Unlabeled Documents using EM", by Kamal Nigam, Andrew McCallum, Sebastian Thrun and Tom Mitchell.  Machine Learning, 39(2/3). pp. 103-134. 2000.  [http://www.kamalnigam.com/papers/emcat-mlj99.pdf PDF] (34 pages)

===Week 3: Text Classification with Maximum Entropy===
* "Using Maximum Entropy for Text Classification", by Kamal Nigam, John Lafferty, Andrew McCallum. [http://www.cs.cmu.edu/~knigam/papers/maxent-ijcaiws99.pdf PDF] (7 pages)

* (optional) "A Maximum Entropy Approach to Natural Language Processing", by Adam Berger, Vincent Della Pietra, Stephen Della Pietra. [http://acl.ldc.upenn.edu/J/J96/J96-1002.pdf PDF] (34 pages)

===Week 4: Feature Selection===
* The mutual information and log-likelihood ratio sections in Manning & Schütze, ''Foundations of Statistical Natural Language Processing'': sections 5.1-5.4.

* (optional) "A comparative study on feature selection for text categorization", by Yiming Yang and Jan Pedersen. [http://citeseer.ifi.unizh.ch/cache/papers/cs/1982/http:zSzzSzwww.cs.cmu.eduzSz~yimingzSzpapers.yyzSzml97.pdf/yang97comparative.pdf PDF]

===Week 5: Feature Selection in the Learning Loop===
* Focus on Section 4, which covers feature selection in the learning loop: "A Maximum Entropy Approach to Natural Language Processing", by Adam Berger, Vincent Della Pietra, Stephen Della Pietra. [http://acl.ldc.upenn.edu/J/J96/J96-1002.pdf PDF]

===Week 6: Feature Selection as Word Clustering===
* "Distributional Clustering of Words for Text Classification", by Douglas Baker and Andrew McCallum. [http://citeseer.ist.psu.edu/cache/papers/cs/6562/http:zSzzSzwww.cs.cmu.eduzSz~mccallumzSzpaperszSzclustering-sigir98s.pdf/baker98distributional.pdf PDF]

* (optional) An interesting read on a similar feature selection mechanism: [http://www.phil.uni-passau.de/linguistik/mitarbeiter/schneider/pub/acl2004.html HTML] [http://www.phil.uni-passau.de/linguistik/mitarbeiter/schneider/pub/acl2004.pdf PDF]

===Week 7: Text Classification with Support Vector Machines===
* Work through as much of the SVM Tutorial by Nello Cristianini as you can.  I don't expect you to get all the way through this.  Presentation slides from ICML 2001 Tutorial: [http://www.support-vector.net/icml-tutorial.pdf PDF]

* "Text Categorization with Support Vector Machines: Learning with Many Relevant Features", by Thorsten Joachims. [http://www.cs.cornell.edu/People/tj/publications/joachims_98a.pdf PDF]

----
Moving on to text clustering ...

===Weeks 8 & 9: Clustering with Naive Bayes===
* "An Experimental Comparison of Several Clustering and Initialization Methods", by Marina Meila and David Heckerman.  Try to fight through the whole thing. [http://research.microsoft.com/research/pubs/view.aspx?type=Technical%20Report&id=165 PS]

===Week 10: Bayesian Smoothing===
* "Bayesian smoothing through text classification", by Tom Griffiths. [http://nlp.stanford.edu/courses/cs224n/2001/gruffydd/smoothing.html HTML]

===Week 11: Going Beyond Naive Bayes===
* "Latent Dirichlet Allocation", by D. Blei, A. Ng, and M. Jordan.  This is dense.  Read as much of this as you can.  [http://www.cs.princeton.edu/~blei/papers/BleiNgJordan2003.pdf PDF]

* Blei's code is also available here: [http://www.cs.princeton.edu/~blei/lda-c/]

----
Extra reading:

===Clustering Email===
* "Inferring Ongoing Activities of Workstation Users by Clustering Email". [http://www.cs.cmu.edu/~hyifen/publication/EmailCluster04.pdf PDF]. Shorter version: [http://www.cs.cmu.edu/~hyifen/publication/CEAS2004.pdf PDF]

* "Automatic Discovery of Personal Topics To Organize Email", by Arun C. Surendran, John C. Platt and Erin Renshaw. Conference on Email and Anti-Spam, Stanford University, 21-22 July 2005. [http://research.microsoft.com/~acsuren/PersonalTopics.pdf PDF]

* "Restrictive Clustering and Metaclustering for Self-Organizing Document Collections". [http://doi.acm.org/10.1145/1008992.1009032 DOI]