nlp:readings

Table of Contents

Week 1: Text Classification with Naive Bayes
Week 2: Semi-Supervised Learning with Naive Bayes and Expectation Maximization
Week 3: Text Classification with Maximum Entropy
Week 4: Feature Selection
Week 5: Feature Selection in the Learning Loop
Week 6: Feature Selection as Word Clustering
Week 7: Text Classification with Support Vector Machines
Weeks 8 & 9: Clustering with Naive Bayes
Week 10: Bayesian Smoothing
Week 11: Going Beyond Naive Bayes
Clustering Email

Week 1: Text Classification with Naive Bayes

“A Comparison of Event Models for Naive Bayes Text Classification”, by Andrew McCallum and Kamal Nigam. In AAAI/ICML-98 Workshop on Learning for Text Categorization, pp. 41-48. Technical Report WS-98-05. AAAI Press. 1998. PDF.

(optional) “Naive Bayes Text Classification: A Statistical Natural Language Processing Project”, by Chris Monson. Chris_Monson.pdf

Week 2: Semi-Supervised Learning with Naive Bayes and Expectation Maximization

“Learning to Classify Text from Labeled and Unlabeled Documents”, by Kamal Nigam, Andrew McCallum, Sebastian Thrun and Tom Mitchell. PDF (8 pages)

(optional) “Text Classification from Labeled and Unlabeled Documents using EM”, by Kamal Nigam, Andrew McCallum, Sebastian Thrun and Tom Mitchell. Machine Learning, 39(2/3). pp. 103-134. 2000. PDF (34 pages)

Week 3: Text Classification with Maximum Entropy

“Using Maximum Entropy for Text Classification”, by Kamal Nigam, John Lafferty, Andrew McCallum. PDF (7 pages)

(optional) “A Maximum Entropy Approach to Natural Language Processing”, by Adam Berger, Vincent Della Pietra, Stephen Della Pietra. PDF (34 pages)

Week 4: Feature Selection

The mutual information and log-likelihood ratio sections in Manning & Schuetze: 5.1-5.4

(optional) “A comparative study on feature selection for text categorization”, by Yiming Yang and Jan Pedersen. PDF

Week 5: Feature Selection in the Learning Loop

Focus on Section 4, which covers feature selection in the learning loop: “A Maximum Entropy Approach to Natural Language Processing”, by Adam Berger, Vincent Della Pietra, Stephen Della Pietra. PDF

Week 6: Feature Selection as Word Clustering

“Distributional Clustering of Words for Text Classification”, by Douglas Baker and Andrew McCallum. PDF

(optional) An interesting read on a similar feature selection mechanism: ://www.phil.uni-passau.de/linguistik/mitarbeiter/schneider/pub/acl2004.html ://www.phil.uni-passau.de/linguistik/mitarbeiter/schneider/pub/acl2004.pdf

Week 7: Text Classification with Support Vector Machines

Work through as much of the SVM Tutorial by Nello Cristianini as you can. I don't expect you to get all the way through this. Presentation slides from ICML 2001 Tutorial: PDF

“Text Categorization with Support Vector Machines: Learning with Many Relevant Features”, by Thorsten Joachims. PDF

Moving on to text clustering …

Weeks 8 & 9: Clustering with Naive Bayes

“An Experimental Comparison of Several Clustering and Initialization Methods”, by Marina Meila and David Heckerman. Try to fight through the whole thing. PS

Week 10: Bayesian Smoothing

“Bayesian smoothing through text classification”, by Tom Griffiths. ://nlp.stanford.edu/courses/cs224n/2001/gruffydd/smoothing.html

Week 11: Going Beyond Naive Bayes

“Latent Dirichlet Allocation”, by D. Blei, A. Ng, and M. Jordan. This is dense. Read as much of this as you can. PDF

Blei's code is also available here: ://www.cs.princeton.edu/~blei/lda-c/

Extra reading:

Clustering Email

“Inferring Ongoing Activities of Workstation Users by Clustering Email”. PDF

Shorter version: PDF

“Automatic Discovery of Personal Topics To Organize Email”, by Arun C. Surendran, John C. Platt and Erin Renshaw. Conference on Email and Anti-Spam, 21-22 July 2005, Stanford University. PDF

“Restrictive Clustering and Metaclustering for Self-Organizing Document Collections”. ://doi.acm.org/10.1145/1008992.1009032