Our focus is currently on Bayesian approaches to NLP. Here's a partial bibliography:

=Bayesian approaches to NLP=

===Bibliographies===

Beal: [http://www.cs.toronto.edu/~beal/npbayes/papers.html]

NIPS 2005: [http://aluminum.cse.buffalo.edu:8080/npbayes/nipsws05/resources]

Griffiths's reading list: [http://cog.brown.edu/~gruffydd/bayes.html]

===Seminal===

T.S. Ferguson. A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1:209-230, 1973. [http://www.jstor.org/view/00905364/di983860/98p00275/0]

C.E. Antoniak. Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Annals of Statistics, 2:1152-1174, 1974. [http://www.jstor.org/view/00905364/di983870/98p0275d/0]

===Foundational===

M.D. Escobar and M. West. Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association, 90:577-588, 1995. [http://www.jstor.org/view/01621459/di986004/98p0224o/0]

S.N. MacEachern and P. Müller. Estimating mixture of Dirichlet process models. Journal of Computational and Graphical Statistics, 7:223-238, 1998. [http://links.jstor.org/sici?sici=1061-8600%28199806%297%3A2%3C223%3AEMODPM%3E2.0.CO%3B2-9]

R.M. Neal. Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9:249-265, 2000. [http://www.cs.utoronto.ca/~radford/ftp/mixmc.pdf]

C.E. Rasmussen. The infinite Gaussian mixture model. NIPS, 2000. [http://www.kyb.mpg.de/publications/pdfs/pdf2299.pdf]

H. Ishwaran and L. James. Gibbs sampling methods for stick-breaking priors. Journal of the American Statistical Association, 96:161-173, 2001. [http://www.bio.ri.ccf.org/Resume/Pages/Ishwaran/stickBreaking.pdf]

===Graphical Models===

D. McAllester, M. Collins, and F. Pereira. Case-factor diagrams for structured probabilistic modeling. ??

===NLP, Clustering===

D.M. Blei, T.L. Griffiths, M.I. Jordan, and J.B. Tenenbaum. Hierarchical topic models and the nested Chinese restaurant process. NIPS, 2004. [http://cog.brown.edu/~gruffydd/papers/ncrp.pdf]

Y.W. Teh, M.I. Jordan, M.J. Beal, and D.M. Blei. Hierarchical Dirichlet processes. NIPS, 2004. [http://www.cs.toronto.edu/~ywteh/research/npbayes/nips2004a.pdf]

Y.W. Teh, M.I. Jordan, M.J. Beal, and D.M. Blei. Hierarchical Dirichlet processes. Technical report, last updated 8 October 2004. [http://www.cs.toronto.edu/~ywteh/research/npbayes/report.pdf]

T. Griffiths, M. Steyvers, D. Blei, and J. Tenenbaum. Integrating topics and syntax. In press, Advances in Neural Information Processing Systems (NIPS) 17, 2004. [http://www.cs.berkeley.edu/~blei/papers/syntax-semantics.pdf]

D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, January 2003. [http://www.cs.berkeley.edu/~blei/papers/blei03a.pdf]

R. Madsen, D. Kauchak, and C. Elkan. Modeling word burstiness using the Dirichlet distribution. ICML, 2005.

A. McCallum, A. Corrada-Emmanuel, and X. Wang. Topic and role discovery in social networks. ??

X. Wang, N. Mohanty, and A. McCallum. Group and topic discovery from relations and text. LinkKDD, 2005.

A. McCallum, A. Corrada-Emmanuel, and X. Wang. The Author-Recipient-Topic model for topic and role discovery in social networks: Experiments with Enron and academic email.

===NLP, Language Modeling===

S. Goldwater, T. Griffiths, and M. Johnson. Interpolating between types and tokens by estimating power-law generators. NIPS, 2005.

D. MacKay and L. Bauman Peto. A hierarchical Dirichlet language model. Natural Language Engineering, 1(1), 1994.

Y.W. Teh. A Bayesian interpretation of Kneser-Ney smoothing (?). NIPS 2005 Workshop on Bayesian NLP. Draft available.

===Software===

Nonparametric Bayesian inference software, Yee Whye Teh: [http://www.cs.berkeley.edu/~ywteh/]