(to be moved up or deleted soon)
Organization of codebase: separating edu.berkeley.nlp and edu.byu.nlp
Fix handling of held-out data
Projects:
Text Classification: get to know the tokenization pipeline and build a simple text classifier (e.g., k-NN)
Text Classification: Naive Bayes
Text Classification: MaxEnt or SVM
Clustering: k-Means
Clustering: EM
Clustering: LDA
Final Project: help fill in our clustering survey matrix
Final Presentation:
Implement a clustering algorithm from the literature in our framework
Evaluate on given datasets
Perform error analysis
Propose an improvement
Run experiments using the improved algorithm
Evaluate the improved algorithm
Datasets: