CS598R Winter 2006: Special Projects - Text Classification and Text Clustering

Introduction

Dr. Ringger, Mark Gulbrandsen (Msg26), Daniel Walker, and Scott Chun are studying text classification and text clustering using Natural Language Processing (NLP) techniques. The course will cover Naive Bayes, Expectation Maximization, Maximum Entropy, and several topics each in feature selection, Support Vector Machines, and clustering. The experimental part of the course is a search for a novel application of these concepts. The expected outcome is a paper submitted to a substantial conference in the NLP field.

Goals

Class Meetings

We meet the following days/times in the NLP South Lab.

Schedule

The file Schedule.xls (a Microsoft Excel file) outlines the schedule for this class.

Text and Readings

Deliverables

Data

Grading

The grade for this course is calculated based on four performance areas:

  1. Paper reading completion
  2. Meeting attendance and participation
  3. Coding sessions and assignments
  4. Paper authoring, including creative efforts toward developing a novel application of the course content

Dr. Ringger will give each participant individual feedback roughly once a month so that they can gauge their performance.