Table of Contents

To Do Now

Misc Details

Notes from jar extraction for Fall '07

Homework 0

Lab 1

Code:

I had an idea for lab 1 based on my experience after revising lab 3. If we modify the requirements of lab 1 appropriately, students will save a significant amount of time on labs 2 and 3 because they will be able to re-use their code directly.

Specifically, I think that the unigram model is fine as is. We should also require an un-interpolated bigram model (this will allow them to re-use that code in lab 3). In addition, they should be required to write an un-interpolated trigram model and then a fourth model that simply interpolates the previous three. That way we force them to keep separate models around for future labs (I don't believe ANYONE did this). We would recommend, but not require, implementing a model that works for any order n; that would let them fulfill the bigram and trigram requirements much more quickly than writing them separately, and certainly with less code.
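Roughly what I have in mind for the fourth model, as a minimal sketch (the interface and class names here are made up, not the actual lab API):

    import java.util.List;

    // Illustrative interface only; the real lab classes are named differently.
    interface LanguageModel {
        double getProbability(List<String> history, String word);
    }

    // The "fourth model": a weighted mixture of the three required models.
    class InterpolatedLanguageModel implements LanguageModel {
        private final LanguageModel unigram, bigram, trigram;
        private final double l1, l2, l3; // mixture weights; should sum to 1

        InterpolatedLanguageModel(LanguageModel unigram, LanguageModel bigram,
                                  LanguageModel trigram,
                                  double l1, double l2, double l3) {
            this.unigram = unigram;
            this.bigram = bigram;
            this.trigram = trigram;
            this.l1 = l1;
            this.l2 = l2;
            this.l3 = l3;
        }

        public double getProbability(List<String> history, String word) {
            return l1 * unigram.getProbability(history, word)
                 + l2 * bigram.getProbability(history, word)
                 + l3 * trigram.getProbability(history, word);
        }
    }

The point is that the unigram, bigram, and trigram models stay around as separate objects that later labs can reuse directly.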

Another note: because I recommended that they concatenate strings together to form the history, students' code was not directly pluggable for Lab 2. They had to convert all of the characters to strings or build new LMs from scratch. Two solutions: (1) require that their LM work with any data type (after all, this is the purpose of generics), or (2) change the reader, or add an optional reader, that produces strings (each string being one character) instead of characters.
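A sketch of option (1), with a hypothetical interface name; the only point is that the token type becomes a type parameter, so the same model code handles word tokens in lab 1 and character tokens in Lab 2:

    import java.util.List;

    // Illustrative only. The history is a list of tokens of arbitrary type T
    // rather than a concatenated String, so T = String works for word data and
    // T = Character works for character data without touching the model code.
    interface LanguageModel<T> {
        double getProbability(List<T> history, T word);
    }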

Finally, we should encourage (not require) them to separate their learner from their model so that in future labs they need only write new learners (oftentimes, even that is unnecessary).
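A sketch of the learner/model split I mean (again, made-up names): the learner consumes the training corpus and hands back a finished model, so later labs only have to swap in a new learner.

    import java.util.List;

    // Same illustrative interface as in the sketch above.
    interface LanguageModel<T> {
        double getProbability(List<T> history, T word);
    }

    // Illustrative learner interface: training lives here, not in the model.
    interface LanguageModelLearner<T> {
        LanguageModel<T> train(List<List<T>> corpus); // corpus = list of sentences
    }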

Write up coherent, self-contained notes on Kneser-Ney to aid students in implementation; rely on Goodman's "A Bit of Progress in Language Modeling" paper.
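As a possible seed for those notes, the standard interpolated Kneser-Ney bigram estimate (as in Chen & Goodman and the "bit of progress" write-up) is, in LaTeX:

    P_{KN}(w_i \mid w_{i-1})
      = \frac{\max\bigl(c(w_{i-1} w_i) - D,\, 0\bigr)}{c(w_{i-1})}
      + \lambda(w_{i-1})\, P_{\mathrm{cont}}(w_i)

    \lambda(w_{i-1}) = \frac{D}{c(w_{i-1})}\,\bigl|\{\, w : c(w_{i-1} w) > 0 \,\}\bigr|
    \qquad
    P_{\mathrm{cont}}(w_i) = \frac{\bigl|\{\, w' : c(w' w_i) > 0 \,\}\bigr|}
                                  {\bigl|\{\, (w', w) : c(w' w) > 0 \,\}\bigr|}

where c(\cdot) is a training count and D is the absolute-discounting constant.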

Lab 2

Remember how I was talking about how slick lab 2 was for maximum entropy? Well, there is a downside. The way I had the students do things means that there is no way to grade their models.

Here’s the thing. Feature extraction happens in the reader. Therefore, for their report, the test set they used already had all of the features extracted and they didn’t have to add any extra code.

In general, this is desirable: why re-write code for feature extraction in every model?

The downside: the serialized model expects data to be given in its already-extracted form. Of course, the auto-grader has no idea how to run their feature extractor (it wasn't part of the serialized model, although it must exist somewhere in the .jar).
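For reference, making the extractor part of the serialized model would look roughly like this (a sketch with invented names, not the actual framework classes):

    import java.io.Serializable;
    import java.util.List;
    import java.util.Map;

    // The extractor is Serializable, so it gets written out with the model.
    interface FeatureExtractor<I> extends Serializable {
        List<String> extract(I instance);
    }

    class MaxentModel<I> implements Serializable {
        private final FeatureExtractor<I> extractor;
        private final Map<String, double[]> weights; // feature -> per-label weights
        private final int numLabels;

        MaxentModel(FeatureExtractor<I> extractor,
                    Map<String, double[]> weights, int numLabels) {
            this.extractor = extractor;
            this.weights = weights;
            this.numLabels = numLabels;
        }

        // The auto-grader can hand this raw instances; extraction happens inside.
        public int classify(I rawInstance) {
            double[] scores = new double[numLabels];
            for (String f : extractor.extract(rawInstance)) {
                double[] w = weights.get(f);
                if (w == null) continue; // feature unseen in training
                for (int y = 0; y < numLabels; y++) scores[y] += w[y];
            }
            int best = 0;
            for (int y = 1; y < numLabels; y++) if (scores[y] > scores[best]) best = y;
            return best;
        }
    }

With something like this serialized, the grader never needs to see pre-extracted data.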

Bottom line: we don’t have any (reliable) scores for any maxent model.

Short-term options: (1) require everyone to add feature extractors in the appropriate manner, then re-serialize and re-upload; (2) just do the Hall of Fame based on the dev test set; (3) have people report their own results on the dev test.

I'm leaning towards (2); I don't want to add to the students' burden.

Long-term solution: not quite sure. Maybe we'll have to run evaluations based on submitted XML files now. That causes its own problems: paths, for instance, and also how to insert the information related to the blind data set.

Possible solution: if their XML file is also submitted (in a predictable location; perhaps it can be one of the upload boxes), then it may be possible to parse the reader information, and nothing else, from their XML file. The reader information shouldn't (normally) contain paths or anything like that (whereas the dataset entries will). Note that in lab 3, the binarizers are attached to the datasets; this is correct, because the blind data set should NOT be binarized (there is nothing to binarize).
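If we do go the XML route, pulling just the reader configuration out of a submitted experiment file is straightforward with the standard JAXP DOM API; a rough sketch (the "reader" tag name is my guess at the experiment schema, not the actual element):

    import java.io.File;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.NodeList;

    public class ReaderConfigExtractor {
        // Returns the first <reader> element from the experiment XML, or null if
        // there isn't one. Dataset elements (which may carry paths) are ignored.
        public static Element findReaderElement(File experimentXml) throws Exception {
            Document doc = DocumentBuilderFactory.newInstance()
                                                 .newDocumentBuilder()
                                                 .parse(experimentXml);
            NodeList readers = doc.getElementsByTagName("reader");
            return readers.getLength() > 0 ? (Element) readers.item(0) : null;
        }
    }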

Project 2.2

Project 3.1 NEW

Lab 3

Lab 4

Final Projects

Bruce's Email

I spent a lot of time trying to figure out where to start. I was pretty unclear on which classes I needed to write and how they fit in with the rest of the system. Graphical depictions of entity relationships and data flow would have been helpful. Navigating the sea of undocumented generic parameters and fuzzy class relationships in the framework code took several extra hours. Having documentation of even just the semantics of the generic parameters for the classes I used would have saved me several hours.

I also spent several hours (6 or 7, maybe?) trying to understand the graphical model and mathematical foundations of the project, because being unclear on the math bit me on the last project. I finally just decided to talk to Robbie and take the “get stuff done” approach more than the “understand what's going on” approach. I have a superficial idea of what is happening, but there's still a lot of magic that's happening in the supplied code that I don't totally understand.

A small thing that would help is to supply prewritten code to automate the running of experiments. I spent an hour or two writing a Perl script to run an experiment, time it, and copy all of the relevant details (POS tagger file, experiment XML file, output, etc.) to a directory so that I could analyze it later.

Providing access to fast computing resources would be helpful for this project. I was lucky that I had access to a fast box with lots of memory at work, so each feature engineering run took about 1.5 hours (training + decoding w/ Viterbi), but I would imagine that some people could do only one feature engineering iteration per day.

It didn't help that I started late because I was putting out fires in my other classes.

Once I finally got going and understood things, I really enjoyed the project. It was fun to do feature engineering and see the accuracy increase with each iteration as the features that I added eliminated errors.