I'm working on the spoken language ID project. I took CS 401R (statistical natural language processing) in fall 2006 and started working in the lab in January 2007.

My Research Log

Current Tasks

  • Spring Research Conference: Prepare a presentation for the CPMS Spring Research conference, to be held 17 March 2007. I'll give a dry run in our lab meeting on 9 March 2007.
  • Trac: Get Trac working on Entropy so that we can easily browse our codebase.
  • Commit emails: Set up automatic commit notification emails on Entropy.
  • Multiple DET curves: Make it possible to view multiple DET curves on the same plot for easy comparison.
  • Polynomial regression: Add polynomial regression for use in feature generation. We currently do linear regression on pitch data, and experiments that use the slope of the fitted line get better results than ones that simply examine the difference between the first and last pitch data points in a segment.
  • Pitch vs. F0 with regression: Run experiments to compare performance of F0 and pitch when using regression curves. We currently get better results with pitch using quantized regression than with F0 comparing first and last data points. However, comparing first and last pitches gives worse results than comparing first and last F0 points, so it's possible that our gains are due to the regression line rather than the pitch data.
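
A minimal sketch of the regression features discussed in the two tasks above, not the project's actual code. It assumes a segment is just a list of pitch (or F0) values at evenly spaced frames and uses the frame index as the x value. The names `polyfit` and `segment_features` and the sample values are hypothetical, and the quantization step is left out. It computes the first-to-last difference, the slope of a fitted line, and the coefficients of a fitted polynomial, so the three kinds of feature can be compared on the same segment.

```python
from typing import Dict, List, Sequence


def polyfit(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least-squares polynomial fit. Returns coefficients, lowest order first."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y, where V is the Vandermonde matrix.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def segment_features(values: Sequence[float], degree: int = 2) -> Dict[str, object]:
    """Features for one segment of pitch or F0 values (hypothetical layout)."""
    xs = list(range(len(values)))  # assumption: evenly spaced frames
    line = polyfit(xs, values, 1)
    poly = polyfit(xs, values, degree)
    return {
        "endpoint_diff": values[-1] - values[0],  # first vs. last point only
        "slope": line[1],                         # slope of the fitted line
        "poly_coeffs": poly,                      # lowest order first
    }


if __name__ == "__main__":
    # Made-up segment whose last frame is an outlier.
    segment = [100.0, 102.0, 104.0, 106.0, 90.0]
    print(segment_features(segment))
```

The endpoint difference depends on only two frames, so a single bad frame at either end of the segment changes it completely, while the fitted slope uses every frame. Running both on F0 and on pitch is one way to separate the effect of the regression from the effect of the data source.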

Completed Tasks

  • Pitch vs. F0: Run experiments using pitch features instead of F0 formant features from Praat. Add the ability to extract pitch data and then run several experiments using it. (CS 401R final project)
