Projects and other links:
*[[Spoken Language ID]]
*[[CMake]]
*[[DETware]]
*[[Feature Engineering Console]]
*[[Target Platforms]]
*[[Requirements]]
*[[Subversion]]
*[[NLPfest 2010 Notes]]

==Completed Tasks==
*[[CMake]]
** Get cmake to take care of compiling Language-ID? No, cmake's support for java building is not developed enough. We'll stick with [[ant]] for now.
** Wait for stat-nlp and sphinx to be depended on as jars so we don't have to build those sources? We're already using the statnlp jar. Don't wait for the sphinx one, just let that be taken care of in HEAD.
** Merge CMake into head - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/292/ r292]
*** Use -norm 1 in RBLDROPTS (verify that there aren't any other options -- check regression test) - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/277/ r277]. The correct parameter is --norm=1
**** Double check change of 'norm' to integer in [[resultbuilder.pl]] - DONE, it seems okay
*** Remove built-in ant - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/280 r280]
*** Fix regression (norm?) Same issue as "Use -norm 1 in RBLDROPTS" - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/277/ r277]
*** Verify that Makefile-original will work as expected - make detcurve, make metric, regressiontest.sh, etc. - DONE. I talked with Dr. Ringger and we decided that this is not a critical issue. The previous makefile will no longer work once cmake is checked in to head.
*** Have CMake keep experiments directory where it was previously. This should be implemented when we finally check in. - DONE, with merge to head in [http://nlp.cs.byu.edu/trac/NIST/changeset/292/ r292]
*** CODE REVIEW! w/ Dr. Ringger - DONE, with merge to head in [http://nlp.cs.byu.edu/trac/NIST/changeset/292/ r292]. Robbie actually did the review.
* [[Feature Engineering Console]]
** Review ruby code completely - OBSOLETED by port to Java
** plot (detware) : ntrue = 15, nfalse = 99, npts = 113 - DONE - rather than gathering these variables from console output, we in essence store the same information in the [http://nlp.cs.byu.edu/trac/NIST/browser/branches/feature-eng-console/Language-ID/src/edu/byu/langid/app/datastructures/results/Result.java Result class]
** tab for each experiment result
** Cost matrix: see board - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/324 r324]
** GUI mockup - full-ish functionality showing the flow of information, the overall process - this sort of morphed into full development mode; the design review was done, however, at the SpokenLID meeting on 20 September 2007.
** Create a Normal Deviate Form axis for JFreeChart - DONE using the DetcurveDataset structure instead in [http://nlp.cs.byu.edu/trac/NIST/changeset/341 r341]
** Feature weight viewer: maxent: p(y|x) = exp(SUM_i(lambda_i * f_i(x,y))) / Z(x) - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/343 r343]
** Check whether Plot class is really necessary / whether an array is needed - OBSOLETED by port to Java?
** ?? Java GUI (is this a blocker?) - DONE on creation of feature-eng-console branch in [http://nlp.cs.byu.edu/trac/NIST/changeset/294 r294]
** Generate unique models for each language pair.
** Also make outcome-* filenames specific to the languages involved - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/278 r278]
** Ensure that different colors are used for the points corresponding to the two different experiments on a chart; otherwise, there is potential confusion. - DONE. This task was written with the old, console-based FEC in mind. Now that we're using JFreeChart rather than gnuplot, there's no problem.
** View Features present in a file (be able to restrict based on what the model cares about) - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/343 r343]
** Play wav files - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/343 r343]
** ??? == duration (seconds) - DONE
** View Feature Weights (what the model cares about) - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/333 r333]
** Per-language det curves - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/270 r270]
** Support for superposed graphs to enable a feature engineer to see progress at a glance - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/270 r270]
** Before/after language-pair DET curve matrix ([[gnuplot]]?): Aggregate curve on top with upper-triangular-style diagram showing language pairs below - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/270 r270]
** Follow up with Bruce to see if he got anything, had any thoughts/plans - DONE, I'm not sure if this is relevant anymore
** Generate thumbnails - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/272 r272]. Depends on [[ImageMagick]]'s '''convert''' command.
** Verify correctness: per-language models are really per-language; one-vs-rest is really one-vs-rest.
*** Change filenames to be more descriptive: en-ge.model or en-rest.model - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/278 r278]. The model names are ${RESULT_NAME}_${language}.model
*** Restrict training and test sets at frontend of pipeline so we only deal with langs of interest. In other words, generate outcome* files on the restricted set of languages before passing to [[resultbuilder.pl]]
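
For reference, the maxent formula noted under the feature weight viewer task can be sketched in Java as below. This is only an illustration of the formula, not the code from r343: the class name MaxentSketch, the method conditional, and the array layout (one row of feature values per candidate label) are all hypothetical. Subtracting the largest score before exponentiating is a common numerical safeguard and does not change the resulting probabilities.

```java
import java.util.Arrays;

// Hypothetical sketch: none of these names come from the Language-ID code base.
public class MaxentSketch {

    /**
     * Computes p(y|x) = exp(sum_i lambda_i * f_i(x,y)) / Z(x) for every label y.
     *
     * @param lambda   feature weights lambda_i
     * @param features features[y][i] = f_i(x, y) for one fixed input x
     * @return array of probabilities, one per label, summing to 1
     */
    public static double[] conditional(double[] lambda, double[][] features) {
        double[] scores = new double[features.length];
        double max = Double.NEGATIVE_INFINITY;

        // Weighted feature sum for each candidate label.
        for (int y = 0; y < features.length; y++) {
            double s = 0.0;
            for (int i = 0; i < lambda.length; i++) {
                s += lambda[i] * features[y][i];
            }
            scores[y] = s;
            max = Math.max(max, s);
        }

        // Exponentiate (shifted by max for stability) and accumulate Z(x).
        double z = 0.0;
        for (int y = 0; y < scores.length; y++) {
            scores[y] = Math.exp(scores[y] - max);
            z += scores[y];
        }

        // Normalize so the probabilities sum to 1.
        for (int y = 0; y < scores.length; y++) {
            scores[y] /= z;
        }
        return scores;
    }

    public static void main(String[] args) {
        // Two labels, two indicator-style features (made-up values).
        double[] lambda = {1.0, 0.0};
        double[][] features = {{1.0, 0.0}, {0.0, 1.0}};
        System.out.println(Arrays.toString(conditional(lambda, features)));
    }
}
```

A weight viewer built on this idea would display the lambda values directly; a large positive lambda_i means that feature i pushes probability toward the labels it fires for.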