nlp-private:fec-development [2015/04/22 21:04] (current)
ryancha created
 +==Project Goals==
 +*Have the FEC be used by another team for feature engineering work.
 +*Built-in Scripting System: integrate dynamic scripting so that features can be defined at runtime, making feature definition quicker and more usable.
 +*Public Source Code Release?
 +
 +==Brainstorming==
 +These are some paths the project ''​could''​ take:
 +*Use [http://​jaxe.sourceforge.net/​ Jaxe] to create a rich .def.xml file editor?
 +
 +==Outstanding Issues==
 +*Generalization:​
 +**<s>Formalize the distinction between identification systems (multiple decisions possible; one classifier run per truth-hypothesis combination) and classification systems (one decision possible; one classifier run for a given truth and all possible hypotheses, making a decision between the hypotheses)</s> - DONE by means of [[Edu.byu.nlp.experimentation]].
 +*<​s>​Thread safety: why can't we run the pnp experiment within FEC?</​s>​ -- I believe the issue is resolved. I had added too many "​synchronized"​ blocks, causing deadlocks.
 +*<s>Accuracy of DET curves: why don't they look _exactly_ like the gnuplot-produced ones?</s> - DONE in [http://nlp.cs.byu.edu/trac/NIST/changeset/360 r360]. The points are all plotted correctly now. The only remaining issue is the tick marks, which, while accurate, are not the traditional 0.5, 1, 5, 10, 20, etc.
 +*Refactoring:​ dumb viewer that delegates to a smart controller
 +*<​s>​ArrayList supersedes Vector</​s>​ This has essentially been accomplished.
 +*Make documentation more thorough
 +*[[Per-template feature weight viewer]]
 +
 +==Notable SVN Commits==
 +Initial development of the [[Feature Engineering Console]] occurred within a branch in the [[Spoken Language ID]] SVN repository. Eventually the feature-eng-console branch was variously merged back into NIST/HEAD or moved into the newly-created [http://​nlp.cs.byu.edu/​subversion/​FEC FEC repository],​ which was cloned from the [http://​nlp.cs.byu.edu/​subversion/​NIST NIST repository] at r371. The history of the feature-eng-console branch is as follows:
 +*Branch created: [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​294 r294]
 +*Merged back into head: [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​376 r376], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​377 r377], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​378 r378], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​379 r379], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​380 r380], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​381 r381], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​382 r382], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​383 r383], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​384 r384], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​385 r385], [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​386 r386]
 +*Branch deleted: [http://​nlp.cs.byu.edu/​trac/​NIST/​changeset/​387 r387]
 +
 +==Robbie'​s Ideas==
 +Josh,
 +
 +I thought of some new features, some of them fairly necessary (who knows how
 +we overlooked them).
 +
 +1) We NEED to be able to see a list of files of misses and false alarms (and
 +perhaps hits). I propose that when you click on a language pair, the next
 +screen brings up a list of all files involving the pair as either the truth
 +or the hypothesis, separated by misses and false alarms and hits. There'​s
 +probably info next to the file, like its duration and possibly the scores
 +assigned by the two languages and their thresholds. There should probably
 +also be an audio button so that you can hear it. If you click on the file
 +itself, you are brought to another screen which shows the scores for each of
 +the languages for that file (NOT just the pair). You could also calculate the
 +entropy of this distribution and/or other metrics to help diagnose how
 +confused the model was. You will also need to include the threshold used by
 +each language (and possibly the "​single"​ threshold as well).
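
The entropy diagnostic in item 1 could be computed roughly as follows. This is a minimal sketch, not FEC code: the class name `ScoreEntropy` is hypothetical, and it assumes the per-language scores are non-negative so they can be normalized into a probability distribution (raw log-likelihoods would need to be converted first).

```java
/** Hypothetical helper for illustration; not part of the FEC code base. */
class ScoreEntropy {
    /**
     * Shannon entropy, in bits, of a set of non-negative per-language scores.
     * The scores are normalized to sum to 1 before the entropy is computed.
     */
    static double entropyBits(double[] scores) {
        double sum = 0.0;
        for (double s : scores) {
            if (s < 0.0) {
                throw new IllegalArgumentException("scores must be non-negative");
            }
            sum += s;
        }
        if (sum == 0.0) {
            throw new IllegalArgumentException("scores must not all be zero");
        }
        double entropy = 0.0;
        for (double s : scores) {
            double p = s / sum;
            if (p > 0.0) { // by convention, 0 * log(0) contributes nothing
                entropy -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return entropy;
    }
}
```

A file whose scores are spread evenly over all languages gets the maximum entropy (log2 of the number of languages), while a file with one dominant score gets an entropy near zero, so high values flag files on which the model was confused.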
 +
 +2) History. The "​1st"​ plot--the baseline vs. this current test--should show
 +all previous tests (although, the user should have the option of deleting
 +and/or not adding certain "​garbage"​ runs). This probably holds true of all
 +DET curves. Similarly, we should have a mechanism for tracking the history
 +of tabular data. This will be a lot harder to visualize, but at the very
 +least, you could "​scroll"​ through the history of tables. You could also plot
 +the data in one cell over time. This is particularly useful for the
 +"​overall"​ cost, eer, and min cost. It probably ought to be possible to add
 +or remove "​lines"​ at any time from the histories.
 +
 +Based on my previous message, it should be possible to keep this data
 +separately for the n-language tests and any given *chosen* 1 v 1 tests.
 +Suppose for instance, I decide based on the n-language test to work on
 +Mandarin v. English. I run the 1 v 1 test using the current features as the
 +baseline (may also be useful to include the original baseline). Each time I
 +add new features, the DET curve is added to this plot (tabular data
 +similarly saved). By the time I'm done, I'll have several plots on my 1 v 1
 +history. Then I'll use the feature set I ended on (actually, choose based on
 +results) and then re-run the n-language test. At that point I should have 3
 +lines for the n-language data: baseline, 1st test, and 2nd test based on 1 v
 +1 subtests. Suppose I continue the process with several other language
 +pairs. Each pair should have its own plot. If I ever come back to a
 +language pair, that data is appended to the accruing history.
 +
 +3) One option when running tests would be to make a single language choice.
 +This should not change any of the internals of the GUI, but sooner or later
 +we can add this functionality to the pipeline. Maybe one thing it would
 +affect in the GUI is that now another metric is possible: classification
 +accuracy with a classification confusion matrix. This is not possible with
 +our current type of model.
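
For the single-choice mode in item 3, classification accuracy and the confusion matrix could be tallied along these lines. This is a sketch under assumptions: the class `ConfusionMatrix` and the convention that languages are indexed 0..n-1 are illustrative, not taken from the FEC code.

```java
/** Hypothetical tally of single-choice decisions; not part of the FEC code base. */
class ConfusionMatrix {
    // counts[truth][hypothesis] = number of files with that truth labeled as that hypothesis
    private final int[][] counts;

    ConfusionMatrix(int numLanguages) {
        counts = new int[numLanguages][numLanguages];
    }

    /** Records one decision: the true language and the single language chosen. */
    void record(int truth, int hypothesis) {
        counts[truth][hypothesis]++;
    }

    int count(int truth, int hypothesis) {
        return counts[truth][hypothesis];
    }

    /** Fraction of decisions on the diagonal, i.e. where hypothesis equals truth. */
    double accuracy() {
        long correct = 0;
        long total = 0;
        for (int t = 0; t < counts.length; t++) {
            for (int h = 0; h < counts.length; h++) {
                total += counts[t][h];
                if (t == h) {
                    correct += counts[t][h];
                }
            }
        }
        if (total == 0) {
            throw new IllegalStateException("no decisions recorded");
        }
        return (double) correct / total;
    }
}
```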
 +
 +4) Another option would be to allow the user to choose the type of model;
 +this actually may be the same thing as 3. For instance, maybe we are using 1
 +v 1 MaxEnt models or 1 v rest SVMs (initial choices would be MaxEnt vs. SVMs
 +and 1 v 1, 1 v rest, or multiclass). As far as I can tell, this shouldn't
 +affect the GUI as explained in 3.
 +
 +5) The ability to choose a predefined dataset, use a randomized (probably
 +stratified) subset of an existing dataset (to minimize the amount of time
 +required to run), and the ability to create a new dataset (given the desired
 +size and percentage of training vs. evaluation data). This should probably
 +happen in the "Start new test" wizard, as should #3 and #4.
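
The randomized, stratified subset mentioned in item 5 might look like the sketch below. The names (`StratifiedSampler`, `filesByLabel`) are hypothetical, and the rule of keeping at least one file per non-empty language is an assumption made for the example, not a stated requirement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper for illustration; not part of the FEC code base. */
class StratifiedSampler {
    /**
     * Draws roughly the same fraction of files from every label so the subset
     * keeps the label proportions of the full dataset. At least one file is
     * kept for each non-empty label.
     */
    static <T> List<T> sample(Map<String, List<T>> filesByLabel, double fraction, long seed) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        Random rng = new Random(seed); // a fixed seed makes the subset reproducible
        List<T> subset = new ArrayList<>();
        for (List<T> group : filesByLabel.values()) {
            if (group.isEmpty()) {
                continue;
            }
            List<T> shuffled = new ArrayList<>(group); // leave the caller's list untouched
            Collections.shuffle(shuffled, rng);
            int keep = Math.max(1, (int) Math.round(fraction * shuffled.size()));
            subset.addAll(shuffled.subList(0, keep));
        }
        return subset;
    }
}
```

Passing a map with a stable iteration order (for example a `TreeMap`) together with a fixed seed yields the same subset on every run, which matters if results from different feature sets are to be compared on the same data.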
 +
 +6) We probably want to make the system as "​pluggable"​ as possible, at least
 +where metrics are concerned so that we can add new ones with very little
 +difficulty.
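
One way to make metrics pluggable, as item 6 suggests, is a small interface plus a registry. Everything here is a hypothetical sketch: the `Metric` interface, the registry, and the two example metrics (miss rate and false alarm rate at a fixed threshold) are assumptions, not the FEC design.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical plug-in point: one implementation per metric. */
interface Metric {
    String name();

    /** isTarget[i] says whether trial i is a true target; scores[i] is its score. */
    double compute(boolean[] isTarget, double[] scores, double threshold);
}

/** Fraction of target trials scored below the threshold. */
class MissRate implements Metric {
    public String name() { return "miss rate"; }

    public double compute(boolean[] isTarget, double[] scores, double threshold) {
        int targets = 0;
        int misses = 0;
        for (int i = 0; i < scores.length; i++) {
            if (isTarget[i]) {
                targets++;
                if (scores[i] < threshold) misses++;
            }
        }
        return targets == 0 ? Double.NaN : (double) misses / targets;
    }
}

/** Fraction of non-target trials scored at or above the threshold. */
class FalseAlarmRate implements Metric {
    public String name() { return "false alarm rate"; }

    public double compute(boolean[] isTarget, double[] scores, double threshold) {
        int nonTargets = 0;
        int falseAlarms = 0;
        for (int i = 0; i < scores.length; i++) {
            if (!isTarget[i]) {
                nonTargets++;
                if (scores[i] >= threshold) falseAlarms++;
            }
        }
        return nonTargets == 0 ? Double.NaN : (double) falseAlarms / nonTargets;
    }
}

/** Holds the registered metrics and evaluates all of them on one set of trials. */
class MetricRegistry {
    private final Map<String, Metric> metrics = new LinkedHashMap<>();

    void register(Metric metric) {
        metrics.put(metric.name(), metric);
    }

    Map<String, Double> evaluateAll(boolean[] isTarget, double[] scores, double threshold) {
        Map<String, Double> results = new LinkedHashMap<>();
        for (Metric metric : metrics.values()) {
            results.put(metric.name(), metric.compute(isTarget, scores, threshold));
        }
        return results;
    }
}
```

With this shape, adding a metric means writing one class and registering it; the viewer only iterates over the result map and needs no changes.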
 +
 +7) I think we should add provisions to the XML files to either (1) allow Java
 +to "insert" features (that aren't saved permanently, just in memory or
 +through temp files) or, preferably, (2) have an "include" system that allows
 +one feature file to be a complete superset of one or more other files by
 +including them. This saves a whole bunch of copy-and-paste.
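
The "include" option in item 7 amounts to expanding a file's includes before adding its own features. The sketch below leaves out XML parsing entirely and models each feature file as two lists; `FeatureFile`, `IncludeResolver`, and the file names used with them are hypothetical, and the actual .def.xml schema is not assumed.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory view of one feature file: what it includes and what it defines. */
class FeatureFile {
    final List<String> includes;
    final List<String> features;

    FeatureFile(List<String> includes, List<String> features) {
        this.includes = new ArrayList<>(includes);
        this.features = new ArrayList<>(features);
    }
}

/** Hypothetical resolver for illustration; not part of the FEC code base. */
class IncludeResolver {
    /** Returns the features of the named file, preceded by those of everything it includes. */
    static Set<String> resolve(String name, Map<String, FeatureFile> files) {
        Set<String> resolved = new LinkedHashSet<>(); // keeps order, drops duplicates
        resolve(name, files, new LinkedHashSet<>(), resolved);
        return resolved;
    }

    private static void resolve(String name, Map<String, FeatureFile> files,
                                Set<String> inProgress, Set<String> resolved) {
        FeatureFile file = files.get(name);
        if (file == null) {
            throw new IllegalArgumentException("unknown feature file: " + name);
        }
        if (!inProgress.add(name)) {
            throw new IllegalStateException("include cycle involving: " + name);
        }
        for (String included : file.includes) {
            resolve(included, files, inProgress, resolved);
        }
        resolved.addAll(file.features);
        inProgress.remove(name); // finished with this file; it may be included again elsewhere
    }
}
```

The cycle check matters because two files that include each other would otherwise recurse forever; a file reached by two different include paths is harmless, since its features are only added once.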
 +
 +8) With the addition of history, we will probably also need controls that
 +allow us to efficiently view it. For instance, I will probably want to see a
 +graph of time (iteration number, which is just the XML filename) vs. cost. A
 +table might be able to show relative increase/decrease in cost, eer, etc.
 +between two histories, and so on.
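
The relative increase/decrease table in item 8 reduces to simple arithmetic on stored metric values. A minimal sketch, assuming each history is just an array of costs ordered by iteration (the class `HistoryDelta` is hypothetical):

```java
/** Hypothetical helper for illustration; not part of the FEC code base. */
class HistoryDelta {
    /** Relative change from baseline to current; -0.25 means a 25% decrease. */
    static double relativeChange(double baseline, double current) {
        if (baseline == 0.0) {
            throw new IllegalArgumentException("baseline must be non-zero");
        }
        return (current - baseline) / baseline;
    }

    /** Relative change of each run compared with the run immediately before it. */
    static double[] successiveChanges(double[] costs) {
        double[] changes = new double[Math.max(0, costs.length - 1)];
        for (int i = 1; i < costs.length; i++) {
            changes[i - 1] = relativeChange(costs[i - 1], costs[i]);
        }
        return changes;
    }
}
```

The same function works for cost, EER, or min cost, since it only compares two numbers; a negative value is an improvement for all three.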
 +
 +9) Scriptable generation of graphs. This *might* be another program based on
 +the same Java objects. The idea is that if we want to publish something (or
 +students would like to write a report), they can easily dictate which XML
 +files should be included and what graphs/​metrics to "​export"​. This could
 +easily just be a selection box in a GUI.
 +
 +10) Graphs should be savable in vector graphics formats (i.e. "​exportable"​).
 +
 +11) Customizable graphing behavior (color vs. dashed, dotted lines; change
 +in scale, etc.)
 +
 +What do you think about these ideas?
 +
 +Robbie
 +[[Category:​Feature Engineering Console]]
  