nlp-private:documentation [2015/04/22 21:16] (current)
ryancha created
__NOTOC__
Some documentation for the Language ID codebase!
==Computation of Metrics==
Where and how is each metric calculated?

===Operating Point===
Theta for the operating point comes from running [[thetasweep.pl]] on OUTCOMESTRAIN in the makefile.
Based on theta and the outcomes score for each model on each file, [[resultbuilder.pl]] makes hard decisions. These are placed in NIST_RESULT_FILE.
The [[blddetcurve.sh]] script invokes [[splittargets.pl]] to split targets from non-targets.
Given that split, hd_dcf from [[detware]] produces the coordinates of the operating point in [[hd.points]]. The point represented by [[hd.points]] is plotted as a star.
The value of the DCF computed at the operating point can be found in [[metric.txt]] if you build the 'metric' target.
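As a rough illustration of the steps above, the following Python sketch thresholds scores at theta and evaluates the detection cost at the resulting operating point. It is not the code of [[resultbuilder.pl]] or hd_dcf: the <code>score >= theta</code> convention, the function names, and the cost parameters (C_miss = C_fa = 1, P_target = 0.5, values commonly used in NIST language recognition evaluations) are assumptions made for the example, and the scores are made up.

```python
def hard_decisions(scores, theta):
    """Accept a trial when its score reaches theta (assumed convention)."""
    return [s >= theta for s in scores]


def operating_point(target_scores, nontarget_scores, theta):
    """Return (P_fa, P_miss) for hard decisions made at threshold theta."""
    misses = hard_decisions(target_scores, theta).count(False)
    false_alarms = hard_decisions(nontarget_scores, theta).count(True)
    return false_alarms / len(nontarget_scores), misses / len(target_scores)


def dcf(p_fa, p_miss, c_miss=1.0, c_fa=1.0, p_target=0.5):
    """Detection cost: C_miss*P_miss*P_target + C_fa*P_fa*(1 - P_target)."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)


# Made-up scores, already split into target and non-target trials.
target = [2.1, 0.4, 1.7, -0.3]
nontarget = [-1.2, 0.9, -0.5, -2.0]
p_fa, p_miss = operating_point(target, nontarget, theta=0.5)
cost = dcf(p_fa, p_miss)
print(p_fa, p_miss, cost)
```

The pair (P_fa, P_miss) plays the role of the single point stored in [[hd.points]], and the cost plays the role of the value in [[metric.txt]].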

===Decision Cost Function at its Minimum===
[[DET_nist.points]] is created as a side effect of running 'plot' from [[detware]].
The values in the second and third columns of the first row of [[DET_nist.points]] are placed in [[cdet.points]]. These are the coordinates of the point on the DET curve with minimum DCF.
The point represented by [[cdet.points]] is plotted as a circle.

===Equal Error Rate and DCF @ EER===
[[DET_nist.points]] is created as a side effect of running 'plot' from [[detware]].
The value in the second column of the second row of [[DET_nist.points]] is extracted and placed in [[eer.points]].
The value in the first column of the second row of [[DET_nist.points]] is extracted and placed in [[eer.txt]]. This is the value of the DCF at the EER point.
The point represented by [[eer.points]] is plotted as a box.
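The extraction of values from the first two rows of [[DET_nist.points]] can be sketched in Python as follows. Only the row and column positions come from the descriptions on this page; the whitespace-delimited layout, the function name, and the sample numbers are assumptions for illustration.

```python
def parse_det_summary(text):
    """Pull the min-DCF point and the EER values out of DET_nist.points text.

    Assumes whitespace-separated columns. Rows and columns are 1-based
    in the prose and 0-based here.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    first, second = rows[0], rows[1]
    return {
        # Row 1, columns 2 and 3 -> cdet.points (point with min DCF).
        "cdet_point": (float(first[1]), float(first[2])),
        # Row 2, column 2 -> eer.points.
        "eer_point": float(second[1]),
        # Row 2, column 1 -> eer.txt (DCF at the EER point).
        "dcf_at_eer": float(second[0]),
    }


# Made-up file contents, for illustration only.
sample = "0.0812 0.0650 0.0974\n0.0855 0.0855\n"
summary = parse_det_summary(sample)
print(summary)
```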

===Average Per-Language DCF (aka Average Cost)===
The average cost is calculated by running [[average.pl]] on the output of [[printfirstcolumn.pl]], which is run on the [[DET_nist.points]] files in all language directories at once. The result is stored in [[avgcost.txt]].

===Average Per-Language DCF @ EER (aka Average EER)===
The average EER is calculated by running [[average.pl]] on the [[eer.txt]] files in all language directories at once. The result is stored in [[avgeer.txt]].
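The averaging step amounts to a mean over one number per language directory. A minimal Python sketch, assuming each language has its own subdirectory holding a file whose first token is the value of interest; the directory layout and the function name are hypothetical, and this is not the code of [[average.pl]]:

```python
from pathlib import Path
from statistics import mean


def average_metric(root, filename="eer.txt"):
    """Average the first number found in `filename` under each subdirectory."""
    values = []
    for path in sorted(Path(root).glob(f"*/{filename}")):
        tokens = path.read_text().split()
        if tokens:  # skip empty files
            values.append(float(tokens[0]))
    if not values:
        raise ValueError(f"no {filename} files with values under {root}")
    return mean(values)
```

Writing the returned value to a file would give a counterpart of [[avgeer.txt]]. The average cost follows the same pattern with a different input file.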

== Scripts ==
The system uses the following scripts:
*[[DETware]]
*[[seg2xml3.pl]]
*[[resultbuilder.pl]]
*[[splittargets.pl]]
*[[geteer1.pl]]
*[[blddetcurve.sh]] or [[blddetcurve.rb]]
*[[gnu_det.sh]] or [[generate_gnuplot_script.rb]]
*[[thetasweep.pl]]
*[[thetasweep2.pl]]
*[[regressiontest.sh]]
*[[lid-console.rb]]

[[Category:Spoken Language ID]]
  
nlp-private/documentation.txt · Last modified: 2015/04/22 21:16 by ryancha