Status of normalization in Language ID as of 17 July 2008:

As far as I know, normalization is still there and working. It is configured through the rbldr_norm property in .lidconfig files (i.e. cmake knows about it), so it should be possible to set it on the command line using

cmake -DRBLDR_NORM_FORCE=

followed by the number indicating the type of normalization you want (Introduction to Language ID covers this). I've never really used it much, even though I know it improves performance. The regression testing infrastructure also supports tracking which type of normalization was used to generate a result. I also did a Java reimplementation of the normalization code in edu.byu.nlp.experimentation.results.AbstractIdentificationResult.normalize, but it is currently disabled because I wasn't sure it was doing the right thing. –Josh 18:28, 17 July 2008 (MDT)

Spoken Language ID

nlp-private/normalization.txt · Last modified: 2015/04/22 15:20 by ryancha