The '''Spoken Language ID''' project seeks to....
==Introductory Tasks==
===Running An Experiment in Ten Shell Commands Or Less===
# Ensure that your system meets the current [[Requirements | system requirements]]. On an Ubuntu Linux system, the required packages can be installed with a command such as:
#: <code>sudo apt-get install sun-java6-jdk ant perl ruby gnuplot gcc cmake make pdl subversion</code>
# Check out the current Language ID repository:
#: <code>svn co http://nlp.cs.byu.edu/subversion/NIST/HEAD</code>
# Enter the <code>HEAD</code> directory created by the previous command:
#: <code>cd HEAD</code>
# Make sure that the template from which the main configuration file will be generated reflects your setup by editing <code>Language-ID/config/language-id.conf.cmake</code> in your favorite text editor, for example:
#: <code>vim Language-ID/config/language-id.conf.cmake</code>
#: Pay particular attention to <code>PRAAT_EXE</code>, <code>PERL_EXE</code>, <code>WAV_DATA_ORIGINAL_LOCATION</code>, and <code>SEG_DATA_ORIGINAL_LOCATION</code> (or their <code>*_WIN</code> counterparts if running on Windows) and check that they contain the correct paths. (<code>LABTOOLS</code> seems to be unused at the moment.)
# Generate the build system with default settings using cmake:
#: <code>cmake .</code>
#: This also generates the configuration file <code>Language-ID/config/language-id.conf</code> based on the settings you provided in <code>Language-ID/config/language-id.conf.cmake</code>.
# Build DETware:
#: <code>make</code>
# Build Language-ID:
#: <code>cd Language-ID</code>
#: <code>ant</code>
# Run the experiment:
#: <code>cd ..</code>
#: <code>make detcurve</code>
#: The <code>detcurve</code> target will create the <code>experiments</code> directory, in which all experiment data will be stored. It will then copy seg files and wav files into <code>experiments/data</code> and begin extracting features from these data files by running the [[seg2xml3.pl]] script.
#: Once these preliminary data have been copied and analyzed, a directory specific to the current experiment will be created. As the default experiment is <code>fourgramall</code>, a directory called <code>experiments/fourgramall</code> will be created to house results specific to that experiment. ling files will be generated and placed in <code>experiments/fourgramall/data</code>. Language models will be trained and placed in <code>experiments/fourgramall/models</code>. Other results, including metrics and plot data, will be placed in <code>experiments/fourgramall/results</code>.
# Explore the DET curves, <code>avgcost.txt</code>, <code>avgeer.txt</code>, etc. in the <code>results</code> directory.
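Before working through the steps above, it can help to confirm that the tools installed in step 1 are actually on the <code>PATH</code>. The following is a minimal sketch and is not part of the repository: the function name <code>check_tools</code> is hypothetical, and the executable names (<code>java</code>, <code>svn</code>, and so on) are assumed to be the ones provided by the packages listed in step 1.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Language ID repository).
# Prints one "missing: NAME" line per tool that is not on the PATH
# and returns non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: these executables correspond to the packages from step 1.
if check_tools java ant perl ruby gnuplot gcc cmake make svn; then
    echo "all tools found"
fi
```

Any tool reported as missing should be installed before running <code>cmake</code>; the check does not verify versions or the paths configured in <code>language-id.conf.cmake</code>.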
===Running a Specific Experiment===
The above example only runs the default 'fourgramall' experiment. Running a specific experiment requires almost exactly the same process. For example, to run the 'fivegram' experiment, substitute the following cmake command:
:cmake -D FEATURE_SET_NAME_FORCE=fivegram .
Then proceed to build the <code>detcurve</code> target as before:
:make detcurve
This time, results will be stored in <code>experiments/fivegram</code> rather than in <code>experiments/fourgramall</code>.
Other parameters can be set, such as the normalization option for [[resultbuilder.pl]], and the result name that determines where plot output is stored:
:cmake -D FEATURE_SET_NAME_FORCE=fivegram -D RBLDR_NORM_FORCE=1 -D RESULT_NAME_FORCE=nist_norm1 .
This prepares the system for running <code>fivegram</code> with a normalization of 1. It also sets the result name to <code>nist_norm1</code> to differentiate this run from the previous one. We can now run the experiment:
:make detcurve
The parameters set in this most recent run of cmake are exactly the parameters used by the regression test (see Regression Tests below).
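The cmake invocations in this section differ only in their <code>-D</code> options, so they can be assembled by a small helper. The following is a sketch under stated assumptions: the function name <code>configure_cmd</code> is hypothetical, and it uses only the three options shown above. It prints the command line rather than running it, so the result can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Language ID repository).
# Usage: configure_cmd FEATURE_SET [NORM] [RESULT_NAME]
# Prints the cmake command line for the given experiment settings.
configure_cmd() {
    cmd="cmake -D FEATURE_SET_NAME_FORCE=$1"
    # The normalization option is added only when a value is given.
    if [ -n "${2:-}" ]; then
        cmd="$cmd -D RBLDR_NORM_FORCE=$2"
    fi
    # The result name is added only when a value is given.
    if [ -n "${3:-}" ]; then
        cmd="$cmd -D RESULT_NAME_FORCE=$3"
    fi
    echo "$cmd ."
}

configure_cmd fivegram
configure_cmd fivegram 1 nist_norm1
```

The two calls print the same command lines used earlier in this section. To actually configure and run an experiment, the printed command would be executed from the <code>HEAD</code> directory, followed by <code>make detcurve</code>.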
===Running All Experiments===
If you wish to build all defined experiments, simply run <code>Language-ID/scripts/[[runall.rb]]</code>.
===Regression Tests===
Regression tests have been created to help verify the integrity of any changes we make to the system. The regression tests can be run by invoking <code>Language-ID/scripts/[[run_all_tests.rb]]</code> from within the <code>HEAD</code> directory. This compares all currently built experiments against the baseline data stored in <code>Language-ID/regression</code> to check that the output has not changed significantly.
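Because the regression script is meant to be invoked from the <code>HEAD</code> directory, a small guard can catch the common mistake of running it from elsewhere. This is only a sketch: the function name <code>looks_like_head</code> is hypothetical, and running the script through the <code>ruby</code> interpreter is an assumption rather than something this page specifies.

```shell
#!/bin/sh
# Hypothetical guard (not part of the Language ID repository).
# Succeeds only if DIR contains the regression script and baseline directory.
looks_like_head() {
    [ -f "$1/Language-ID/scripts/run_all_tests.rb" ] &&
        [ -d "$1/Language-ID/regression" ]
}

if looks_like_head .; then
    # Assumption: the script is started via the ruby interpreter.
    ruby Language-ID/scripts/run_all_tests.rb
else
    echo "run this from the HEAD directory" >&2
fi
```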
==Papers==
The following papers provide essential background on the problem that the Spoken Language ID project is tackling:
* [http://ieeexplore.ieee.org.erl.lib.byu.edu/iel4/89/10293/00481450.pdf?isnumber=&arnumber=481450 Zissman '96: Comparison of Four Approaches to Automatic Language Identification of Telephone Speech]
* [http://www.nist.gov/speech/tests/lre/2007/LRE07EvalPlan-v8b.pdf NIST evaluation guidelines]
* BYU NLP Lab [http://nlp.cs.byu.edu/techreports/BYUNLP-TR1.pdf Language ID Tech Report]
[[Category:Spoken Language ID]]