icassp16-ng.tar.gz (2.66 MB)

ICASSP 2016 - Experiment results for the paper "Groupwise learning for ASR k-best list reranking in spoken language translation"

dataset
posted on 27.06.2016 by Wai Man Ng, Lucia Specia, Thomas Hain, Kashif Shah
The compressed folder contains the experiment output reported in Table 1 and Table 2 of the paper "Groupwise learning for ASR k-best list reranking in spoken language translation" (DOI: 10.1109/ICASSP.2016.7472853).

The top level contains two files:
"E12.filelist" lists the 1124 segments used in the experiments. Each segment name follows the convention (TALK)_(SPEAKER)_(STARTTIME in 10ms)_(ENDTIME in 10ms). The segments are identical to those of the IWSLT 2012 evaluation (http://hltshare.fbk.eu/IWSLT2012/IWSLT12.TED.SLT.testsets.tgz).
"E12.tc.fr.reference" contains the reference French translations of these 1124 segments.
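The naming convention above can be unpacked with a few lines of Python. This is a minimal sketch, not part of the dataset: the helper name `parse_segment_name`, the `Segment` type and the example segment name are hypothetical, and it assumes the two time fields are plain integers counting 10 ms units and that only the TALK field may itself contain underscores.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    talk: str
    speaker: str
    start_s: float  # start time in seconds
    end_s: float    # end time in seconds


def parse_segment_name(name: str) -> Segment:
    """Split (TALK)_(SPEAKER)_(STARTTIME)_(ENDTIME) into its fields.

    Splitting from the right keeps any underscores inside TALK intact
    (assumption: SPEAKER and the time fields contain no underscores).
    """
    talk, speaker, start, end = name.strip().rsplit("_", 3)
    # Times are given in units of 10 ms, so divide by 100 to get seconds.
    return Segment(talk, speaker, int(start) / 100.0, int(end) / 100.0)


# Hypothetical segment name, for illustration only.
seg = parse_segment_name("talkX_spkY_0001234_0001890")
print(seg.talk, seg.speaker, seg.start_s, seg.end_s)
```

Applied to each line of "E12.filelist", this would give per-segment start and end times in seconds, provided the file lists one segment name per line (also an assumption).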

Experiment results are organised into subfolders.
The results from which Table 1 was generated can be found in the folders TABLE_1/Setting_[A or B or C]. For each Setting, 8 regression experiments (predicsvcpermute*) and 14 classification experiments (predicsvrpermute*) were conducted. The translation outputs are recorded in the files prefixed "rerank.merged.*" and the translation scores are recorded in the files prefixed "best-results.fullset.*".

The folders TABLE_2/Groupwise+LDA/Setting_[A or B or C] contain the translation outputs "rerank.merged.*" and translation scores "best-results.fullset.*" for the Groupwise+LDA results described in Table 2 of the paper. For Settings A, B and C, the corresponding optimal LDA dimensions are 3, 5 and 4 respectively. These are also reflected in the filenames in the uploaded data.
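The folder layout described above can be walked programmatically once the archive is extracted. The following is a sketch under stated assumptions: the function names `setting_dir` and `find_outputs` are hypothetical, the setting folders are assumed to be named literally "Setting_A", "Setting_B" and "Setting_C", and the search is recursive because the description does not say at which depth the output files sit.

```python
from pathlib import Path

# Optimal LDA dimensions per setting, as stated in the description.
LDA_DIMS = {"A": 3, "B": 5, "C": 4}


def setting_dir(root, table, setting):
    """Return the folder for one Setting of Table 1 or Table 2."""
    root = Path(root)
    if table == 1:
        return root / "TABLE_1" / f"Setting_{setting}"
    if table == 2:
        return root / "TABLE_2" / "Groupwise+LDA" / f"Setting_{setting}"
    raise ValueError("table must be 1 or 2")


def find_outputs(root, table, setting):
    """Collect translation outputs and score files below one Setting folder."""
    folder = setting_dir(root, table, setting)
    if not folder.is_dir():
        # Nothing extracted at this location; return empty lists.
        return {"translations": [], "scores": []}
    return {
        "translations": sorted(folder.rglob("rerank.merged.*")),
        "scores": sorted(folder.rglob("best-results.fullset.*")),
    }
```

For example, `find_outputs("path/to/extracted/archive", 2, "A")` would return the Groupwise+LDA files for Setting A, where the path is a placeholder for wherever the archive was unpacked.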




Funding

EPSRC (EP/I031022/1) and Google ("WFST based Integration of ASR and MT in Spoken Language Translation")
