The University of Sheffield
icassp16-ng.tar.gz (2.66 MB)

ICASSP 2016 - Experiment results for the paper "Groupwise learning for ASR k-best list reranking in spoken language translation"

posted on 2016-06-27, 13:17 authored by Wai Man Ng, Lucia Specia, Thomas Hain, Kashif Shah
The compressed folder contains the experiment output described in Table 1 and Table 2 in the paper "Groupwise learning for ASR k-best list reranking in spoken language translation" DOI: 10.1109/ICASSP.2016.7472853.

At the top level there are two files:
"E12.filelist" lists the 1124 segments used in the experiments. Each segment is named according to the convention (TALK)_(SPEAKER)_(STARTTIME in 10ms)_(ENDTIME in 10ms), where the start and end times are given in units of 10 ms. The segments are identical to those from the IWSLT 2012 evaluation (
"" contains the reference French translations of these 1124 segments.
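The naming convention above can be unpacked programmatically. The following is a minimal sketch, assuming that "E12.filelist" holds one bare segment name per line (no path or file extension) and that the two time fields are integers; neither assumption is confirmed by the description. The names `Segment`, `parse_segment_name` and `read_filelist` are hypothetical helpers, not part of the uploaded data.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    talk: str
    speaker: str
    start_s: float  # start time in seconds
    end_s: float    # end time in seconds


def parse_segment_name(name: str) -> Segment:
    """Parse (TALK)_(SPEAKER)_(STARTTIME)_(ENDTIME), times in units of 10 ms."""
    # Split from the right so that a talk identifier containing
    # underscores (an assumption) is kept in one piece.
    talk, speaker, start, end = name.strip().rsplit("_", 3)
    # Times are counted in 10 ms units, so divide by 100 to get seconds.
    return Segment(talk, speaker, int(start) / 100.0, int(end) / 100.0)


def read_filelist(path: str) -> List[Segment]:
    """Read a file list, assuming one segment name per non-empty line."""
    with open(path, encoding="utf-8") as fh:
        return [parse_segment_name(line) for line in fh if line.strip()]
```

A call such as `read_filelist("E12.filelist")` would then be expected to return 1124 entries if the file follows the assumed layout. A name with fewer than four fields raises `ValueError`, which helps reveal any line that does not match the convention.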

Experiment results are organised into subfolders.
The results from which Table 1 was generated can be found in the folders TABLE_1/Setting_[A or B or C]. For each Setting, 8 regression experiments (predicsvcpermute*) and 14 classification experiments (predicsvrpermute*) were conducted. The translation outputs are recorded in the files prefixed "rerank.merged.*" and the translation scores are recorded in the files prefixed "best-results.fullset.*".
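To gather the Table 1 files after unpacking the archive, something like the sketch below may help. It assumes the folders are literally named `Setting_A`, `Setting_B` and `Setting_C`, and it searches recursively because the description does not say whether the files sit directly in each Setting folder or inside per-experiment subfolders. `collect_table1` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Dict, List


def collect_table1(root: str) -> Dict[str, Dict[str, List[Path]]]:
    """Group Table 1 output and score files by Setting (A, B, C)."""
    found: Dict[str, Dict[str, List[Path]]] = {}
    for setting in ("A", "B", "C"):
        # Assumed folder name; adjust if the archive uses a different spelling.
        folder = Path(root) / "TABLE_1" / f"Setting_{setting}"
        if not folder.is_dir():
            found[setting] = {"outputs": [], "scores": []}
            continue
        found[setting] = {
            # Translation outputs
            "outputs": sorted(folder.rglob("rerank.merged.*")),
            # Translation scores
            "scores": sorted(folder.rglob("best-results.fullset.*")),
        }
    return found
```

Comparing the number of files found per Setting against the experiment counts stated above is a quick sanity check that the archive was extracted completely.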

The folders TABLE_2/Groupwise+LDA/Setting_[A or B or C] contain the translation outputs "rerank.merged.*" and translation scores "best-results.fullset.*" for the Groupwise+LDA results described in Table 2 of the paper. For Settings A, B and C, the optimal LDA dimensions are 3, 5 and 4 respectively. These values are also reflected in the filenames in the uploaded data.


EPSRC (EP/I031022/1) and Google ("WFST based Integration of ASR and MT in Spoken Language Translation")



  • There is no personal data or any that requires ethical approval


  • The data complies with the institution and funders' policies on access and sharing

Sharing and access restrictions

  • The data can be shared openly

Data description

  • The file formats are open or commonly used

Methodology, headings and units

  • Headings and units are explained in the files