Factored WSJCAM0 Speech Corpus

posted on 02.06.2016 by Oscar Saz Torralba, Thomas Hain

This version of the WSJCAM0 corpus has augmented variability in four factors: speaker, channel, background, and signal-to-noise ratio (SNR). It was created by the University of Sheffield for experiments on robustness and factorisation under non-stationary noise conditions. The corpus was developed as part of the Natural Speech Technology programme grant (EP/I031022/1). Files are in single-channel WAVE format, sampled at 16 kHz with a bit depth of 16 bits.
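
As a quick sanity check after obtaining the files, the stated audio format (single channel, 16 kHz, 16 bits) can be verified with Python's standard `wave` module. The following is only a minimal sketch: the function name `matches_corpus_format` and the example file name are hypothetical and are not part of the corpus distribution.

```python
import wave


def matches_corpus_format(path):
    """Return True if the WAVE file at `path` is mono, 16 kHz, 16-bit.

    Hypothetical helper: it only checks the header values stated in the
    corpus description and does not inspect the audio content.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == 1        # single channel
            and wav.getframerate() == 16000  # 16 kHz sampling rate
            and wav.getsampwidth() == 2      # 2 bytes = 16-bit samples
        )
```

For example, `matches_corpus_format("utterance.wav")` (with a placeholder file name) would return `False` for a stereo or 8 kHz recording, which may indicate that a file was converted or resampled somewhere along the way.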

To obtain this corpus, you must hold a valid licence for the original WSJCAM0 corpus from the LDC (catalog number LDC95S24). Please contact Thomas Hain if you want to use this corpus for your research.


EPSRC Programme Grant EP/I031022/1 (Natural Speech Technology)