Old Bailey Voices 1780-1880
The Old Bailey Proceedings 1674-1913 represent the largest body of direct recorded speech by non-elite people ever created. The Old Bailey Voices dataset (OBV) consists of a full-text corpus and summary data for 21,000 trials reported in the Proceedings between 1780 and 1880.
The dataset was created for the Voices of Authority research theme of the Digital Panopticon project, to explore changing speech patterns in the courtroom.
The Old Bailey Corpus project, headed by Magnus Huber, added linguistic tagging to a large sample of the Proceedings data. OBV has recombined that linguistic corpus with trial data so that Digital Panopticon researchers can associate individual defendants with their spoken words (or silences) in court and with their long-term outcomes.