JOURNAL ARTICLE

MARSEC: A Machine-Readable Spoken English Corpus

Peter Roach, Gerry Knowles, Tamás Váradi, Simon Arnfield

Year: 1993 Journal: Journal of the International Phonetic Association Vol: 23 (2) Pages: 47-54 Publisher: Cambridge University Press

Abstract

The purpose of this paper is to describe a new version of the Spoken English Corpus which will be of interest to phoneticians and other speech scientists. The Spoken English Corpus is a well-known collection of spoken-language texts that was collected and transcribed in the 1980s in a joint project involving IBM UK and the University of Lancaster (Alderson and Knowles forthcoming, Knowles and Taylor 1988). One valuable aspect of it is that the recorded material on which it was based is fairly freely available and the recording quality is generally good. At the time when the recordings were made, the idea of storing all the recorded material in digital form suitable for computer processing was of limited practicality. Although storage on digital tape was certainly feasible, this did not provide rapid computer access. The arrival of optical disk technology, with the possibility of storing very large amounts of digital data on a compact disk at relatively low cost, has brought about a revolution in ideas on database construction and use. It seemed to us that the recordings of the Spoken English Corpus (hereafter SEC) should now be converted into a form which would enable the user to gain access to the acoustic signal without the laborious business of winding through large amounts of tape. Once this was done, we should be able not only to listen to the recordings in a very convenient way, but also to carry out many automatic analyses of the material by computer.

Keywords:
Computer science, Spoken language, IBM, Artificial intelligence

Metrics

Cited By: 63
FWCI (Field Weighted Citation Impact): 0.46
Refs: 10
Citation Normalized Percentile: 0.74

Topics

Speech Recognition and Synthesis
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

BOOK-CHAPTER

The machine-readable Spoken English Corpus

Gerry Knowles

Year: 1993 Pages: 107-119

JOURNAL ARTICLE

Fiche technique - Corpus - Aix-Marsec

Daniel Hirst, Cyril Auran, Caroline Bouzon

Journal: TIPA Travaux interdisciplinaires sur la parole et le langage Year: 2022

JOURNAL ARTICLE

ENTREVIS - a Spanish machine-readable text corpus

Kjær Jensen

Journal: HERMES - Journal of Language and Communication in Business Year: 2015 Vol: 4 (7) Pages: 81-81