JOURNAL ARTICLE

Error Correction in Lightly Supervised Alignment of Broadcast Subtitles

Abstract

This paper presents a range of error correction techniques aimed at improving the accuracy of a lightly supervised alignment task for broadcast subtitles. Lightly supervised approaches are frequently used in the multimedia domain, either for subtitling purposes or for providing a more reliable source for training speech-based systems. The proposed methods focus on directly correcting the alignment output, using different techniques to infer word insertions and words with inaccurate time boundaries. The features used by the classification models are outputs of the alignment system, such as confidence measures and word or segment durations. Experiments in this paper are based on broadcast material provided by the BBC to the Multi-Genre Broadcast (MGB) challenge participants. Results show that the order alignment F-measure improves by up to 2.6% absolute (15.8% relative) when insertion and word-boundary correction are combined.
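
To make the evaluation metric concrete, here is a minimal sketch of how an alignment F-measure over time-stamped words could be computed. It is not the scoring procedure used in the paper or in the MGB challenge: the `Word` type, the greedy one-to-one matching, and the 0.1 s boundary tolerance are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A word with start and end times in seconds (hypothetical structure)."""
    text: str
    start: float
    end: float


def boundaries_match(hyp: Word, ref: Word, tol: float) -> bool:
    """True if the words are identical and both boundaries lie within tol."""
    return (
        hyp.text == ref.text
        and abs(hyp.start - ref.start) <= tol
        and abs(hyp.end - ref.end) <= tol
    )


def alignment_f_measure(hyp, ref, tol=0.1):
    """Return (precision, recall, F-measure) for a hypothesised alignment.

    Each reference word can be matched at most once. Matching is greedy,
    which is a simplification, and tol=0.1 is an assumed tolerance.
    """
    unmatched = list(ref)
    hits = 0
    for h in hyp:
        for i, r in enumerate(unmatched):
            if boundaries_match(h, r, tol):
                hits += 1
                del unmatched[i]
                break
    precision = hits / len(hyp) if hyp else 0.0
    recall = hits / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy data: one inserted word ("and") and one inaccurate end boundary ("evening").
reference = [
    Word("good", 0.00, 0.30),
    Word("evening", 0.30, 0.80),
    Word("everyone", 0.80, 1.40),
]
hypothesis = [
    Word("good", 0.02, 0.31),
    Word("evening", 0.30, 1.00),
    Word("and", 1.00, 1.10),
    Word("everyone", 0.85, 1.42),
]

precision, recall, f_measure = alignment_f_measure(hypothesis, reference)
```

In this toy case, two of the four hypothesised words match, so precision is 2/4, recall is 2/3, and the F-measure is 4/7. The inserted word lowers precision and the misplaced boundary lowers both precision and recall, which are the two error types the abstract says the correction methods target.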

Keywords:
Computer science; Focus (optics); Word (group theory); Task (project management); Error detection and correction; Measure (data warehouse); Range (aeronautics); Artificial intelligence; Word error rate; Speech recognition; Error analysis; Domain (mathematical analysis); Natural language processing; Pattern recognition (psychology); Algorithm; Data mining

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 1.41
Refs: 21
Citation Normalized Percentile: 0.92

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Subtitles and Audiovisual Media
Social Sciences →  Arts and Humanities →  Language and Linguistics