Since the 1950s, several experiments have evaluated the benefit of lip-reading for speech intelligibility, all presenting a natural face speaking at various levels of background noise. In this paper, we present a similar experiment run with French stimuli. Experiments by McGrath (1985) and later by Summerfield et al. (1989) showed that the lips carry more than half of the visual information provided by the whole face of an English speaker, and that seeing the teeth somewhat increases the intelligibility of a message. Similar experiments have been carried out in French at the Institut de la Communication Parlée. We compared the overall performance of normal hearers in audio-visual intelligibility tests in which the visual display was a natural face (Benoit et al., 1992), natural lips alone (Le Goff et al., 1995), or a set of 3D parametric models of the main components of a speaker's face: the lips, the jaw, and the skin (Guiard-Marigny et al., 1995). The same parameters used to animate our synthetic face models were also measured on the same corpus, in order to evaluate the performance of an HMM classifier in an identification task analogous to the one performed by the human subjects (Adjoudani and Benoit, 1996). Overall results are presented as well. (6 pages)