Selective visual masking in speechreading
Using digital video technology, selected features of a face can be masked by identifying the pixels that represent them and adjusting their gray levels, which effectively eliminates those features from view. In groups of young adults with normal vision and hearing, consonant-viseme recognition was measured for closed sets of vowel-consonant-vowel disyllables. In the first experiment, viseme recognition was measured while the tongue and teeth were masked and while the entire mouth was masked. The results showed that masking the tongue and teeth had little effect on viseme recognition, and when the entire mouth was masked, participants continued to identify consonant visemes with 70% or greater accuracy in the /a/ and /o/ vowel contexts. In the second experiment, viseme recognition was measured when the upper part of the face and the mouth were masked and when the lower part of the face and the mouth were masked. The results showed that when the mouth and the upper part of the face were masked, performance was poor, but information was available to identify the consonant viseme /f/. When the mouth and the lower part of the face were masked, viseme recognition was quite poor, but information was available to discriminate the consonant viseme /p/ from other consonant visemes.
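The masking idea described in the abstract (select the pixels that depict a facial feature, then overwrite their gray levels) can be illustrated with a minimal sketch. This is not the authors' implementation: the rectangular region, the uniform gray level of 128, the toy frame, and the names `mask_region` and `mouth_region` are all assumptions made for illustration.

```python
def mask_region(frame, region, gray_level=128):
    """Return a copy of a grayscale frame with one rectangle set to a uniform gray.

    frame  : list of rows, each a list of integer gray levels (0-255).
    region : (top, left, bottom, right) with bottom/right exclusive.
             The rectangular shape is an assumption; the study does not
             state how the masked pixels were selected.
    """
    top, left, bottom, right = region
    masked = [row[:] for row in frame]  # copy so the original frame is untouched
    for r in range(max(top, 0), min(bottom, len(masked))):
        row = masked[r]
        for c in range(max(left, 0), min(right, len(row))):
            row[c] = gray_level  # overwrite the pixel's gray level
    return masked


# Toy 4x6 "frame" with a simple gradient standing in for a video image.
frame = [[10 * (r + c) for c in range(6)] for r in range(4)]

# Hypothetical coordinates for the mouth area in this toy frame.
mouth_region = (2, 1, 4, 5)
masked_frame = mask_region(frame, mouth_region)
```

Applying the same function to every frame of a clip, with a region that tracks the feature of interest, would give a masked video under these assumptions.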
Journal of Speech, Language, and Hearing Research
Preminger, Jill E.; Lin, Hwei Bing; Payen, Michel; and Levitt, Harry, "Selective visual masking in speechreading" (1998). Kean Publications. 2819.