
Starting with What’s Most Important: Visemes
In the search for a better system for CGI sync, something became very apparent: There are
three different kinds of sounds you can make during speech, and not all of them are easy
to see! You’ve got lips, a tongue, and a throat. Phoneme-based systems lump all of these
sounds together, and that is where the problems start. The only sounds you absolutely have
to worry about are the sounds made primarily with the lips. I say “primarily” because combinations
of all these ways to make sounds occur all the time. Also, you could argue that your
throat makes all sounds, but that would be an intellectual standpoint, not an artistic one. It
would be like saying we should include an X-ray of the lungs in sync, and we’re not going
to be doing that!
Phonemes are sounds, but what matters in animation is what can be seen. Instead of
phonemes, of which there are about 38 in English (depending on your reference), the
techniques we’ll be using in this book are based on visual phonemes, or visemes. Visemes
are the significant shapes or visuals that are made by your lips. Phonemes are sounds;
visemes are shapes. Visemes are all you really need to see to buy into a performance.
You obviously cue these shapes based on the sounds you hear, but there aren’t nearly as
many to be seen as there are to be heard. The necessary visemes are listed in Table 1.1.
Remember that these are shapes tied to sounds, not necessarily to the exact letters
that appear in the text.
Viseme          Example Sounds              Rule
B,M,P / Closed  murder, plantation, cherub  Lips closed
EE / Wide       cheese, me, charity         Mouth wide
F,V             fire, fight, Virginia       Lower lip rolled in
OO / Narrow     dude, use, fool             Mouth narrow
IH              trip, snip                  Sometimes taller or wider than surrounding shapes
R               car, road                   Sometimes narrower than surrounding shapes
T,S             beat, traffic               Sometimes taller or wider than surrounding shapes
Words are made up of these visemes, even if they aren’t spelled this way. For example,
the word you is composed of the two visemes EE and then OO, to make the EE-OO sound
of the word. As you move forward in this book, you’ll learn that if there is no exact viseme
for the sound, you merely use the next closest thing. For instance, the sound OH, as in
M-OH-N (moan), is not really shown on this chart, whereas OO is. They’re not really the
same, but they’re close enough that you can funnel OH over to an OO-type shape.
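To make the lookup concrete, here is a minimal sketch of the idea in Python. The viseme names come from Table 1.1 and the OH-to-OO funnel comes from the paragraph above; everything else is an assumption made for illustration, including the sound labels used as keys, the names SOUND_TO_VISEME, NEAREST_VISEME, and sounds_to_visemes, and the choice to skip any sound (such as the N in moan) that has no entry at all.

```python
# Viseme names are taken from Table 1.1. The sound labels used as keys and
# the dictionary layout are assumptions made for this sketch.
SOUND_TO_VISEME = {
    "B": "B,M,P", "M": "B,M,P", "P": "B,M,P",
    "EE": "EE",
    "F": "F,V", "V": "F,V",
    "OO": "OO",
    "IH": "IH",
    "R": "R",
    "T": "T,S", "S": "T,S",
}

# Sounds with no viseme of their own are funneled to the next closest shape.
# Only the OH -> OO pairing comes from the text.
NEAREST_VISEME = {"OH": "OO"}


def sounds_to_visemes(sounds):
    """Return the viseme for each sound, in order.

    Sounds found in neither table are skipped. That is an assumption of this
    sketch, not a rule stated in the text.
    """
    visemes = []
    for sound in sounds:
        if sound in SOUND_TO_VISEME:
            visemes.append(SOUND_TO_VISEME[sound])
        elif sound in NEAREST_VISEME:
            visemes.append(NEAREST_VISEME[sound])
    return visemes


print(sounds_to_visemes(["EE", "OO"]))      # the word "you"
print(sounds_to_visemes(["M", "OH", "N"]))  # the word "moan"
```

Keeping the funneled sounds in their own table makes it easy to see which pairings are exact and which are only "close enough."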
Table 1.1 includes just seven shapes to hit, and only a few of those are unique shapes
that you have to build on their own! The analysis and breakdown of speech has just gone
from 38 sounds to account for down to only seven visemes. Some sounds can even show up as
the same shape, such as UH and AW, which need to be represented only by the jaw opening.
Table 1.1 Visemes