Archive for September 28th, 2008
Automatic lip synch.
I wasn’t originally going to talk about this until I had something to show, but my next AIR project was going to be something like my old e2animate application that I talked about in my previous post.
I was going to aim it at non-technical users. Make it fun. Easy to use. Allow them to access an online library of clip-art. Plus I was going to incorporate a very powerful and special feature. Automatic lip synchronisation. Just drop a mouth shape into an animation, and it would automatically synchronise to a sound track.
However, someone has beaten me to it with an AS3 lip-synch implementation. Samir has created a Flash component that he’s talking about licensing for free.
I am pleased that someone has possibly saved me all the hard work. But I was quite looking forward to the technical challenge of doing this myself.
I don’t know the details of Samir’s algorithm, but this is how I was going to do it…
The first thing is to determine the shape of the mouth from each segment of speech. I was going to look at two possible ways of doing this: prediction gain and frequency bin matching.
Prediction gain is the power of the signal coming out of a filter, divided by the power of the signal going in. If the filter matches the characteristics of the signal, then we get a high value (because we don’t lose much power), but if it doesn’t match, we get a low value, because the filter blocks out more of the signal.
So, imagine a bank of fixed filters, each of which detects a particular sound (A, I/E/U/L/W, Q/M, fricatives, etc., possibly also taking different kinds of voices into account). Depending on which filter gives us the best match, we select a particular mouth shape.
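As a rough illustration of the filter-bank idea, here is a minimal sketch in Python rather than ActionScript. The two filters and the mouth-shape names are made-up placeholders, not tuned to real speech sounds, and none of this is taken from Samir’s component. It only shows the mechanics: measure power out over power in for each filter, then pick the shape whose filter lets the most power through.

```python
import math

def power(samples):
    """Mean squared amplitude of a block of samples."""
    return sum(s * s for s in samples) / len(samples)

def fir_filter(samples, taps):
    """Apply an FIR filter by direct convolution (zero initial state)."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * samples[n - k]
        out.append(acc)
    return out

def gain(samples, taps):
    """Power coming out of the filter divided by power going in."""
    p_in = power(samples)
    if p_in == 0:
        return 0.0
    return power(fir_filter(samples, taps)) / p_in

# Hypothetical bank: placeholder filters standing in for per-sound filters.
FILTER_BANK = {
    "open_vowel": [0.25, 0.25, 0.25, 0.25],  # crude low-pass (moving average)
    "fricative": [0.5, -0.5],                # crude high-pass (difference)
}

def pick_mouth_shape(samples, bank=FILTER_BANK):
    """Return the shape whose filter gives the highest gain for this block."""
    return max(bank, key=lambda shape: gain(samples, bank[shape]))

def tone(freq_hz, rate=8000, count=800):
    """Test signal: a pure sine tone, used here in place of real speech."""
    return [math.sin(2 * math.pi * freq_hz * n / rate) for n in range(count)]
```

Calling `pick_mouth_shape(tone(200))` favours the low-pass placeholder, while `pick_mouth_shape(tone(3000))` favours the high-pass one. Real speech would need filters designed for each sound, which is where the fiddling would come in.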
Frequency bin matching means comparing the power in each part of the frequency spectrum against known sets of values, again corresponding to different kinds of sound.
MP3 is a sub-band coder, so an MP3 file already contains frequency-domain values of this kind. Alternatively, they can be derived from an FFT (SoundMixer.computeSpectrum() with its FFTMode argument set to true).
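A minimal sketch of frequency bin matching, again in Python rather than ActionScript. It assumes a list of spectrum magnitudes has already been obtained from an FFT. The band count, the template numbers and the shape names are invented for illustration and would have to come from measurements of real speech.

```python
def band_powers(spectrum, bands=4):
    """Sum squared magnitudes into equal-width bands, normalised to sum to 1.

    Any bins left over when len(spectrum) is not a multiple of `bands`
    are ignored.
    """
    width = len(spectrum) // bands
    powers = [
        sum(m * m for m in spectrum[i * width:(i + 1) * width])
        for i in range(bands)
    ]
    total = sum(powers)
    if total == 0:
        return [0.0] * bands
    return [p / total for p in powers]

def distance(a, b):
    """Squared Euclidean distance between two band-power profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical templates: made-up band-power profiles for each mouth shape.
TEMPLATES = {
    "open_vowel": [0.70, 0.20, 0.07, 0.03],
    "rounded": [0.40, 0.40, 0.15, 0.05],
    "fricative": [0.05, 0.15, 0.30, 0.50],
}

def match_mouth_shape(spectrum, templates=TEMPLATES):
    """Return the shape whose template is closest to this spectrum's profile."""
    profile = band_powers(spectrum, bands=4)
    return min(templates, key=lambda shape: distance(profile, templates[shape]))
```

Normalising the band powers means the match depends on the shape of the spectrum rather than on how loud the speech is, which leaves loudness free to drive the mouth size separately.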
I also heard that Pixel Bender could be used to process sound, and I was going to investigate this, and anything else Cosmo had.
Anyway, it was probably going to take a little fiddling around, possibly some time-based signal processing too, but I’m sure something based on these methods could automatically generate a mouth shape. Then, the size of the mouth is just proportional to the amplitude/power of the speech signal.
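The mouth-size part is the simplest to sketch. The Python below (standing in for ActionScript) maps the RMS level of each frame of samples to an opening between 0 and 1. The `full_scale_rms` value and the frame size are assumptions chosen for illustration, not measured figures.

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mouth_openness(samples, full_scale_rms=0.5):
    """Opening from 0.0 (closed) to 1.0 (fully open), proportional to RMS.

    full_scale_rms is a hypothetical level at which the mouth is fully open;
    anything louder is clamped to 1.0.
    """
    return min(1.0, rms(samples) / full_scale_rms)

def openness_per_frame(samples, frame_size):
    """Split the signal into frames and compute one opening value per frame."""
    return [
        mouth_openness(samples[i:i + frame_size])
        for i in range(0, len(samples), frame_size)
    ]
```

In an animation, each value from `openness_per_frame` would scale the chosen mouth shape for the matching stretch of the timeline. Some smoothing between frames would probably be needed to stop the mouth from flickering.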
And that’s how I was going to do dynamic lip synch in ActionScript.
1 comment September 28, 2008
Flash CS4 tweening. Nothing new.
It’s funny how technology tends to recycle old ideas. Like Macromedia Director style animation paths and tweening, now recycled into Flash CS4. Don’t get me wrong, Flash CS4 looks pretty cool, and introduces a whole bunch of stuff that Director never had.
In fact, animation paths are such a cool idea, that I used them a few years ago in a Central application. I wrote a prototype animation application called e2animate.
I’d like to recycle it to work on AIR sometime.
The pictures below show how similar the animation paths in e2animate were to Flash CS4. See how I used Bézier curve paths that could be edited on the stage.
Some people have speculated that Adobe will eventually port versions of its software (Flash, etc.) to AIR, like it has already done with Photoshop.
Well, back in the day, before AIR, before Flex – developers like myself were already beginning to move in that direction.
1 comment September 28, 2008


