Adventures in Lip-sync: Part 2

Posted by Shane Casey on Mon, 09 Nov 2009

This post follows on from Adventures in Lip-sync: Part 1.

So, at this stage, we’ve got a string of gibberish from Repeat After Me and not much else. To get this working, there are four key ingredients:

  • the audio soundtrack;
  • the phoneme info from Repeat After Me;
  • the graphic representations of the mouth shapes (visemes);
  • and a dictionary to translate one to the other.

All I need to do is hook them up.

Step 4

I created a MovieClip with my mouth shapes on separate keyframes. There are nine “main” mouth shapes, plus extras for ‘th’ sounds and a closed/relaxed mouth, 11 in total. Here are Preston Blair’s mouth shapes, which are similar to mine. I kept them pretty basic so they can be easily re-skinned or adapted next time. By keyframe number they are:

  1. M, B, P
  2. A, I
  3. H, EH, UH
  4. EE
  5. S, C
  6. V
  7. L
  8. O, OH
  9. W, OO
  10. TH
  11. (Closed)

Step 5

Next, I created an object called ref that I initialise like this:

private function initRef():void
{
	ref = new Object();
 
	ref["AA"] = 2;
	ref["AE"] = 2;
	ref["AH"] = 3;
	ref["AO"] = 8;
	ref["AW"] = 8;
	ref["AX"] = 2;
	ref["AXR"] = 5;
	ref["AY"] = 2;
	ref["EH"] = 3;
	ref["ER"] = 3;
	ref["EY"] = 2;
	ref["IH"] = 3;
	ref["IX"] = 3;
	ref["IY"] = 4;
	ref["OW"] = 8;
	ref["OY"] = 8;
	ref["UH"] = 9;
	ref["UW"] = 9;
	ref["B"] = 1;
	ref["CH"] = 5;
	ref["D"] = 5;
	ref["DH"] = 10;
	ref["DX"] = 1;
	ref["F"] = 6;
	ref["G"] = 5;
	ref["HH"] = 3;
	ref["JH"] = 5;
	ref["K"] = 5;
	ref["L"] = 7;
	ref["M"] = 1;
	ref["N"] = 5;
	ref["NG"] = 5;
	ref["P"] = 1;
	ref["R"] = 5;
	ref["S"] = 5;
	ref["SH"] = 5;
	ref["T"] = 5;
	ref["TH"] = 10;
	ref["V"] = 6;
	ref["W"] = 9;
	ref["Y"] = 5;
	ref["Z"] = 5;
	ref["ZH"] = 5;
 
	ref[""] = 11;
}

This gives me a dictionary of sorts to convert each phoneme in the data to the appropriate keyframe. Here’s a great reference that really helped reduce the 40-odd phonemes to my 10 visemes.

Step 6

To tie in the phonemes with the audio, I loaded the audio clip and used this nifty little Audio Cuepoint class by Armen Abrahamyan to fire events at the appropriate times.
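
For anyone curious about the general idea behind timed cues, it can be sketched without any helper class: play the sound and check SoundChannel.position on every frame. This is only a rough timeline-code sketch of the technique, not how the Audio Cuepoint class itself works. The cue list, the file name speech.mp3 and the onFrame handler are all made up for illustration.

```actionscript
import flash.events.Event;
import flash.media.Sound;
import flash.media.SoundChannel;
import flash.net.URLRequest;

// Hypothetical cue list: start time in milliseconds plus the phoneme to show.
var cues:Array = [
	{time: 0,   text: "HH"},
	{time: 90,  text: "AX"},
	{time: 170, text: "L"},
	{time: 260, text: "OW"}
];
var nextCue:int = 0;

// "speech.mp3" is a placeholder file name.
var sound:Sound = new Sound(new URLRequest("speech.mp3"));
var channel:SoundChannel = sound.play();

addEventListener(Event.ENTER_FRAME, onFrame);

function onFrame(e:Event):void
{
	// play() returns null when no sound channel is available
	if (channel == null) return;

	// channel.position is the playhead position in milliseconds;
	// fire every cue whose start time has been reached
	while (nextCue < cues.length && cues[nextCue].time <= channel.position)
	{
		trace("cue: " + cues[nextCue].text + " at " + cues[nextCue].time + "ms");
		nextCue++;
	}

	// stop polling once every cue has fired
	if (nextCue >= cues.length)
	{
		removeEventListener(Event.ENTER_FRAME, onFrame);
	}
}
```

Polling once per frame means a cue can fire up to one frame late, which is fine for mouth shapes at typical frame rates.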

Step 7

The next thing is to get the phoneme data in. I wanted to re-use this class, so I put the data in a text file and loaded it in via ActionScript. Looking at the data generated by Repeat After Me, I figured out that all that’s relevant to me is:

[phoneme name] {D [duration (ms)]; … }

That’s enough of a pattern for me to parse it out (watching for a gotcha in the form of a random numeric character at the start of the phonemes). I created an XML object that the Audio Cuepoint class can understand to store the info:

private var cuepoints:XML = ;

and here’s the function I called once the load completed:

private function parseData(_data:String):void{
 
	var instructions:Array = _data.split("\n");
 
	var instructions_length:uint = instructions.length;
	var time:Number = 0;
	for (var a:uint = 0; a < instructions_length; a++) {
		var duration_delimiter:String = " {D ";
		if(instructions[a].indexOf(duration_delimiter) > -1) {
			var instruction:Object = new Object();
			instruction.viseme = instructions[a].split(duration_delimiter)[0];
 
			//strip anything that's not a letter
			var pattern:RegExp = /[^a-z]/gi;
			instruction.viseme = instruction.viseme.replace(pattern,""); 
 
			cuepoints.appendChild(new XML('' + instruction.viseme + ''));
 
			//get duration in milliseconds
			instruction.duration = Number(instructions[a].split(duration_delimiter)[1].split(";")[0].split("}")[0]);
			time += instruction.duration;					
 
			visemes.push(instruction);
		}
	}
}
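
To see the parsing steps on their own, here’s a small sketch that runs the same split-and-strip logic over two made-up lines and builds cuepoint XML with a running start time. The sample lines are invented, and the cuepoints and cuepoint element names and the time attribute are placeholders of mine; the real markup depends on what the Audio Cuepoint class expects.

```actionscript
// Two made-up lines in the "[phoneme] {D [duration]; ...}" shape,
// including the stray leading digit mentioned above.
var sample:Array = ["1AE {D 120; P 103.6:0}", "T {D 80}"];
var delimiter:String = " {D ";
var startTime:Number = 0; // running start time in milliseconds

// Hypothetical container element; the name is an assumption.
var cues:XML = <cuepoints/>;

for each (var line:String in sample)
{
	var parts:Array = line.split(delimiter);

	// strip anything that's not a letter: "1AE" becomes "AE"
	var phoneme:String = String(parts[0]).replace(/[^a-z]/gi, "");

	// the duration is everything up to the first ";" or "}"
	var duration:Number = Number(String(parts[1]).split(";")[0].split("}")[0]);

	// hypothetical element and attribute names
	cues.appendChild(<cuepoint time={startTime}>{phoneme}</cuepoint>);

	startTime += duration; // the next phoneme starts when this one ends
}

trace(cues.toXMLString());
```

With these two lines, the AE cue starts at 0 ms and the T cue at 120 ms, since each phoneme’s start time is the sum of the durations before it.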

Step 8

All that’s left to do now is listen for my audio cuepoints, cross-reference the phoneme with the appropriate keyframe and gotoAndStop().

private function getViseme(_key:String):uint
{
	return ref[_key.toUpperCase()] as uint;
}
 
private function onCuepointFind(e:Event):void
{
	var cue:AudioCuePoint = AudioCuePoint(e.target);
	trace("cuepoint find time: " + cue.cuepointTime + " / text: " + cue.cuepointText + "\n");
	mouth.gotoAndStop(getViseme(cue.cuepointText));
}
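
One thing to watch: a phoneme that isn’t in ref has no keyframe to go to, and as far as I can tell the lookup should then come back as 0 rather than a valid frame. A slightly more defensive lookup might fall back to the closed mouth instead. This is just a sketch: getVisemeSafe and CLOSED_FRAME are names I’ve made up, and the tiny ref object stands in for the full one built in initRef().

```actionscript
// Minimal stand-in for the ref object built in initRef().
var ref:Object = {TH: 10, M: 1};
ref[""] = 11;

// Keyframe 11 is the closed/relaxed mouth (hypothetical constant name).
const CLOSED_FRAME:uint = 11;

// Hypothetical helper: same lookup as getViseme(), with a fallback.
function getVisemeSafe(key:String):uint
{
	var k:String = key.toUpperCase();
	// hasOwnProperty tells a missing phoneme apart from a mapped one
	return ref.hasOwnProperty(k) ? uint(ref[k]) : CLOSED_FRAME;
}

trace(getVisemeSafe("th")); // 10
trace(getVisemeSafe("QQ")); // 11, unknown phoneme falls back to closed
```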

Here’s a clip in action. It’s not perfect but for a 5-minute lip-synced animation, this saved me a huge amount of time!

