Originally Posted by jott
Hmm strange, this particular scene seems to work fine here.
I'm using the latest SVN build of ScummVM btw....

Anyway, if somebody has some inside knowledge of the exact tag definition that (s)he is willing to share, we could probably fix some of the problems....
Probably not what you're asking, but I know how lip-synch tags are stored in SCUMM V5.

Warning! Techno-babble ahead

Tags are 16-bit words stored in big-endian format.

Ignoring the \xFF\x0A markers, the first four byte values are the speech offset (little-endian), and the last four byte values are the number of lip-synch tags (also little-endian), as modified by this formula:

(numtags << 1) + 8

e.g. to play a sample at offset 0x1234 with 2 lip-synch tags, where (2 << 1) + 8 = 12 = 0x0C:

\xFF\x0A\x34\x12\xFF\x0A\x00\x00\xFF\x0A\x0C\x00\xFF\x0A\x00\x00
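
If it helps, here is a rough Python sketch of that encoding. It only covers what is described above: two little-endian 32-bit values (the offset and the adjusted tag count), split into 2-byte chunks with \xFF\x0A in front of each one. The function names are made up for illustration, and I'm assuming the marker always sits before every 2-byte chunk, as it does in the example.

```python
import struct

# Marker that precedes each 2-byte chunk in the example above (assumed to be constant).
MARKER = b"\xff\x0a"


def encode_speech_ref(offset, num_tags):
    """Build the 16-byte sequence for a sample at `offset` with `num_tags` tags."""
    size_field = (num_tags << 1) + 8
    # Two little-endian 32-bit values: speech offset, then the adjusted tag count.
    payload = struct.pack("<II", offset, size_field)
    # Split the 8 payload bytes into four 2-byte chunks, each prefixed by the marker.
    return b"".join(MARKER + payload[i:i + 2] for i in range(0, 8, 2))


def decode_speech_ref(data):
    """Inverse of encode_speech_ref: return (offset, num_tags)."""
    # Skip the 2-byte marker in front of each chunk and rejoin the payload.
    payload = b"".join(data[i + 2:i + 4] for i in range(0, 16, 4))
    offset, size_field = struct.unpack("<II", payload)
    return offset, (size_field - 8) >> 1
```

Calling encode_speech_ref(0x1234, 2) gives back the byte string from the example, and decode_speech_ref reverses it.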

Compressed MONSTER.SOU:
The number of tags is stored for each sound, as part of the table at the start of the file. The tags themselves are written before the MP3/OGG/FLAC data (e.g. for a sound at 0x1234 with two tags, the four bytes from 0x1234 up to 0x1238 store the tags, and 0x1238 onwards stores the sound data).
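
A small Python sketch of pulling the tags apart from the sound data, under some assumptions: I'm not describing the layout of the table at the start of the file here, so the sound's offset and tag count are simply passed in, and the function name is hypothetical. The tags are read as 16-bit big-endian words, as stated earlier.

```python
import struct


def split_tags_and_audio(blob, offset, num_tags):
    """Return (tags, audio_start) for a sound stored at `offset` in `blob`.

    `offset` and `num_tags` are assumed to come from the table at the
    start of the file, which this sketch does not parse.
    """
    # Each tag is one 16-bit word, so the tag block is 2 * num_tags bytes long.
    tags_end = offset + 2 * num_tags
    # ">" selects big-endian, "H" is an unsigned 16-bit value.
    tags = list(struct.unpack(">%dH" % num_tags, blob[offset:tags_end]))
    # The compressed MP3/OGG/FLAC data starts right after the last tag.
    return tags, tags_end
```

For a sound at 0x1234 with two tags, audio_start comes out as 0x1238, matching the example above.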

Tag format:
Each tag is just a time position in the sound file being played (I'm not sure what unit the position is in, maybe milliseconds, maybe not). Whenever the next tag's position is reached, the talking animation is toggled on or off.

e.g. if there are three tags, with values 120, 240, 640, the animation timeline will play like this:
0-120: talking
120-240: not talking
240-640: talking
640-end: not talking
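
The toggling can be sketched in Python like this. It assumes the animation starts in the talking state at position 0 (as in the timeline above) and that the tags are sorted in ascending order; the function names are just for illustration, and the positions are in whatever unit the tags use.

```python
def talking_intervals(tags):
    """Return a list of (start, end, talking) tuples; end is None for the last one."""
    intervals = []
    start = 0
    talking = True  # assumed initial state, matching the example timeline
    for pos in tags:
        intervals.append((start, pos, talking))
        start = pos
        talking = not talking  # each tag flips the animation state
    intervals.append((start, None, talking))
    return intervals


def is_talking(tags, position):
    """True if the talking animation is on at `position`."""
    # An even number of tags passed means we are still (or again) talking.
    passed = sum(1 for t in tags if t <= position)
    return passed % 2 == 0
```

With tags 120, 240, 640 this gives talking for 0-120, not talking for 120-240, talking for 240-640, and not talking from 640 to the end.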

I think one problem is that the old SCUMM games did not support multiple speech sounds per line. Check Indiana Jones and the Fate of Atlantis, and I'm sure you'll find that all multi-line speech just uses the one sound file, and the subtitles are frequently out of synch with the sound.

Sorry I'm late to the party!
