This tool will use speech recognition to analyze an audio file and auto-generate a matching FXE lip-syncing data file for use by the NWN2 game engine. Make sure you read the ReadMe.txt file that is included!
Name | Size | Description |
---|---|---|
Demo_Skit2.7z | 14.8 MB | Demo module with HAK files. Displays a cutscene of multiple characters speaking voiced lines, with FXE lip-sync files generated by the tool. |
FXE_Generator.rar | 41.02 MB | Installer (Microsoft Installer .msi file) that contains all needed dependencies. |
Unfortunately, the Select Model menu is missing P_EEM.
I noticed that just now when generating an FXE for Daeghun (the male elf model from the OC). I added "P_EEM" to the source, compiled, and it worked np.
So I'm currently working on upgrading 34's Generator (to work w/ Win10 etc.).
Will post a link to it if/when it's ready.
ps. I noticed that Daeghun's stock FXE files, the one(s) I checked, simply use P_HHM ...
The readme has an interesting note on the lack of P_EEM
Am also wondering what .FXM and .FXS files are for ... and how the AnimTags XML files relate to lipsyncing; they look like helpers or tweaks for specific NPCs and creature types. And whether the Head "Model" should really be the Head "Skeleton" etc (these are different in Appearance.2da).
basically, Lolzorcoptorz
I tried this on my Win 7 64 bit computer and just got an error file. Tried with another computer, the one I have at work, which is a Win 7 32 bit computer, and got the same error message. When I found a computer at work that had Win XP, and tried it on that one, it worked without problems.
Does anyone know if there is a new version of this program out there that works with Win 7?
I tried on a Win 10 Pro 64 bit machine, a Win 7 Pro 32 bit machine and even with the XP compatibility mode: the program appears to work (it reports word and phoneme recognition percentages), and it even creates an fxe file.
But unfortunately the generated fxe crashes NWN2.
hi lord XD
This is not my version of the lip-syncer. This is 0100010's prior version ... I can't change his upload, eh.
My version (which iirc is linked to in the NWN2 forum) doesn't have a readme. It fixes the crash that 4760 was getting, and roughly speaking tweaks things up a bit -- so the readme that 0100010 wrote is still the current readme.
I didn't release my version to the vault (yet?) because I got stuck on the FXE format and so I couldn't really advance the program further ... (yet?)
[what follows is the contents of 0100010's ReadMe.txt file. I make no claims to its accuracy or relevance ...]
--------
This tool only supports American English (although it should be passable with non-American English).
Instructions:
Step 0.
Run FXE_Generator.msi. This will install both the .NET framework and SAPI 5.1, as well
as the necessary files for the tool.
To run in dialog mode, just launch FXE_Generator.exe.
Step 1.
Click Open, navigate to and select a .WAV or .MP3 file. (example: "testfile.wav")
If there is a text file called "testfile.txt" in the same folder as "testfile.wav",
then the text of that file will appear in the 'Actual Text' box. This text file is
intended to contain the actual text of what is being spoken in the audio file.
Text in the 'Actual Text' text box surrounded by:
angle brackets < >, curly braces { }, square brackets [ ], or pipe characters |
will be ignored by the speech recognition component.
Punctuation is also ignored, as are all non-alphanumeric characters other than the
apostrophe '.
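To preview roughly what the recognizer will be given, the ignore rules above can be sketched in Python. This is only an approximation under stated assumptions: `clean_actual_text` is a hypothetical helper, not part of the tool, and replacing ignored text with a space (rather than deleting it outright) is a guess at the tool's behaviour.

```python
import re

# Spans wrapped in < >, { }, [ ] or | | are dropped, per the ReadMe.
_IGNORED_SPANS = re.compile(r"<[^<>]*>|\{[^{}]*\}|\[[^\[\]]*\]|\|[^|]*\|")
# Anything that is not a letter, digit, apostrophe or whitespace is dropped.
_IGNORED_CHARS = re.compile(r"[^A-Za-z0-9'\s]")


def clean_actual_text(text: str) -> str:
    """Approximate the text the recognizer would see (hypothetical helper)."""
    text = _IGNORED_SPANS.sub(" ", text)   # assumption: replaced by a space
    text = _IGNORED_CHARS.sub(" ", text)   # punctuation except the apostrophe
    return " ".join(text.split())          # collapse runs of whitespace
```

Running your 'Actual Text' through something like this first can help you spot stage directions or markup that you forgot to wrap in brackets.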
If there is an fxe file called "testfile.fxe" then the FXE Data Blocks table will populate with
the values in the fxe file. THIS FILE WILL BE OVERWRITTEN WHEN YOU CLICK ON THE MAKE FXE BUTTON!
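Because the existing FXE is overwritten, it may be worth copying it aside before clicking Make FXE. A minimal Python sketch of the file-name convention described above follows; the helper names and the ".bak" suffix are my own assumptions, not anything the tool does for you.

```python
import shutil
from pathlib import Path
from typing import Optional, Tuple


def companion_files(audio_path: str) -> Tuple[Path, Path]:
    """Return the .txt and .fxe paths that share the audio file's name."""
    audio = Path(audio_path)
    return audio.with_suffix(".txt"), audio.with_suffix(".fxe")


def backup_existing_fxe(audio_path: str) -> Optional[Path]:
    """Copy an existing .fxe to .fxe.bak; return the backup path, if any."""
    _, fxe = companion_files(audio_path)
    if not fxe.exists():
        return None                              # nothing to protect
    backup = fxe.with_name(fxe.name + ".bak")    # hypothetical naming choice
    shutil.copy2(fxe, backup)
    return backup
```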
Step 2.
Edit the 'Actual Text' (or add text where none existed) to better reflect the spoken dialogue.
If necessary, spell out numbers, or spell acronyms phonetically, if the recognizer is having trouble with them.
Step 3.
Click Generate.
Your goal here is to get at least one of the numbers (non-enhanced or enhanced phoneme list) above 65%.
The higher the number the better the speech recognition component understood the audio file.
The word match percent values are there only for informative purposes.
If both phoneme estimates score below 65%, the labels will show red and you have a bad sampling.
If one of the phoneme estimates scores above 80%, you have a good sampling and the labels will show green.
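The thresholds above can be written down as a small Python check. This is a sketch of the rule as described, not the tool's code: `classify_sampling` is a hypothetical name, and the "borderline" label for scores between 65% and 80% is my own, since the text above does not say what the labels show in that range.

```python
def classify_sampling(plain_pct: float, enhanced_pct: float) -> str:
    """Classify a run from the two phoneme estimates (hypothetical helper)."""
    best = max(plain_pct, enhanced_pct)
    if best < 65:
        return "bad"         # both estimates below 65%: red labels
    if best > 80:
        return "good"        # at least one estimate above 80%: green labels
    return "borderline"      # assumption: the in-between case is not described
```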
Step 4.
If your sampling is bad, go back and play with the 'Actual Text' words to see if you can generate
a better score by clicking generate again.
If you get stuck with "Recognized text (enhanced by actual text)" values which are repeatedly identical to
the "Recognized text (No enhanced)", then close and reopen the tool.
If you still get bad values despite trying all of the above, then you probably have a bad audio file.
Typical causes of a bad audio file are: unclear speech, too heavy an accent, too unusual a voice,
or too much background noise or echo. Try re-recording it.
Step 5.
Select the type of head model that should be associated with the FXE file from the combo box.
These entries correspond to the known game data files that end in '.fxa'. If you don't see the
head model you want to use, check whether it defaults to one of the models already listed.
How do you do that? Open up a game FXE file for a dialogue line known to be spoken by a character with
the model you are expecting and see if it uses something else instead. For example, P_EEM doesn't exist.
P_EEM is a male elf; try looking at the FXE for one of Sand's dialogue lines to see what model it uses.
You'll find that male elves use P_HHM as the base model for their FXE files. Others I have not checked.
If the model you want to use still doesn't exist, tough sh**, learn how to generate an fxa file for it and
tell the rest of us.
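If you would rather not eyeball a game FXE in a hex editor, the check above could be roughed out in Python. This rests on two unverified assumptions: that the model name is stored in the FXE as plain ASCII, and that model names look like P_HHM or P_EEM (one letter, an underscore, two letters, then M or F). The function names are hypothetical, and stray binary data could produce false matches.

```python
import re
from pathlib import Path
from typing import List

# Assumption: names shaped like P_HHM / P_EEM, stored as plain ASCII bytes.
_MODEL_NAME = re.compile(rb"[A-Z]_[A-Z]{2}[MF]")


def find_model_names(data: bytes) -> List[str]:
    """Return the distinct model-like names found in raw FXE bytes."""
    return sorted({m.decode("ascii") for m in _MODEL_NAME.findall(data)})


def model_names_in_file(fxe_path: str) -> List[str]:
    """Read an FXE file from disk and scan it for model-like names."""
    return find_model_names(Path(fxe_path).read_bytes())
```

If a stock file for your character only ever turns up P_HHM, that is a strong hint to pick P_HHM in the combo box.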
Step 6.
Click the Make FXE button.
This will populate the FXE Datablock table with new values and create or overwrite the FXE file which shares
the name of your audio file. Drop this FXE file into your override or your hak along with the audio file and
watch the speaker lip sync in game.
Running in console mode:
Two arguments are needed, the fully pathed file name of the wav/mp3 to be analyzed, and the name of the model
file to use with the generated FXE file. If a txt file matching the name of the audio file is found, it will
use that as the source file for the actual text and attempt to enhance the results just like the dialog mode does.
If no such file exists it will use the non-enhanced results.
Usage Example: FXE_Generator testfile.wav P_HHM
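Console mode makes it possible to process a whole folder of voice files in one go. The Python sketch below builds the same two-argument command as the usage example for every .wav or .mp3 in a folder. It assumes FXE_Generator is on your PATH and makes no assumption about the tool's exit codes, so it only reports them; `build_command` and `generate_all` are hypothetical names.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(audio: Path, model: str, exe: str = "FXE_Generator") -> List[str]:
    """Build the console-mode command: full audio path, then model name."""
    return [exe, str(audio.resolve()), model]


def generate_all(folder: str, model: str) -> None:
    """Run the generator once per audio file in the folder."""
    for audio in sorted(Path(folder).iterdir()):
        if audio.suffix.lower() in {".wav", ".mp3"}:
            result = subprocess.run(build_command(audio, model))
            # Exit-code meaning is undocumented, so just report it.
            print(audio.name, "exit code", result.returncode)
```

For example, `generate_all("voices", "P_HHM")` would process every audio file in a "voices" folder with the P_HHM model. Remember that any matching .fxe files will be overwritten.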
Important tips:
The phonemes estimate percentages are only a guide!!!
The end result in-game is your primary means of evaluating if you have good enough results.
Don't worry too much about what the tool says about matching words.
Still don't like the end results, or want better results?
You will probably need to add new viseme data points to the FXE file or adjust existing ones; that is
what all those values in the FXE Data Block table are. It's a graph with time on the X-axis and morphic weight
on the Y-axis. Version 1.0 does not let the user edit these (yet), but the source
code is included, so if you're not inclined to wait and know how to code then by all means have a crack at it.
Here's a hint though: the only data the game cares about are the local maximums and the local minimums for
each viseme graph. The maximum Y values for "Fave" should not exceed 0.7, "Cage" should not exceed 0.2 and all
others should not exceed 0.85, as specified here:
http://support.oc3ent.com/fogbugz/default.php?W112
If you want more details on how it works behind the scenes then PM me, it's too much to explain here.
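For anyone who does want to have a crack at it, the hint about local maximums, local minimums and weight caps can be sketched in Python as follows. This is an illustration of the idea only, not the tool's source: the (time, weight) tuple layout, the function names, and the choice to always keep the first and last point are all assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (time, weight)

# Caps quoted in the ReadMe; every other viseme is capped at 0.85.
MAX_WEIGHT = {"Fave": 0.7, "Cage": 0.2}
DEFAULT_MAX = 0.85


def local_extrema(points: List[Point]) -> List[Point]:
    """Keep the endpoints plus every point where the curve changes direction."""
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    for prev, cur, nxt in zip(points, points[1:], points[2:]):
        is_peak = cur[1] > prev[1] and cur[1] >= nxt[1]
        is_trough = cur[1] < prev[1] and cur[1] <= nxt[1]
        if is_peak or is_trough:
            kept.append(cur)
    kept.append(points[-1])
    return kept


def clamp_curve(viseme: str, points: List[Point]) -> List[Point]:
    """Limit every weight to the cap for the given viseme."""
    cap = MAX_WEIGHT.get(viseme, DEFAULT_MAX)
    return [(t, min(w, cap)) for t, w in points]
```

Reducing a curve to its extrema first and clamping afterwards keeps the data set small while respecting the caps.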
--------