Ticket #481 (new enhancement)

Opened 13 years ago

Last modified 13 years ago

add MFCC_0_D_A model type

Reported by: kmaclean
Owned by: kmaclean
Priority: major
Milestone: Acoustic Model 0.1.2
Component: Acoustic Model
Version: Acoustic Model 0.1.1
Keywords:
Cc:


from this post


We (simon) have basically been using the voxforge script (ported to C++ code) to create the speech model from the users' input files.

Yesterday, we got a suggestion to switch the model type to MFCC_0_D_A (which means that the model uses 39 features instead of just 25). According to someone at the SPSC Graz this would especially improve the model when the training data uses more than one microphone.
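The two feature counts follow from how HTK builds a parameter vector from the target-kind qualifiers. The sketch below is only an illustration: it assumes 12 cepstral coefficients per frame and assumes that the 25-feature setup is MFCC_0_D_N_Z, which the post does not state. htk_vector_size is a hypothetical helper, not part of HTK.

```python
def htk_vector_size(target_kind: str, num_ceps: int = 12) -> int:
    """Return the feature-vector length implied by an HTK target kind.

    Illustrative helper (not an HTK API). Handles the qualifiers
    _0, _E, _D, _A, _T and _N; other qualifiers such as _Z do not
    change the vector length.
    """
    qualifiers = set(target_kind.split("_")[1:])

    static = num_ceps
    if "0" in qualifiers:  # append the zeroth cepstral coefficient C0
        static += 1
    if "E" in qualifiers:  # append log energy
        static += 1

    # One block of static coefficients, plus one block per differential
    # order requested (_D delta, _A acceleration, _T third differential).
    blocks = 1 + sum(q in qualifiers for q in ("D", "A", "T"))
    size = static * blocks

    if "N" in qualifiers:  # absolute energy suppressed, its deltas are kept
        size -= 1
    return size


if __name__ == "__main__":
    # MFCC_0_D_N_Z is an assumed name for the 25-feature setup.
    for kind in ("MFCC_0_D_A", "MFCC_0_D_N_Z"):
        print(kind, htk_vector_size(kind))
```

With these assumptions, MFCC_0_D_A gives (12 + 1) * 3 = 39 and MFCC_0_D_N_Z gives (12 + 1) * 2 - 1 = 25.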

Moreover, he suggested using HHEd's MU command to add more Gaussian mixture components to the final model. I implemented both suggestions in simon, and the improvement in recognition rate was drastic (in my tests).

Maybe you could try to change the model creation procedure for the voxforge model and see if this improves recognition rates there as well?

Steps to take if you want to try it out:

Change your model type to MFCC_0_D_A, adjust your prototype to use 39 features and add a few new steps after hmm15:

Use HHEd like this:

HHEd -A -D -T 1 -H hmm15/macros -H hmm15/hmmdefs -M hmm16 gmm1.hed tiedlist

Where gmm1.hed contains:

MU 4 {*.state[2-4].mix}

Re-estimate hmm16 twice, then repeat the HHEd and re-estimation steps with a larger MU value (technically for as long as you see recognition rates improve).
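The steps above can be sketched as a dry run that only builds file contents and command lines, without calling HTK. Everything not in the post is an assumption: the 5-state prototype layout, the transition values, the file names config, wintri.mlf and train.scp, the HERest argument list, the numbering of the re-estimated directories (hmm17, hmm18, ...) and any mixture target other than 4. make_proto and plan_mixture_rounds are hypothetical helper names.

```python
def make_proto(vec_size: int = 39, target_kind: str = "MFCC_0_D_A") -> str:
    """Build a 5-state (3 emitting) single-Gaussian HTK prototype as text."""
    zeros = " ".join(["0.0"] * vec_size)
    ones = " ".join(["1.0"] * vec_size)
    lines = [
        f"~o <VecSize> {vec_size} <{target_kind}>",
        '~h "proto"',
        "<BeginHMM>",
        "<NumStates> 5",
    ]
    for state in (2, 3, 4):  # emitting states only
        lines += [
            f"<State> {state}",
            f"<Mean> {vec_size}", zeros,
            f"<Variance> {vec_size}", ones,
        ]
    lines += [
        "<TransP> 5",  # assumed left-to-right topology
        "0.0 1.0 0.0 0.0 0.0",
        "0.0 0.6 0.4 0.0 0.0",
        "0.0 0.0 0.6 0.4 0.0",
        "0.0 0.0 0.0 0.7 0.3",
        "0.0 0.0 0.0 0.0 0.0",
        "<EndHMM>",
    ]
    return "\n".join(lines) + "\n"


def plan_mixture_rounds(targets=(4,), start=15, passes=2):
    """Return (hed_files, commands) for successive MU rounds.

    targets: mixture count requested in each round; only 4 is from the post.
    start:   index of the model directory to split first (hmm15).
    passes:  HERest passes after each split (the post says twice).
    """
    hed_files, commands = {}, []
    current = start
    for round_no, mixes in enumerate(targets, start=1):
        hed = f"gmm{round_no}.hed"
        hed_files[hed] = f"MU {mixes} {{*.state[2-4].mix}}\n"
        commands.append(
            ["HHEd", "-A", "-D", "-T", "1",
             "-H", f"hmm{current}/macros", "-H", f"hmm{current}/hmmdefs",
             "-M", f"hmm{current + 1}", hed, "tiedlist"])
        current += 1
        for _ in range(passes):
            # Assumed HERest invocation; adapt it to your own training setup.
            commands.append(
                ["HERest", "-A", "-D", "-T", "1", "-C", "config",
                 "-I", "wintri.mlf", "-S", "train.scp",
                 "-H", f"hmm{current}/macros", "-H", f"hmm{current}/hmmdefs",
                 "-M", f"hmm{current + 1}", "tiedlist"])
            current += 1
    return hed_files, commands


if __name__ == "__main__":
    print(make_proto().splitlines()[0])
    heds, cmds = plan_mixture_rounds(targets=(4, 6))  # 6 is an assumed target
    for name, body in heds.items():
        print(f"{name}: {body.strip()}")
    for cmd in cmds:
        print(" ".join(cmd))
```

The first planned command is the HHEd call quoted above. The sketch only prints the plan; the output directories generally need to exist before the real tools are run, and whether a further round is worthwhile is something only a recognition test on your own data can tell.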

You can find the simon implementation here:





Change History

comment:1 Changed 13 years ago by kmaclean

nsh said:

You know about HLDA and HMMIRest, don't you?
