Ticket #468 (new defect)
VoxForge pron dict not using CMU 0.6; using xVoice unstressed dict...
Reported by: | kmaclean | Owned by: | kmaclean |
---|---|---|---|
Priority: | major | Milestone: | Acoustic Model 0.1.2 |
Component: | Acoustic Model | Version: | Acoustic Model 0.1.1 |
Keywords: | Cc: |
Description (last modified by kmaclean)
from this post:
For Festival CMU generated phonemes I do not find documentation for DX and IX. Are these needed for efficient phonetic alignment?
Now that is an interesting question...
I have not looked at this in a while, but it seems that when I first created the VoxForge dictionary, I could not use CMU dict v0.6 (contrary to what I state in this FAQ entry: "What is the VoxForge phoneset?") and instead used the CMU unstressed dictionary from the xvoice site (see the Notes file).
All this time, I have assumed that CMU dict v0.6 and the CMU unstressed dictionary used the same phone set, but that does not seem to be the case, e.g.:
xvoice cmu-unstressed:
ABBREVIATED AX B R IY V IY EY DX AX D
VoxForge pronunciation dictionary:
ABBREVIATED [ABBREVIATED] ax b r iy v iy ey dx ax d
CMU dict v0.6 (stressed; remove the numbers for unstressed):
ABBREVIATED AH0 B R IY1 V IY0 EY2 T AH0 D
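To make the mismatch concrete, here is a minimal sketch that strips the stress digits from the CMU dict v0.6 entry above and compares it, phone by phone, with the xvoice entry. The helper names (`parse_entry`, `strip_stress`) are made up for illustration and are not part of any VoxForge or CMU tooling.

```python
import re

def parse_entry(line):
    """Split a 'WORD PH1 PH2 ...' dictionary line into (word, phones)."""
    word, *phones = line.split()
    return word, phones

def strip_stress(phones):
    """Drop the trailing stress digit (0, 1 or 2) from each phone."""
    return [re.sub(r"\d$", "", p) for p in phones]

# The two entries quoted in this ticket.
_, cmu_phones = parse_entry("ABBREVIATED AH0 B R IY1 V IY0 EY2 T AH0 D")
_, xvoice_phones = parse_entry("ABBREVIATED AX B R IY V IY EY DX AX D")

cmu_unstressed = strip_stress(cmu_phones)

# Positions where the two dictionaries disagree: (index, cmu, xvoice).
diffs = [
    (i, c, x)
    for i, (c, x) in enumerate(zip(cmu_unstressed, xvoice_phones))
    if c != x
]

print(cmu_unstressed)
print(diffs)
```

For this one word, the only differences are AH versus AX and T versus DX, so removing the stress numbers alone does not turn the CMU entry into the xvoice one.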
Speech recognition engines don't really care what identifier you use for a particular phoneme, as long as you are consistent. The CMU pronunciation dictionary is the de facto standard open-source pronunciation dictionary (for English), so I would recommend that you use the most current version (CMU dict v0.7).
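One way to check that kind of consistency is to scan a dictionary for phones that fall outside the phone set you intend to use. The sketch below assumes the 39-phone set documented for the CMU dictionary (without stress markers); `unknown_phones` is a hypothetical helper, and the single entry is the xvoice line from this ticket.

```python
# The 39 CMU dict phones, without stress markers (assumed target phone set).
CMU_PHONES = set(
    "AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M N NG "
    "OW OY P R S SH T TH UH UW V W Y Z ZH".split()
)

def unknown_phones(entries, phone_set):
    """Return the sorted phones used in entries that are not in phone_set."""
    unknown = set()
    for line in entries:
        _, *phones = line.split()  # first token is the word itself
        unknown.update(p.upper() for p in phones if p.upper() not in phone_set)
    return sorted(unknown)

entries = ["ABBREVIATED AX B R IY V IY EY DX AX D"]
print(unknown_phones(entries, CMU_PHONES))
```

Run against the xvoice entry, this flags AX and DX, the identifiers that have no counterpart in the CMU set.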