Ticket #338 (new enhancement)
HDMan triphone counts
Reported by: | kmaclean | Owned by: | kmaclean |
---|---|---|---|
Priority: | major | Milestone: | WebSite 0.2.1 |
Component: | Audio | Version: | Website 0.2 |
Keywords: | | Cc: | |
Description (last modified by kmaclean)
see this thread re: Ralf TTS voice:
nsh said:
let me repeat that it looks like a deeply wrong idea to get all triphones, current
senone tying technique allows you to get effective recognition without good
coverage. And the biggest problem is that rare triphones give you zero
improvement in the accuracy.
Thanks for this clarification. My assumption that a good acoustic model (for speech recognition) needs to be trained from recordings of words containing all triphones was wrong. The key, then, is to get recordings of words that contain the most common triphones, and to use "tied-state triphone" models (which I think is HTK's terminology for the "senone tying" technique that Sphinx uses...) to cover the rare triphones.
I'm wondering if HTK's HDMan command can provide triphone counts (in a similar way that it provides phoneme counts), so we can then create prompts that give us the "most bang for our buck". I'm thinking we would run it against a large database to get these triphone counts (it could even be proprietary, since we are only looking for the counts), and then generate a list of words (from this same database) that cover these common triphones, so Ralf (and others creating prompts for new languages) could use these words in their prompts.
... I'll put it on my todo list :)
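In case HDMan turns out not to report triphone counts directly, here is a rough sketch of how the counting and word selection described above might be done by hand. It is not HDMan and does not use any HTK API. Everything in it is an assumption for illustration: the toy `lexicon` (word mapped to a phone list), the helper names `word_triphones`, `count_triphones` and `pick_words`, the use of word-internal triphones only, and the greedy selection rule. The counts here are taken over dictionary entries; weighting them by how often each word occurs in a corpus would be a natural next step.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> phone sequence (not taken from any real dictionary).
lexicon = {
    "cat": ["k", "ae", "t"],
    "cats": ["k", "ae", "t", "s"],
    "bat": ["b", "ae", "t"],
    "scat": ["s", "k", "ae", "t"],
}


def word_triphones(phones):
    """Word-internal triphones of one pronunciation, written HTK-style as l-c+r."""
    return [
        f"{phones[i - 1]}-{phones[i]}+{phones[i + 1]}"
        for i in range(1, len(phones) - 1)
    ]


def count_triphones(lexicon):
    """Count how many times each triphone occurs across all lexicon entries."""
    counts = Counter()
    for phones in lexicon.values():
        counts.update(word_triphones(phones))
    return counts


def pick_words(lexicon, counts, top_n, max_words):
    """Greedily pick words that cover the top_n most common triphones.

    Each round takes the word whose still-uncovered target triphones have the
    largest total count; ties go to the alphabetically first word.
    """
    targets = {tri for tri, _ in counts.most_common(top_n)}
    chosen = []
    while targets and len(chosen) < max_words:
        best_word, best_gain = None, 0
        for word in sorted(lexicon):
            if word in chosen:
                continue
            covered = set(word_triphones(lexicon[word])) & targets
            gain = sum(counts[tri] for tri in covered)
            if gain > best_gain:
                best_word, best_gain = word, gain
        if best_word is None:  # no remaining word covers any target triphone
            break
        chosen.append(best_word)
        targets -= set(word_triphones(lexicon[best_word]))
    return chosen


counts = count_triphones(lexicon)
chosen = pick_words(lexicon, counts, top_n=4, max_words=2)
print(counts.most_common(1))  # [('k-ae+t', 3)]
print(chosen)                 # ['cats', 'bat']
```

With a real pronunciation dictionary the same two steps would apply: build the counts once, then raise `top_n` and `max_words` until the chosen word list is about the size wanted for a prompt set.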