Ticket #1 (closed defect: fixed)
Julian 3.5.1 problems with VoxForge Tutorial Acoustic Models
Reported by: | kmaclean | Owned by: | kmaclean |
---|---|---|---|
Priority: | minor | Milestone: | 0.1-beta |
Component: | Speech Rec Engine | Version: | 0.1-alpha |
Keywords: | Speech Recognition Engin Julius 3.5.1 | Cc: |
Description (last modified by kmaclean)
I ran the same acoustic model under Julian 3.5 and 3.5.1. Everything seems to work OK with 3.5, but I get no recognition at all with 3.5.1. Looking at the console output, Julian 3.5.1 doesn't seem to be picking up the end silence tag </s>, and I am not sure why.
Here is part of the console output for recognition of the phrase "call Steve" under 3.5.1:
    $ /usr/local/julius/julius-3.5.1-linuxbin/bin/julian-3.5.1-std -input mic -C julian.jconf
    ...
    ### read waveform input
    pass1_best: <s> DIAL
    pass1_best_wordseq: 0 3
    pass1_best_phonemeseq: sil | d ay l
    pass1_best_score: -102358.328125
    length: 593 frames (1.97 sec.)
    ### Recognition: 2nd pass (RL heuristic best-first with DFA)
    samplenum=593
    stack empty, search terminate now
    0 sentences have found
    got no candidates, output 1st pass result as a final result
    sentence1: <s> DIAL
    wseq1: 0 3
    phseq1: sil | d ay l
    cmscore1: 0.000 0.000
    score1: -102358.328125
    0 generated, 0 pushed, 0 nodes popped in 593
    <<< please speak >>>
Here is the console output for the same utterance and julian configuration file under 3.5:
    $ /usr/local/julius/julius-3.5-linuxbin/bin/julian-3.5-std -input mic -C julian.jconf
    ...
    ### read waveform input
    pass1_best: <s> CALL STEVE </s>
    pass1_best_wordseq: 0 2 4 1
    pass1_best_phonemeseq: sil | k ao l | s t iy v | sil
    pass1_best_score: -14968.178711
    length: 542 frames (1.80 sec.)
    ### Recognition: 2nd pass (RL heuristic best-first with DFA)
    samplenum=542
    stack empty, search terminate now
    2 sentences have found
    sentence1: <s> PHONE STEVE </s>
    wseq1: 0 2 4 1
    phseq1: sil | f ow n | s t iy v | sil
    cmscore1: 1.000 0.000 1.000 1.000
    score1: -15512.497070
    14 generated, 14 pushed, 16 nodes popped in 542
    <<< please speak >>>
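To compare the two versions over several utterances, it may help to check captured logs for the end silence tag automatically. The following is only a rough sketch, assuming the console output has been saved to a text file with one field per line, as Julian prints it. The function names `first_pass_words` and `has_end_silence` are hypothetical helpers written for this ticket, not part of Julius or Julian.

```python
import re


def first_pass_words(log_text):
    """Return one word list per 'pass1_best:' line found in a Julian log.

    Lines such as 'pass1_best_wordseq:' are deliberately not matched,
    because the pattern requires the colon right after 'pass1_best'.
    """
    results = []
    for line in log_text.splitlines():
        match = re.match(r"\s*pass1_best:\s*(.*)", line)
        if match:
            results.append(match.group(1).split())
    return results


def has_end_silence(words, end_tag="</s>"):
    """True if the hypothesis is non-empty and ends with the end silence tag."""
    return bool(words) and words[-1] == end_tag
```

Applied to the two logs in this ticket, the 3.5.1 hypothesis (`<s> DIAL`) fails the check while the 3.5 hypothesis (`<s> CALL STEVE </s>`) passes it.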
I am using the precompiled Julius/Julian binaries on Fedora Core 4 (64-bit) on an AMD64 PC.
Solution:
- use Julius 3.5 for Acoustic Model creation
- apply patch to Julius 3.5.1
- wait for Julius 3.5.2
Change History
Reply from Julius support:
We have now found a small bug that causes wrong feature extraction when using microphone input with the 0'th cepstral parameter.
The attached file is a patch for Julius-3.5.1 to fix the bug. This patch is now applied to the current development source on CVS, and will be released as part of 3.5.2 in the near future.
LEE Akinobu