Ticket #40 (new defect)

Opened 16 years ago

Last modified 15 years ago

Using HTK single-pass retraining to create other types of mfc files

Reported by: kmaclean Owned by: kmaclean
Priority: minor Milestone: Acoustic Model 1.0
Component: Acoustic Model Version: 0.1-alpha
Keywords: Cc:


Single-Pass Retraining

In addition to re-estimating the parameters of an HMM set, HERest also provides a mechanism for mapping a set of models trained using one parameterisation into another set based on a different parameterisation. This facility allows the front-end of an HMM-based recogniser to be modified without having to rebuild the models from scratch.

This facility is known as single-pass retraining. Given one set of well-trained models, a new set matching a different training data parameterisation can be generated in a single re-estimation pass. This is done by computing the forward and backward probabilities using the original models together with the original training data, but then switching to the new training data to compute the parameter estimates for the new set of models.
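
The estimation half of this idea can be illustrated with a minimal sketch. It is not HTK code: the function name retrain_gaussians is hypothetical, features are simplified to one scalar per frame, and the state occupation probabilities are assumed to have already been computed by a forward-backward pass with the original models on the original data. The sketch only shows that the occupation weights come from the old parameterisation while the statistics are accumulated from the new one.

```python
def retrain_gaussians(occupancy, new_features):
    """Estimate per-state Gaussian means and variances from new features.

    occupancy[t][j] is the probability of occupying state j at frame t,
    assumed to come from the original models and original features.
    new_features[t] is the (scalar, for simplicity) observation at frame t
    in the new parameterisation. Both sequences must be frame-aligned.
    """
    if len(occupancy) != len(new_features):
        raise ValueError("occupancy and new_features must have the same length")

    num_states = len(occupancy[0])
    means, variances = [], []
    for j in range(num_states):
        # Total occupation count of state j over all frames.
        total = sum(frame[j] for frame in occupancy)
        if total == 0:
            raise ValueError(f"state {j} has no occupation mass")
        # Occupation-weighted mean of the NEW features.
        mean = sum(f[j] * y for f, y in zip(occupancy, new_features)) / total
        # Occupation-weighted variance of the NEW features around that mean.
        var = sum(f[j] * (y - mean) ** 2
                  for f, y in zip(occupancy, new_features)) / total
        means.append(mean)
        variances.append(var)
    return means, variances
```

Because the weights and the observations are combined frame by frame, the two versions of each training file need the same number of frames for this to make sense.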

Single-pass retraining is enabled in HERest by setting the -r switch. This causes the input training files to be read in pairs. The first of each pair is used to compute the forward/backward probabilities and the second is used to estimate the parameters for the new models. Very often, of course, data input to HTK is modified by the HParm module in accordance with parameters set in a configuration file. In single-pass retraining mode, configuration parameters can be prefixed by the pseudo-module names HPARM1 and HPARM2. When reading the first file of each pair, only the HPARM1 parameters are used; when reading the second file of each pair, only the HPARM2 parameters are used.

As an example, suppose that a set of models has been trained on data with the MFCC_E_D parameterisation and a new set of models using Cepstral Mean Normalisation (_Z) is required. These two data parameterisations are specified in a configuration file (config) as two separate instances of the configuration variable TARGETKIND, i.e.

HPARM1: TARGETKIND = MFCC_E_D
HPARM2: TARGETKIND = MFCC_E_D_Z

HERest would then be invoked with the -r option set to enable single-pass retraining. For example,

HERest -r -C config -S trainList -I labs -H dir1/hmacs -M dir2 hmmList

The script file trainList contains a list of data file pairs. For each pair, the first file should match the parameterisation of the original model set and the second file should match that of the required new set. The model parameters are then estimated from the new set of training data, and a new set of models matching this data is written to dir2. Single-pass retraining is a significantly faster route to a new set of models than training a fresh set from scratch.
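
A script file of this kind could be generated with a small helper like the one below. This is only a sketch under stated assumptions: the function name write_paired_script is hypothetical, the two versions of each utterance are assumed to live in two directories under the same file name, and the script file is assumed to be a whitespace-separated list of file names, written here as one pair per line.

```python
from pathlib import Path


def write_paired_script(original_dir, new_dir, script_path, suffix=".mfc"):
    """Write one 'original new' file pair per line and return the pair count.

    original_dir holds files in the original parameterisation, new_dir holds
    the same utterances (same file names) in the new parameterisation.
    """
    original_dir, new_dir = Path(original_dir), Path(new_dir)
    lines = []
    for original in sorted(original_dir.glob(f"*{suffix}")):
        new = new_dir / original.name
        # Every original file needs a counterpart, otherwise the pairing
        # of all following entries in the list would be shifted.
        if not new.is_file():
            raise FileNotFoundError(
                f"no new-parameterisation file for {original.name}")
        lines.append(f"{original} {new}")
    Path(script_path).write_text("\n".join(lines) + "\n")
    return len(lines)
```

Failing on a missing counterpart, rather than skipping it silently, keeps the list strictly paired, which is what the -r mode relies on.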

Change History

comment:1 Changed 15 years ago by kmaclean

  • Milestone set to Unassigned

comment:2 Changed 15 years ago by kmaclean

  • Milestone changed from Unassigned to Acoustic Model 1.0