wiki:AudioSegmentation

Resources to aid in the segmentation of speech:

Links:

  • an approach:
Date:	 Wed, 6 Dec 2006 21:27:29 +0800
From:	"Xie Zhiqing" <kramxxx@gmail.com>
To:	htk-users@eng.cam.ac.uk
Subject:	[HTK-Users] Lightly supervised acoustic training

Hi, my name is Mark and I am a student from Singapore.  I am currently
working on a speech recognition project, specifically on the training
portion of the system.  As I understand it, I am supposed to train the
system on a small amount of manually transcribed speech (.wav and .lab)
and then use it to transcribe a larger amount of untranscribed speech
(.wav only).  If the confidence level of a transcription is high
enough, it is added to the training data, and the process is repeated
iteratively until all the untranscribed speech has been added to the
training data.  Is this method correct?
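
The loop described above can be sketched as follows. This is only an illustration, not an HTK recipe: `self_train`, `train`, `decode`, and `threshold` are hypothetical names, and the training and decoding steps are passed in as plain callables, so nothing here calls real HTK tools. The sketch also assumes the loop should stop when a round accepts nothing new, since some utterances may never reach the threshold.

```python
from typing import Callable, Dict, List, Tuple

Labeled = Dict[str, str]  # utterance id -> transcription


def self_train(
    labeled: Labeled,
    unlabeled: List[str],
    train: Callable[[Labeled], object],
    decode: Callable[[object, str], Tuple[str, float]],
    threshold: float,
    max_rounds: int = 10,
) -> Tuple[Labeled, List[str]]:
    """Grow the training set with confidently decoded utterances.

    `train` and `decode` are hypothetical stand-ins for the real
    training and recognition steps; `decode` must return a
    (hypothesis, confidence) pair for one utterance.
    """
    labeled = dict(labeled)      # work on copies, leave inputs untouched
    remaining = list(unlabeled)
    for _ in range(max_rounds):
        if not remaining:
            break
        model = train(labeled)   # retrain on everything accepted so far
        still_unlabeled = []
        for utt in remaining:
            hyp, conf = decode(model, utt)
            if conf >= threshold:
                labeled[utt] = hyp           # accept automatic transcript
            else:
                still_unlabeled.append(utt)  # retry in a later round
        if len(still_unlabeled) == len(remaining):
            break  # nothing was accepted this round, so stop
        remaining = still_unlabeled
    return labeled, remaining
```

The function returns both the enlarged training set and the utterances that were never accepted, so the latter can be inspected or transcribed by hand.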

From what I gather from the HTK Book, HVite outputs transcriptions for
the raw speech in this format: start time, end time, phoneme, and total
log probability.  Is there a connection between the confidence measure
and the total log probability?  I am currently using Matlab to
implement HTK.
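
As a rough illustration of the output format described above, the sketch below parses lines of the form "start end phoneme log-probability" and divides each score by the segment length in frames. HTK label times are in units of 100 ns. The total log probability grows in magnitude with segment duration, so a per-frame average is one simple heuristic for comparing segments; it is not a calibrated confidence measure. The names `Segment`, `parse_label_lines`, and `per_frame_logprob` are hypothetical, the 10 ms frame period is an assumption that must match the front-end configuration, and the sample values in the usage note are made up.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    start: int      # start time in 100 ns units
    end: int        # end time in 100 ns units
    label: str      # phoneme (or word) label
    logprob: float  # total log probability of the segment


def parse_label_lines(lines: List[str]) -> List[Segment]:
    """Keep only lines that look like 'start end label score'."""
    segments = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skips MLF headers, file names and '.' terminators
        try:
            start, end = int(fields[0]), int(fields[1])
            logprob = float(fields[3])
        except ValueError:
            continue  # not a timed, scored label line
        segments.append(Segment(start, end, fields[2], logprob))
    return segments


def per_frame_logprob(seg: Segment, frame_period: int = 100_000) -> float:
    """Average log probability per frame.

    frame_period is in 100 ns units; 100_000 (10 ms) is an assumed
    default, not something read from the recogniser configuration.
    """
    n_frames = (seg.end - seg.start) / frame_period
    if n_frames <= 0:
        raise ValueError("segment has no duration")
    return seg.logprob / n_frames
```

For example, with the made-up line "0 1000000 sil -500.0" the segment spans 10 frames at a 10 ms frame period, so `per_frame_logprob` gives -50.0.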

Are there any good sites that explain the process in layman's terms?