Method and apparatus for automatic speech segmentation into phoneme-like units for use in speech processing applications, and based on segmentation into broad phonetic classes, sequence-constrained vector quantization and hidden-Markov-models
First Claim
1. A method for automatically segmenting speech for use in speech processing applications, said method comprising the steps of:
classifying and segmenting utterances from a speech data base into three broad phonetic classes (BPC) voiced, unvoiced, and silence, for attaining preliminary segmentation positions;
using preliminary segmentation positions as anchor points for further segmentation into phoneme-like units by sequence-constrained vector quantization (SCVQ) in an SCVQ-step;
initializing phoneme Hidden-Markov-Models with the segments provided by the SCVQ-step, and further tuning of the HMM parameters by Baum-Welch estimation;
finally, using the fully trained HMMs to perform Viterbi alignment of the utterances with respect to their phonetic transcription and in this way obtaining the final segmentation points.
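The first step of the claim, classification into voiced, unvoiced, and silence, can be illustrated with a minimal sketch. The claim does not state which features or thresholds are used; the short-time energy and zero-crossing-rate features, the threshold values, and the function names `frame_features`, `classify_bpc`, and `boundaries` below are all assumptions made for illustration only.

```python
import math

def frame_features(samples, frame_len):
    """Split samples into non-overlapping frames; return (energy, zcr) per frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between neighbouring samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

def classify_bpc(feats, silence_energy=1e-4, zcr_threshold=0.25):
    """Label each frame; both thresholds are illustrative assumptions."""
    labels = []
    for energy, zcr in feats:
        if energy < silence_energy:
            labels.append("silence")
        elif zcr > zcr_threshold:
            labels.append("unvoiced")
        else:
            labels.append("voiced")
    return labels

def boundaries(labels):
    """Frame indices where the class label changes (preliminary anchor points)."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]

# Synthetic signal: silence, then a low-frequency tone, then a rapidly alternating signal.
signal = ([0.0] * 100
          + [0.5 * math.sin(2 * math.pi * k / 50) for k in range(100)]
          + [0.1 if k % 2 == 0 else -0.1 for k in range(100)])
labels = classify_bpc(frame_features(signal, 50))
anchors = boundaries(labels)
```

In this sketch the label changes play the role of the preliminary segmentation positions that the later steps use as anchor points.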
Abstract
For machine segmenting of speech, utterances from a database of known spoken words are first classified and segmented into three broad phonetic classes (BPC): voiced, unvoiced, and silence. Next, using the preliminary segmentation positions as anchor points, sequence-constrained vector quantization is used for further segmentation into phoneme-like units. Finally, exact tuning to the segmented phonemes is done through Hidden-Markov Modelling, and after training a diphone set is composed for further usage.
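The sequence-constrained vector quantization step can be sketched as a dynamic program. The formulation below is an assumption, not taken from the specification: it splits a run of feature frames into a known number of consecutive segments so that the summed squared distance of each frame to its segment centroid is minimal. The names `segment_cost` and `scvq_segment` are hypothetical. In practice such a routine would presumably be run on the frames between two consecutive BPC anchor points, with the segment count taken from the phonetic transcription.

```python
def segment_cost(frames, start, end):
    """Sum of squared distances of frames[start:end] to their centroid."""
    n = end - start
    dim = len(frames[0])
    centroid = [sum(frames[t][d] for t in range(start, end)) / n for d in range(dim)]
    return sum((frames[t][d] - centroid[d]) ** 2
               for t in range(start, end) for d in range(dim))

def scvq_segment(frames, num_segments):
    """Return the start indices of segments 2..K that minimise total distortion.

    Requires len(frames) >= num_segments.
    """
    n = len(frames)
    inf = float("inf")
    # best[k][t]: minimal distortion when frames[0:t] are covered by k segments.
    best = [[inf] * (n + 1) for _ in range(num_segments + 1)]
    back = [[0] * (n + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, num_segments + 1):
        for t in range(k, n + 1):
            for s in range(k - 1, t):  # s is the start of the k-th segment
                cost = best[k - 1][s] + segment_cost(frames, s, t)
                if cost < best[k][t]:
                    best[k][t] = cost
                    back[k][t] = s
    # Trace back the segment starts from the last segment to the first.
    starts = []
    t = n
    for k in range(num_segments, 0, -1):
        t = back[k][t]
        starts.append(t)
    starts.reverse()
    return starts[1:]  # drop the leading 0, which is not a boundary

# Toy one-dimensional features with three clearly distinct stretches.
frames = [[0.0]] * 3 + [[5.0]] * 4 + [[1.0]] * 2
bounds = scvq_segment(frames, 3)
```

The sequence constraint is what distinguishes this from ordinary vector quantization: frames may only be assigned to codebook entries in transcription order, so each centroid covers one contiguous stretch of the utterance.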
194 Citations
6 Claims
4. An apparatus for segmenting speech for use in speech processing applications, said apparatus comprising:
BPC segmenting means fed by a speech data base for classifying and segmenting utterances received into three broad phonetic classes (BPC) voiced, unvoiced, and silence;
SCVQ segmenting means fed by said BPC segmenting means for by using preliminary segmentation positions as anchor points executing further segmentation into phoneme-like units by sequence-constrained vector quantization (SCVQ);
phone Hidden-Markov-Means (HMM) fed by said SCVQ segmenting means for initialization of phoneme HMM and further tuning of HMM parameters;
final segmentation means controlled by said HMM.
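The final segmentation stage, Viterbi alignment of an utterance against its phonetic transcription, can be sketched as follows. This is a simplified illustration under stated assumptions: each phoneme is reduced to a single left-to-right state, transition probabilities are omitted, and the per-frame log-likelihoods are supplied directly instead of being computed from Baum-Welch-trained HMMs. The function name `forced_align` and the toy scores are hypothetical.

```python
def forced_align(log_likes):
    """Align frames to a fixed phoneme sequence.

    log_likes[t][j] is the log-likelihood of frame t under the j-th phoneme
    of the transcription. Returns the first frame index of each phoneme on
    the best left-to-right path. Requires at least as many frames as phonemes.
    """
    n, m = len(log_likes), len(log_likes[0])
    neg_inf = float("-inf")
    score = [[neg_inf] * m for _ in range(n)]
    came_from_prev = [[False] * m for _ in range(n)]
    score[0][0] = log_likes[0][0]  # the path must start in the first phoneme
    for t in range(1, n):
        for j in range(min(t + 1, m)):
            stay = score[t - 1][j]
            move = score[t - 1][j - 1] if j > 0 else neg_inf
            if move > stay:
                score[t][j] = move + log_likes[t][j]
                came_from_prev[t][j] = True
            else:
                score[t][j] = stay + log_likes[t][j]
    # Trace back from the last phoneme at the last frame.
    starts = [0] * m
    j = m - 1
    for t in range(n - 1, 0, -1):
        if came_from_prev[t][j]:
            starts[j] = t
            j -= 1
    return starts

# Toy scores: frames 0-1 favour phoneme 0, frames 2-4 phoneme 1, frame 5 phoneme 2.
hi, lo = -0.1, -5.0
log_likes = [
    [hi, lo, lo],
    [hi, lo, lo],
    [lo, hi, lo],
    [lo, hi, lo],
    [lo, hi, lo],
    [lo, lo, hi],
]
segment_starts = forced_align(log_likes)
```

The returned start frames correspond to the final segmentation points; a fuller version would use multi-state phoneme models and their trained transition probabilities.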
Specification