Method for speech processing involving whole-utterance modeling
First Claim
1. A method of speaker verification by matching a claimed speaker with a known speaker, including the steps of processing spoken input enrollment speech data and test speech data, generating respective match scores therefrom, and determining whether the test speech data corresponds with the enrollment speech data, the method comprising:
forming enrollment speech data as a first plurality of pair-phrases using a set of words, the set of words consisting of a predetermined number of words, wherein the set of words are words between one to nine and at least one bridging word “ti”;
forming test speech data as a second plurality of pair-phrases from the same set of words, the second plurality of pair-phrases different from the first plurality of pair-phrases;
converting, by a Baum-Welch algorithm, the first plurality of pair-phrases into a first set of adapted HMM word models;
converting, by the Baum-Welch algorithm, the second plurality of pair-phrases into a second set of adapted HMM word models;
ordering the first set of adapted HMM word models into a first sequence;
ordering the second set of adapted HMM word models into a second sequence, the second sequence and the first sequence having the same order and the same predetermined number of words; and
comparing the first and second sets of adapted HMM word models using a weighted Euclidean distance.
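The ordering and comparison steps recited above can be illustrated with a minimal sketch. It assumes each adapted HMM word model has already been reduced to a flat parameter vector (for example, its state means); the claim does not say which parameters are compared or how the weights are chosen. The reduced word set, the toy vectors, the weights, and the helper names `stack_models` and `weighted_euclidean` are all hypothetical.

```python
import math

# Hypothetical reduced word set, for illustration only; the claim uses a
# predetermined set of number words plus the bridging word "ti".
WORD_ORDER = ["one", "two", "ti"]


def stack_models(word_models, order):
    """Concatenate per-word parameter vectors in one fixed word order."""
    stacked = []
    for word in order:
        stacked.extend(word_models[word])
    return stacked


def weighted_euclidean(x, y, weights):
    """Return sqrt(sum_i w_i * (x_i - y_i)^2) for equal-length sequences."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


# Toy stand-ins for adapted HMM word models: one 2-dimensional vector per word.
enroll_models = {"one": [1.0, 2.0], "two": [0.0, 0.0], "ti": [3.0, 3.0]}
test_models = {"ti": [3.0, 4.0], "one": [1.0, 2.0], "two": [2.0, 0.0]}

# Hypothetical per-dimension weights (assumed here, not taken from the claim).
weights = [1.0, 1.0, 0.25, 1.0, 1.0, 3.0]

# Both utterances are stacked in the same order, so the vectors have equal
# length and align word by word even though the spoken phrases differed.
enroll_vec = stack_models(enroll_models, WORD_ORDER)
test_vec = stack_models(test_models, WORD_ORDER)
distance = weighted_euclidean(enroll_vec, test_vec, weights)
print(distance)  # 2.0
```

Stacking in a shared word order is what lets a single distance be computed between enrollment and test data built from different pair-phrases of the same word set.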
Abstract
A speech verification process compares enrollment and test speech data. An improved method of comparing the data is disclosed, in which segmented frames of speech are analyzed jointly rather than independently. The enrollment and test speech are both subjected to a feature extraction process to derive fixed-length feature vectors, and the feature vectors are compared using a linear discriminant analysis, with no dependence on the order of the words spoken or the speaking rate. The discriminant analysis is made possible, despite the relatively high dimensionality of the feature vectors, by a mathematical procedure for finding an eigenvector to simultaneously diagonalize the between-speaker and between-channel covariances of the enrollment and test data.
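Simultaneous diagonalization of two covariance matrices can be sketched as follows. This is the textbook whitening approach (a Cholesky factor of one matrix followed by a symmetric eigendecomposition), not necessarily the procedure the patent discloses, and it ignores the high-dimensionality issue the abstract mentions. The function name, the random toy matrices, and the labels `between_speaker` and `between_channel` are assumptions for illustration, using NumPy.

```python
import numpy as np


def simultaneous_diagonalize(a, b):
    """Return (w, lam) with w.T @ b @ w = I and w.T @ a @ w = diag(lam).

    a must be symmetric; b must be symmetric positive definite.
    """
    chol = np.linalg.cholesky(b)            # b = chol @ chol.T
    chol_inv = np.linalg.inv(chol)
    whitened = chol_inv @ a @ chol_inv.T    # a expressed in b-whitened coordinates
    whitened = (whitened + whitened.T) / 2  # remove round-off asymmetry
    lam, u = np.linalg.eigh(whitened)       # orthonormal eigenvectors in columns
    w = chol_inv.T @ u
    return w, lam


# Toy covariances built from random data (illustrative only).
rng = np.random.default_rng(0)
x = rng.standard_normal((50, 4))
y = rng.standard_normal((50, 4))
between_speaker = x.T @ x / 50
between_channel = y.T @ y / 50 + 0.1 * np.eye(4)  # ridge keeps it positive definite

w, lam = simultaneous_diagonalize(between_speaker, between_channel)
print(np.allclose(w.T @ between_channel @ w, np.eye(4)))
print(np.allclose(w.T @ between_speaker @ w, np.diag(lam)))
```

In the transformed coordinates the second covariance becomes the identity and the first becomes diagonal, so each direction can be ranked by its eigenvalue, which is the quantity a discriminant analysis typically works with.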
28 Citations
7 Claims
Specification