Subword-based speaker verification with multiple-classifier score fusion weight and threshold adaptation
Abstract
The voice print system of the present invention is a subword-based, text-dependent automatic speaker verification system that supports user-selectable passwords with no constraints on the choice of vocabulary words or the language. Automatic blind speech segmentation allows speech to be segmented into subword units without any linguistic knowledge of the password. Subword modeling is performed using multiple classifiers. The system also uses multiple classifier fusion and data resampling to boost performance. Key word/key phrase spotting is used to optimally locate the password phrase. Numerous adaptation techniques increase the flexibility of the base system, including channel adaptation, fusion adaptation, model adaptation and threshold adaptation.
5 Claims
1. An automatic speaker verification system comprising:
a receiver, the receiver obtaining enrollment speech over an enrollment channel;
a means, connected to the receiver, for developing an estimate of the enrollment channel;
a first storage device, connected to the receiver, for storing the enrollment channel estimate;
a means for extracting predetermined features of the enrollment speech;
a means, operably connected to the extracting means, for segmenting the predetermined features of the enrollment speech, wherein the features are segmented into a plurality of subwords;
a plurality of classifiers, connected to the segmenting means, wherein the classifiers model the plurality of subwords and output classifier scores and
a means, connected to the classifier, for fusing the classifier scores, wherein the fusing means weighs the scores from the classifier models with a fusion constant and combines the weighted scores resulting in a final score for the combined system, and wherein the weighted scores are variable and are dynamically adapted.
2. An automatic speaker verification method, comprising the steps of:
obtaining enrollment speech over an enrollment channel;
storing an estimate of the enrollment channel;
extracting predetermined features of the enrollment speech;
segmenting the enrollment speech, wherein the enrollment speech is segmented into a plurality of subwords;
modeling the plurality of subwords using a plurality of classifier models resulting in an output of classifier scores;
weighing the scores from the classifier models with a fusion constant, wherein the fusion constant is variable and is dynamically adapted; and
combining the weighted scores resulting in a final score for the combined system.
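The weighing and combining steps of claim 2 can be illustrated with a minimal sketch. It assumes a linear weighted sum with normalized weights, which the claim does not specify; the names `fuse_scores` and `is_accepted` are hypothetical and do not come from the patent.

```python
from typing import Sequence


def fuse_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-classifier scores into a single final score.

    Assumption: a linear weighted sum. The claim only says the scores are
    weighted by a fusion constant and then combined.
    """
    if len(scores) != len(weights):
        raise ValueError("one weight per classifier score is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalizing the weights to sum to 1 is an assumption of this sketch.
    return sum((w / total) * s for w, s in zip(weights, scores))


def is_accepted(final_score: float, threshold: float) -> bool:
    """Accept the claimed identity when the fused score reaches the threshold."""
    return final_score >= threshold
```

For example, with two classifiers weighted equally, `fuse_scores([0.8, 0.4], [0.5, 0.5])` gives a final score of about 0.6, which `is_accepted` would pass against a threshold of 0.5.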
3. An automatic speaker verification method, wherein the results of prior verifications are stored, including the steps of:
obtaining test speech from a user seeking authorization or identification;
generating subwords of the test speech;
scoring the subwords against subwords of a known individual using a plurality of modeling classifiers;
storing the results of each model classifiers as a classifier score;
fusing the results of each classifier score using a fusion constant and weighing function to generate a final score;
comparing the final score to a threshold value to determine whether the test speech and enrollment speech are from the known individual;
determining that fusion adaptation inclusion criteria are met; and
changing the fusion constant to provide more weight to the classifier score which more accurately corresponds to the threshold value.
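One possible reading of the fusion adaptation steps in claim 3 is sketched below for two classifiers. Everything specific here is an assumption made for illustration: the fusion constant is a single value `alpha` giving the final score `alpha * a + (1 - alpha) * b`, the inclusion criterion is a minimum number of stored successful verifications, and the "more accurate" classifier is taken to be the one whose stored scores clear the threshold by the larger average margin. The claim itself does not define any of these.

```python
from typing import List, Tuple


def adapt_fusion_constant(
    alpha: float,
    history: List[Tuple[float, float]],
    threshold: float,
    step: float = 0.05,
    min_trials: int = 2,
) -> float:
    """Shift the fusion constant toward the classifier with the larger margin.

    history holds (score_a, score_b) pairs from stored successful
    verifications. All parameter names and the update rule are hypothetical.
    """
    # Assumed inclusion criterion: enough stored trials to adapt on.
    if len(history) < min_trials:
        return alpha
    margin_a = sum(a - threshold for a, _ in history) / len(history)
    margin_b = sum(b - threshold for _, b in history) / len(history)
    if margin_a > margin_b:
        alpha += step  # give classifier A more weight
    elif margin_b > margin_a:
        alpha -= step  # give classifier B more weight
    # Keep the constant a valid mixing weight.
    return min(1.0, max(0.0, alpha))
```

A small fixed step keeps any single batch of stored results from swinging the weighting sharply; that choice is a design assumption of the sketch rather than something stated in the claim.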
4. An automatic speaker verification method, wherein the results of prior verifications are stored, including the steps of:
obtaining test speech from a user seeking authorization or identification;
generating subwords of the test speech;
scoring the subwords against subwords of a known individual using a plurality of modeling classifiers;
storing the results of each model classifiers as a classifier score;
fusing the results of each classifier score using a fusion constant and weighing function to generate a final score;
comparing final score to a threshold value to determine whether the test speech and enrollment speech are from the known individual;
determining that model adaptation inclusion criteria are met, including that one or more verifications have been successful; and
training the model classifiers with previously stored enrollment speech and with speech corresponding to the successful verifications, including the steps of generating a new threshold value; and
storing the new threshold value.
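The model adaptation steps of claim 4 can be sketched with a toy stand-in for the classifiers. The centroid "model", the negative squared distance score, and the rule that sets the new threshold just below the worst pooled score are all assumptions chosen to keep the example short; the claim does not say which classifiers are retrained or how the new threshold is generated.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def train_centroid(frames: List[Frame]) -> List[float]:
    """Toy model: the per-dimension mean of the feature vectors."""
    dims = len(frames[0])
    return [mean(f[d] for f in frames) for d in range(dims)]


def score(model: Sequence[float], frame: Frame) -> float:
    """Higher is better: negative squared distance to the centroid."""
    return -sum((x - m) ** 2 for x, m in zip(frame, model))


def adapt_model(
    enrollment_frames: List[Frame],
    accepted_frames: List[Frame],
    margin: float = 1.0,
) -> Tuple[List[float], float]:
    """Retrain on stored enrollment data plus data from successful verifications."""
    # Inclusion criterion from the claim: at least one successful verification.
    if not accepted_frames:
        raise ValueError("model adaptation requires a successful verification")
    pooled = list(enrollment_frames) + list(accepted_frames)
    model = train_centroid(pooled)
    # Assumed rule: new threshold sits a safety margin below the worst pooled score.
    new_threshold = min(score(model, f) for f in pooled) - margin
    return model, new_threshold
```

The returned threshold would then be stored in place of the old one, matching the final step of the claim.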
5. An automatic speaker verification method, wherein the results of prior verifications are stored, including the steps of:
obtaining test speech from a user seeking authorization or identification;
generating subwords of the test speech;
scoring the subwords against subwords of a known individual using a plurality of modeling classifiers;
storing the results of each model classifiers as a classifier score;
fusing the results of each classifier score using a fusion constant and weighing function to generate a final score;
comparing the final score to a threshold value to determine whether the test speech and enrollment speech are from the known individual;
determining that threshold adaptation inclusion criteria are met;
analyzing the stored final scores;
calculating a new threshold value in response to the analyzation; and
storing the new threshold value.
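The threshold adaptation steps of claim 5 can be sketched as follows. The analysis rule used here (mean of the stored accepted final scores minus `k` standard deviations, blended with the old threshold) is an assumption for illustration, as are the minimum trial count and all parameter names; the claim only requires that the stored final scores be analyzed and a new threshold calculated from that analysis.

```python
from statistics import mean, pstdev
from typing import Sequence


def adapt_threshold(
    old_threshold: float,
    accepted_scores: Sequence[float],
    k: float = 2.0,
    rate: float = 0.5,
    min_trials: int = 5,
) -> float:
    """Compute a new threshold from stored final scores.

    Hypothetical rule: place a candidate threshold k standard deviations
    below the mean accepted score, then move part of the way toward it.
    """
    # Assumed inclusion criterion: enough stored final scores to analyze.
    if len(accepted_scores) < min_trials:
        return old_threshold
    candidate = mean(accepted_scores) - k * pstdev(accepted_scores)
    # Blend old and candidate values so the threshold changes gradually.
    return (1 - rate) * old_threshold + rate * candidate
```

Blending with the old value, rather than replacing it outright, limits how far one batch of stored scores can move the threshold; this is a design choice of the sketch, not a requirement of the claim.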
Specification