Discriminative training for speaker and speech verification
First Claim
1. A method for discriminatively training acoustic models for automated speech verification, comprising:
- defining a likelihood ratio for a given speech segment X having a known linguist identity W, using an acoustic model which represents W and an alternative acoustic model which represents linguist identities other than W;
- determining an average likelihood ratio score for the likelihood ratio scores over a set of training utterances whose linguist identities are the same, W;
- determining an average likelihood ratio score for the likelihood ratio scores over a competing set of training utterances whose linguist identities are not W; and
- optimizing a difference between the average likelihood ratio score over the set of training utterances and the average likelihood ratio score over the competing set of training utterances, thereby improving the acoustic model.
Abstract
A method for discriminatively training acoustic models is provided for automated speaker verification (SV) and speech (or utterance) verification (UV) systems. The method includes: defining a likelihood ratio for a given speech segment, whose speaker identity (for an SV system) or linguist identity (for a UV system) is known, using a corresponding acoustic model and an alternative acoustic model which represents all other speakers (in SV) or all other linguist identities (in UV); determining an average likelihood ratio score for the likelihood ratio scores over a set of training utterances (referred to as the true data set) whose speaker identities (for SV) or linguist identities (for UV) are the same; determining an average likelihood ratio score for the likelihood ratio scores over a competing set of training utterances which excludes the speech data in the true data set (referred to as the competing data set); and optimizing a difference between the average likelihood ratio score over the true data set and the average likelihood ratio score over the competing data set, thereby improving the acoustic model.
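The quantities named in the abstract can be written compactly as follows. The notation is an assumption introduced here for illustration and does not appear in the text above: \lambda_W is the acoustic model for identity W, \bar{\lambda}_W is the alternative model, S_W is the true data set, \bar{S}_W is the competing data set, and D is the difference to be optimized. For speaker verification, read the speaker identity K in place of W.

```latex
\begin{align}
  \mathrm{LR}(X) &= \frac{p(X \mid \lambda_W)}{p(X \mid \bar{\lambda}_W)} \\
  D &= \frac{1}{|S_W|} \sum_{X \in S_W} \mathrm{LR}(X)
     \;-\; \frac{1}{|\bar{S}_W|} \sum_{X \in \bar{S}_W} \mathrm{LR}(X)
\end{align}
```

Because the model for W sits in the numerator, a larger D would mean true utterances score higher on average than competing ones, so "optimizing" is read here as increasing D with respect to the model parameters. That reading is an interpretation and is not stated explicitly in the abstract.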
15 Claims
9. A method for discriminatively training acoustic models for automated speaker verification, comprising:
- defining a likelihood ratio for a given speech segment X having a known speaker identity, K, using an acoustic model which represents K and an alternative acoustic model which represents speakers other than K;
- determining an average likelihood ratio score for the likelihood ratio scores over a set of training utterances which are spoken by speaker K;
- determining an average likelihood ratio score for the likelihood ratio scores over a competing set of training utterances which are spoken by speakers other than speaker K; and
- optimizing a difference between the average likelihood ratio score over the set of training utterances and the average likelihood ratio score over the competing set of training utterances, thereby improving the acoustic model.
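As a rough illustration of the scoring steps in this claim, the sketch below computes the two averages and their difference for a toy speaker K. Several things here are assumptions made for the example and are not taken from the claim: the scores are log-likelihood ratios averaged over frames, each acoustic model is a one-dimensional Gaussian given as a (mean, variance) pair standing in for a real acoustic model, and all function names are hypothetical. The optimization step itself is not shown; the sketch only evaluates the quantity that such a step would adjust.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian (toy stand-in for an acoustic model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def utterance_llr(frames, target, alternative):
    """Frame-averaged log-likelihood ratio for one utterance.

    `target` and `alternative` are hypothetical (mean, variance) pairs for
    speaker K and for speakers other than K, respectively.
    """
    total = sum(
        log_gauss(x, *target) - log_gauss(x, *alternative) for x in frames
    )
    return total / len(frames)


def average_llr(utterances, target, alternative):
    """Average of the per-utterance scores over a set of utterances."""
    scores = [utterance_llr(u, target, alternative) for u in utterances]
    return sum(scores) / len(scores)


def score_difference(true_set, competing_set, target, alternative):
    """Average score on speaker K's utterances minus the average score on
    the competing utterances; the quantity a trainer would try to improve."""
    return (average_llr(true_set, target, alternative)
            - average_llr(competing_set, target, alternative))


if __name__ == "__main__":
    # Made-up feature values: speaker K clusters near +1, others near -1.
    true_set = [[1.0, 1.2], [0.8]]
    competing_set = [[-1.0, -0.9], [-1.1]]
    alternative = (-1.0, 1.0)

    for name, target in [("poor", (0.0, 1.0)), ("better", (1.0, 1.0))]:
        d = score_difference(true_set, competing_set, target, alternative)
        print(f"{name} target model: difference = {d:.3f}")
```

In this toy setup, a target model centred on speaker K's data gives a larger difference than one centred between the two groups, which is the direction a discriminative trainer would push the parameters.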
Specification