Speaker verification and speaker identification based on eigenvoices
Abstract
Speech models are constructed and trained upon the speech of known client speakers (and also impostor speakers, in the case of speaker verification). Parameters from these models are concatenated to define supervectors and a linear transformation upon these supervectors results in a dimensionality reduction yielding a low-dimensional space called eigenspace. The training speakers are then represented as points or distributions in eigenspace. Thereafter, new speech data from the test speaker is placed into eigenspace through a similar linear transformation and the proximity in eigenspace of the test speaker to the training speakers serves to authenticate or identify the test speaker.
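The pipeline described in the abstract can be sketched in a few lines. This is an illustration only, under several assumptions that are not taken from the source: principal component analysis (computed here by power iteration) stands in for the unspecified linear dimensionality reduction, a plain orthogonal projection stands in for the "similar linear transformation" applied to new speech, and the six-element supervectors and the names `training`, `build_eigenspace`, and `locate` are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_eigenspace(supervectors, k, iters=200, seed=0):
    """Return (mean, basis): k orthonormal basis vectors found by PCA.

    PCA via power iteration is an assumed stand-in for the
    dimensionality reduction step; k should not exceed the data rank.
    """
    n, dim = len(supervectors), len(supervectors[0])
    mean = [sum(col) / n for col in zip(*supervectors)]
    centered = [[x - m for x, m in zip(sv, mean)] for sv in supervectors]
    rng = random.Random(seed)
    basis = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(iters):
            # Multiply by the (unnormalized) covariance: w = X^T (X v).
            proj = [dot(row, v) for row in centered]
            w = [sum(p * row[j] for p, row in zip(proj, centered))
                 for j in range(dim)]
            # Deflate: remove components along basis vectors already found.
            for b in basis:
                c = dot(w, b)
                w = [x - c * y for x, y in zip(w, b)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                raise ValueError("k exceeds the rank of the training data")
            v = [x / norm for x in w]
        basis.append(v)
    return mean, basis

def locate(supervector, mean, basis):
    """Project a supervector into eigenspace (assumed simple projection)."""
    centered = [x - m for x, m in zip(supervector, mean)]
    return [dot(centered, b) for b in basis]

# Hypothetical supervectors: concatenated model parameters per speaker.
training = {
    "client": [1.0, 2.0, 0.5, 1.5, 0.2, 0.9],
    "spk_a":  [0.8, 2.4, 0.1, 1.1, 0.6, 1.2],
    "spk_b":  [1.6, 1.1, 0.9, 2.0, 0.1, 0.4],
    "spk_c":  [0.3, 1.9, 1.2, 0.7, 0.8, 1.5],
}
mean, basis = build_eigenspace(list(training.values()), k=2)
client_loc = locate(training["client"], mean, basis)

# A test speaker whose supervector is a slightly perturbed client.
test_sv = [x + 0.01 for x in training["client"]]
test_loc = locate(test_sv, mean, basis)
distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(client_loc, test_loc)))
print("distance in eigenspace:", distance)
```

Because the basis is orthonormal, the projection never increases the distance between two supervectors, so a test speaker close to the client in supervector space stays close in eigenspace.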
11 Claims
1. A method for assessing speech with respect to a predetermined client speaker, comprising:
training a set of speech models upon the speech from a plurality of training speakers, the plurality of training speakers including at least one client speaker;
constructing an eigenspace to represent said plurality of training speakers by performing dimensionality reduction upon said set of models to generate a set of basis vectors that define said eigenspace;
representing said client speaker as a first location in said eigenspace;
processing new speaker input data by training a new speech model upon said input data and by performing dimensionality reduction upon said new speech model to generate a representation of said new speaker as a second location in eigenspace;
assessing the proximity between said first and second locations and using said assessment as an indication of whether the new speaker is the client speaker.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
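The final step of claim 1, assessing the proximity between the first and second locations, might look like the following sketch. The claim does not name a distance measure or a decision rule, so the Euclidean distance, the fixed threshold, and all coordinates below are assumptions chosen purely for illustration; `eigen_distance` and `is_client` are hypothetical names.

```python
import math

def eigen_distance(first, second):
    """Euclidean distance between two locations in eigenspace (assumed metric)."""
    if len(first) != len(second):
        raise ValueError("locations must have the same dimensionality")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)))

def is_client(client_loc, new_loc, threshold):
    """Accept the new speaker when the locations lie within the threshold."""
    return eigen_distance(client_loc, new_loc) <= threshold

# Hypothetical two-dimensional eigenspace coordinates.
client_loc = [0.42, -0.17]      # first location: the enrolled client
genuine_loc = [0.40, -0.15]     # second location: a claimant near the client
impostor_loc = [-0.90, 0.65]    # second location: a claimant far away
THRESHOLD = 0.25                # assumed value; would be tuned in practice

print(is_client(client_loc, genuine_loc, THRESHOLD))
print(is_client(client_loc, impostor_loc, THRESHOLD))
```

In practice the threshold trades false acceptances against false rejections, so it would be chosen from held-out client and impostor data rather than fixed by hand.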
Specification