Fast, language-independent method for user authentication by voice
Abstract
A method and system for training a user authentication by voice signal are described. In one embodiment, a set of feature vectors is decomposed into speaker-specific recognition units. The speaker-specific recognition units are used to compute distribution values to train the voice signal. In addition, spectral feature vectors are decomposed into speaker-specific characteristic units which are compared to the speaker-specific distribution values. If the speaker-specific characteristic units are within a threshold limit of the speaker-specific distribution values, the speech signal is authenticated.
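The train-then-compare flow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent text: the function names (train_distribution, distance, authenticate) are hypothetical, the distribution value is taken to be a per-dimension mean of the enrollment units, and "within a threshold limit" is read as a Euclidean distance no greater than a chosen threshold.

```python
import math

def train_distribution(units):
    """Assumed training step: average the enrollment units per dimension."""
    count = len(units)
    dim = len(units[0])
    return [sum(u[d] for u in units) / count for d in range(dim)]

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def authenticate(unit, distribution, threshold):
    """Accept when the unit lies within the threshold of the trained value."""
    return distance(unit, distribution) <= threshold

# Hypothetical speaker-specific units from three enrollment utterances.
enrolled = [[1.0, 2.0, 3.0], [1.2, 1.8, 3.1], [0.8, 2.2, 2.9]]
reference = train_distribution(enrolled)

print(authenticate([1.1, 2.0, 3.0], reference, threshold=0.5))  # True
print(authenticate([3.0, 0.5, 1.0], reference, threshold=0.5))  # False
```

The threshold trades false acceptances against false rejections; the value 0.5 above is arbitrary.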
181 Citations
48 Claims
1. A method of training a user authentication by speech signal, comprising:
decomposing a plurality of feature vectors into sets of vectors, a first set of vectors defining at least one speaker-specific recognition unit and a second set of vectors defining at least one content reference sequence; and
computing at least one speaker-specific distribution value from the speaker-specific recognition unit.
Dependent claims: 2, 3, 4, 5, 6, 33, 34
7. A method of authenticating a speech signal comprising:
decomposing at least one spectral feature vector into at least one speaker-specific characteristic unit;
comparing the at least one speaker-specific characteristic unit to at least one speaker-specific distribution value, the at least one speaker-specific distribution value previously trained by a user and generated by decomposing a plurality of feature vectors into sets of vectors, a first set of vectors defining at least one speaker-specific recognition unit and a second set of vectors defining at least one content reference sequence; and
authenticating the speech signal if the at least one speaker-specific characteristic unit is within a threshold limit of the at least one speaker-specific distribution value.
Dependent claims: 8, 9, 10, 11, 12, 13, 35, 36, 37
14. A system for training a user authentication by speech signal, comprising:
means for decomposing a plurality of feature vectors into sets of vectors, a first set of vectors defining at least one speaker-specific recognition unit and a second set of vectors defining at least one content reference sequence; and
means for computing at least one speaker-specific distribution value from the speaker-specific recognition unit.
15. A system for authenticating a speech signal comprising:
means for decomposing at least one spectral feature vector into at least one speaker-specific characteristic unit;
means for comparing the at least one speaker-specific characteristic unit to at least one speaker-specific distribution value, the at least one speaker-specific distribution value previously trained by a user and generated by decomposing a plurality of feature vectors into sets of vectors, a first set of vectors defining at least one speaker-specific recognition unit and a second set of vectors defining at least one content reference sequence; and
means for authenticating the speech signal if the at least one speaker-specific characteristic unit is within a threshold limit of the at least one speaker-specific distribution value.
Dependent claims: 38, 39, 40
16. A computer readable medium comprising instructions, which when executed on a processor, perform a method for training a user authentication by speech signal, comprising:
decomposing a plurality of feature vectors into sets of vectors, a first set of vectors defining at least one speaker-specific recognition unit and a second set of vectors defining at least one content reference sequence; and
computing at least one speaker-specific distribution value from the speaker-specific recognition unit.
17. A computer readable medium comprising instructions, which when executed on a processor, perform a method for authenticating a speech signal, comprising:
decomposing at least one spectral feature vector into at least one speaker-specific characteristic unit;
comparing the at least one speaker-specific characteristic unit to at least one speaker-specific distribution value, the at least one speaker-specific distribution value previously trained by a user and generated by decomposing a plurality of feature vectors into sets of vectors, a first set of vectors defining at least one speaker-specific recognition unit and a second set of vectors defining at least one content reference sequence; and
authenticating the speech signal if the at least one speaker-specific characteristic unit is within a threshold limit of the at least one speaker-specific distribution value.
18. A system for training a user authentication by speech signal, comprising:
a processor configured to decompose a plurality of feature vectors into sets of vectors, a first set of vectors defining at least one speaker-specific recognition unit and a second set of vectors defining at least one content reference sequence, and compute at least one speaker-specific distribution value from the at least one speaker-specific recognition unit.
Dependent claims: 19, 20, 21, 22, 23, 24, 41, 42, 43, 44, 45
25. A system for authenticating a speech signal comprising:
a processor to decompose at least one spectral feature vector into at least one speaker-specific characteristic unit, compare the at least one speaker-specific characteristic unit to at least one speaker-specific distribution value, the at least one speaker-specific distribution value previously trained by a user, and authenticate the speech signal if the at least one speaker-specific characteristic unit is within a threshold limit of the at least one speaker-specific distribution value, wherein the at least one speaker-specific distribution value is generated by decomposing a plurality of feature vectors into sets of vectors, a first set of vectors defining at least one speaker-specific recognition unit and a second set of vectors defining at least one content reference sequence.
Dependent claims: 26, 27, 28, 29, 30, 31, 32, 46, 47, 48
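The independent claims above all rest on decomposing a matrix of feature vectors into two sets of vectors. The claims as listed here do not name a decomposition technique, so the following is only a sketch under stated assumptions: it uses a rank-one factorization computed by power iteration, with rows as frames and columns as spectral dimensions. Which factor plays the role of the speaker-specific recognition unit and which the content reference sequence is likewise an assumption made for illustration, and the function name rank_one_decompose is hypothetical.

```python
import math

def rank_one_decompose(frames, iterations=100):
    """Approximate frames (M x N) as sigma * u * v^T by power iteration.

    Assumes the rows are non-negative and not all zero (e.g. spectral
    magnitudes), so that the normalizations below never divide by zero.
    """
    m, n = len(frames), len(frames[0])
    v = [1.0 / math.sqrt(n)] * n
    u = [0.0] * m
    sigma = 0.0
    for _ in range(iterations):
        # u <- F v, normalized to unit length (one entry per frame)
        u = [sum(frames[i][j] * v[j] for j in range(n)) for i in range(m)]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        # v <- F^T u; its length estimates the leading singular value
        v = [sum(frames[i][j] * u[i] for i in range(m)) for j in range(n)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return u, sigma, v

# Hypothetical 3-frame, 2-dimension feature matrix (exactly rank one).
frames = [[2.0, 4.0], [1.0, 2.0], [3.0, 6.0]]
u, sigma, v = rank_one_decompose(frames)

# Illustrative assignment only: v varies over spectral dimensions and is
# treated as the speaker-specific unit; u varies over frames and is treated
# as the content reference sequence.
speaker_unit = [sigma * x for x in v]
content_sequence = u
print(speaker_unit, content_sequence)
```

A full singular value decomposition would retain more than one component; the rank-one case is used here only to keep the example free of external dependencies.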
Specification