Apparatus and method for speaker verification/identification/classification employing non-acoustic and/or acoustic models and databases
Abstract
A method of controlling access of a speaker to one of a service and a facility, the method comprising the steps of: (a) receiving first spoken utterances of the speaker, the first spoken utterances containing indicia of the speaker; (b) decoding the first spoken utterances; (c) accessing a database corresponding to the decoded first spoken utterances, the database containing information attributable to a speaker candidate having indicia substantially similar to the speaker; (d) querying the speaker with at least one question based on the information contained in the accessed database; (e) receiving second spoken utterances of the speaker, the second spoken utterances being representative of at least one answer to the at least one question; (f) decoding the second spoken utterances; (g) verifying the accuracy of the decoded answer against the information contained in the accessed database serving as the basis for the question; (h) taking a voice sample from the utterances of the speaker and processing the voice sample against an acoustic model attributable to the speaker candidate; (i) generating a score corresponding to the accuracy of the decoded answer and the closeness of the match between the voice sample and the model; and (j) comparing the score to a predetermined threshold value and if the score is one of substantially equivalent to and above the threshold value, then permitting speaker access to one of the service and the facility.
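The question, verification, and threshold steps of the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the `Candidate` record, the exact-match answer check, the equal-weight average used as the score, and the 0.75 threshold are assumptions for illustration, and the acoustic match is passed in as a plain number rather than computed from audio.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical database record for one speaker candidate."""
    name: str
    facts: dict  # question text -> expected answer


def pick_question(candidate, rng):
    """Step (d): choose a question based on the candidate's stored information."""
    return rng.choice(sorted(candidate.facts))


def answer_accuracy(candidate, question, decoded_answer):
    """Step (g): 1.0 if the decoded answer matches the stored answer, else 0.0."""
    expected = candidate.facts[question]
    same = decoded_answer.strip().lower() == expected.strip().lower()
    return 1.0 if same else 0.0


def access_score(accuracy, voice_match):
    """Step (i): assumed equal-weight average of answer accuracy and voice match."""
    return 0.5 * accuracy + 0.5 * voice_match


def permit_access(score, threshold=0.75):
    """Step (j): permit access when the score reaches the threshold."""
    return score >= threshold
```

With a voice match of 0.9, a correct answer gives a score of 0.95 and access is permitted, while a wrong answer gives 0.45 and access is refused.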
45 Claims
1. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for evaluating a user of one of a service and a facility, said method steps comprising:
(a) receiving an identity claim of the user;
(b) querying the user with a random question;
(c) receiving an answer of the user to said random question;
at least one of said identity claim and said answer being received as a spoken utterance of the user;
(d) evaluating correctness of said answer of the user;
(e) performing speaker recognition on said at least one of said identity claim and said answer which is received as said spoken utterance; and
(f) granting access to the user if steps (d) and (e) indicate such access to be warranted.
2. A method for evaluating a user of one of a service and a facility, said method comprising the steps of:
(a) receiving an identity claim of the user;
(b) querying the user with a random question;
(c) receiving an answer of the user to said random question;
at least one of said identity claim and said answer being received as a spoken utterance of the user;
(d) evaluating correctness of said answer of the user;
(e) performing speaker recognition on said at least one of said identity claim and said answer which is received as said spoken utterance; and
(f) granting access to the user if steps (d) and (e) indicate such access to be warranted.
Dependent claims: 3-20.
step (d) comprises generating a first partial probability score;
step (e) comprises generating a second partial probability score; and
step (f) comprises the sub-steps of:
(f-1) generating a combined probability score from said first and second partial probability scores; and
(f-2) granting access to the user if said combined probability score is one of greater than or equal to a predetermined value.
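Sub-steps (f-1) and (f-2) can be sketched as follows. The claim does not say how the two partial probability scores are combined, so the weighted geometric mean below is only one plausible choice; the function names, the `weight` parameter, and the default threshold of 0.8 are assumptions for illustration.

```python
import math


def combined_score(p_answer, p_speaker, weight=0.5):
    """(f-1): combine the two partial scores with a weighted geometric mean.

    p_answer  -- first partial probability score (answer correctness)
    p_speaker -- second partial probability score (speaker recognition)
    weight    -- assumed relative weight of the answer score, in [0, 1]
    """
    for p in (p_answer, p_speaker):
        if not 0.0 <= p <= 1.0:
            raise ValueError("partial scores must lie in [0, 1]")
    if p_answer == 0.0 or p_speaker == 0.0:
        return 0.0  # avoid log(0); a zero partial score vetoes access
    log_sum = weight * math.log(p_answer) + (1.0 - weight) * math.log(p_speaker)
    return math.exp(log_sum)


def grant_access(p_answer, p_speaker, threshold=0.8, weight=0.5):
    """(f-2): grant access if the combined score is at least the threshold."""
    return combined_score(p_answer, p_speaker, weight) >= threshold
```

A geometric mean lets either weak score pull the result down sharply, which suits a setting where both the knowledge check and the voice check should pass; a weighted sum would be a more forgiving alternative.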
21. A method for evaluating a user of one of a service and a facility, said method comprising the steps of:
(a) receiving a spoken utterance of a user;
(b) decoding said spoken utterance via automatic speech recognition to obtain information bearing indications of identity of the user;
(c) performing text-independent speaker recognition on said spoken utterance to test whether said spoken utterance was likely uttered by a person corresponding to said indications of said identity of the user; and
(d) granting access to the user if steps (b) and (c) indicate such access to be warranted.
Dependent claims: 22-24.
step (a) comprises receiving said spoken utterance of the user as at least one of:
a name associated with the user;
a password associated with the user;
a social security number associated with the user;
an identification number associated with the user;
a phone number associated with the user;
a customer number associated with the user;
an address associated with the user; and
an object of a request associated with a user; and
step (b) comprises decoding said spoken utterance to obtain, as said information, said at least one of said name, said password, said social security number, said identification number, said phone number, said customer number, said address and said object of said request.
23. The method of claim 21, wherein step (a) comprises receiving said spoken utterance of the user as at least one of a static feature and a dynamic feature.
24. The method of claim 21, wherein:
step (a) comprises receiving said spoken utterance of said user as indicative of a subset of users having more than one member; and
step (b) comprises decoding said spoken utterance to obtain said information, said information bearing indications of membership of the user in said subset.
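The flow of claim 21 (decode an identity from the utterance, then check that the same utterance was likely spoken by that person) can be sketched in Python. This is a toy under stated assumptions: speech recognition is assumed to have already produced `decoded_identity`, the enrolled voice models are made-up three-element vectors in a hypothetical `ENROLLED` table, and cosine similarity stands in for whatever text-independent speaker recognition the system actually uses.

```python
import math

# Hypothetical enrolled voice models, one feature vector per permitted user.
ENROLLED = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def evaluate_user(decoded_identity, voice_features, threshold=0.9):
    """Steps (c) and (d): verify the voice against the decoded identity.

    decoded_identity -- identity information obtained in step (b)
    voice_features   -- features extracted from the same spoken utterance
    """
    model = ENROLLED.get(decoded_identity)
    if model is None:
        return False  # decoded identity is not an enrolled user
    return cosine(voice_features, model) >= threshold
```

Because the voice check does not depend on the words spoken, the same utterance serves both the decoding step and the speaker recognition step.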
25. A method for evaluating a user of one of a service and a facility, said method comprising the steps of:
(a) receiving a spoken utterance of a user;
(b) decoding said spoken utterance via automatic speech recognition to obtain information bearing indications of identity of the user;
(c) performing speaker identification, via text-independent speaker recognition, on said spoken utterance to develop an estimation of said identity of the user; and
(d) granting access to the user if steps (b) and (c) indicate such access to be warranted.
Dependent claims: 26-29.
step (a) comprises receiving said spoken utterance of the user as at least one of:
a name associated with the user;
a password associated with the user;
a social security number associated with the user;
an identification number associated with the user;
a phone number associated with the user;
a customer number associated with the user;
an address associated with the user; and
an object of a request associated with a user; and
step (b) comprises decoding said spoken utterance to obtain, as said information, said at least one of said name, said password, said social security number, said identification number, said phone number, said customer number, said address and said object of said request.
27. The method of claim 25, wherein step (a) comprises receiving said spoken utterance of the user as at least one of a static feature and a dynamic feature.
28. The method of claim 25, wherein:
step (a) comprises receiving said spoken utterance of said user as indicative of a subset of users having more than one member; and
step (b) comprises decoding said spoken utterance to obtain said information, said information bearing indications of membership of the user in said subset.
29. The method of claim 25, wherein steps (b) and (c) are performed substantially simultaneously.
30. A method for evaluating a user of one of a service and a facility, said method comprising the steps of:
(a) receiving a first natural language spoken utterance of the user;
(b) decoding said first natural language spoken utterance, via natural language understanding (NLU) speech recognition, to obtain a first decoded utterance having factual content;
(c) performing text-independent speaker recognition on said first natural language spoken utterance; and
(d) granting access to the user if both said factual content of said first decoded utterance and said text-independent speaker recognition of said first natural language spoken utterance indicate such access to be warranted.
Dependent claim: 31.
receiving a second natural language spoken utterance of the user;
prior to at least one of step (a) and said step of receiving said second utterance, querying said user with a question, such that at least one of said first and second natural language spoken utterances are uttered in response to said question; and
performing text-independent speaker recognition on said second natural language spoken utterance;
wherein step (d) further comprises granting said access to the user if said text-independent speaker recognition of said second natural language spoken utterance, as well as said first decoded utterance and said text-independent speaker recognition of said first natural language spoken utterance, indicate such access to be warranted.
32. A method for evaluating a user of one of a service and a facility having a plurality of permitted users, said method comprising the steps of:
(a) receiving a first piece of information pertaining to the user, said first piece of information containing information sufficient to identify the user as a member of a multi-user group including less than all of said permitted users of the one of the service and the facility;
(b) accessing a database, based on said first piece of information, to identify said multi-user group; and
(c) determining a most likely member of said multi-user group to whom the user corresponds, based on speaker identification performed on a spoken utterance of the user.
Dependent claims: 33-37.
step (a) comprises receiving, as said first piece of information, one of a name associated with the user, a password associated with the user, an object of a request associated with the user and a phone number associated with the user; and
said decoding step is performed to determine said one of said name, said password, said object of said request and said phone number.
37. The method of claim 36, wherein step (a) comprises receiving said name associated with the user; and
said multi-user group comprises users having names which are similar-sounding to said name associated with the user.
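The group-then-identify idea of claims 32 and 37 can be sketched as follows. The sketch is an assumption-laden toy: spelling similarity from Python's standard `difflib` module stands in for a real "similar-sounding" comparison, and `speaker_scores` is a hypothetical table of per-user speaker-identification scores that a real system would compute from the spoken utterance.

```python
import difflib


def similar_name_group(decoded_name, enrolled_names, cutoff=0.6):
    """Steps (a)-(b): find the multi-user group of names close to the decoded name.

    Spelling similarity is used here as a stand-in for acoustic confusability.
    """
    limit = max(1, len(enrolled_names))  # get_close_matches requires n > 0
    return difflib.get_close_matches(
        decoded_name, enrolled_names, n=limit, cutoff=cutoff
    )


def most_likely_member(group, speaker_scores):
    """Step (c): pick the group member with the best speaker-identification score."""
    scored = [(speaker_scores[name], name) for name in group if name in speaker_scores]
    if not scored:
        return None
    return max(scored)[1]
```

Restricting identification to the group means a user outside it cannot be selected even with the best voice score, and the voice is only asked to separate a handful of confusable candidates.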
38. A method for evaluating a user of one of a service and a facility having a plurality of permitted users, said method comprising the steps of:
(a) receiving a first piece of information pertaining to the user, said first piece of information containing information sufficient to identify the user as a member of a multi-user group including less than all of said permitted users of the one of the service and the facility;
(b) accessing a database, based on said first piece of information, to identify said multi-user group;
(c) forming a ranked list of possible users to whom the user may correspond, based on speaker identification performed on a spoken utterance of the user; and
(d) searching for a match between said ranked list of possible users and members of said multi-user group.
Dependent claims: 39, 40.
said multi-user group comprises users having names which are similar-sounding to said name associated with the user.
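Steps (c) and (d) of claim 38 can be sketched in Python. The score table, the function names, and the optional `top_n` cut-off are hypothetical; a real system would derive the scores by running speaker identification on the spoken utterance.

```python
def rank_speakers(speaker_scores):
    """Step (c): order all enrolled users from best to worst identification score."""
    return sorted(speaker_scores, key=speaker_scores.get, reverse=True)


def first_group_match(ranked, group, top_n=None):
    """Step (d): search the ranked list for a member of the multi-user group.

    Returns (user, rank) for the best-ranked group member, or None if no
    group member appears within the first top_n entries (all entries if None).
    """
    members = set(group)
    candidates = ranked if top_n is None else ranked[:top_n]
    for rank, user in enumerate(candidates, start=1):
        if user in members:
            return user, rank
    return None
```

Limiting the search with `top_n` is one way to turn the match into an accept or reject decision: if no group member ranks highly enough, the speaker is treated as not corresponding to anyone in the group.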
41. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for evaluating a user of one of a service and a facility, said method steps comprising:
(a) receiving a spoken utterance of a user;
(b) decoding said spoken utterance via automatic speech recognition to obtain information bearing indications of identity of the user;
(c) performing text-independent speaker recognition on said spoken utterance to test whether said spoken utterance was likely uttered by a person corresponding to said indications of said identity of the user; and
(d) granting access to the user if steps (b) and (c) indicate such access to be warranted.
42. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for evaluating a user of one of a service and a facility, said method steps comprising:
(a) receiving a spoken utterance of a user;
(b) decoding said spoken utterance via automatic speech recognition to obtain information bearing indications of identity of the user;
(c) performing speaker identification, via text-independent speaker recognition, on said spoken utterance to develop an estimation of said identity of the user; and
(d) granting access to the user if steps (b) and (c) indicate such access to be warranted.
43. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for evaluating a user of one of a service and a facility, said method steps comprising:
(a) receiving a first natural language spoken utterance of the user;
(b) decoding said first natural language spoken utterance, via natural language understanding (NLU) speech recognition, to obtain a first decoded utterance having factual content;
(c) performing text-independent speaker recognition on said first natural language spoken utterance; and
(d) granting access to the user if both said factual content of said first decoded utterance and said text-independent speaker recognition of said first natural language spoken utterance indicate such access to be warranted.
44. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for evaluating a user of one of a service and a facility, said method steps comprising:
(a) receiving a first piece of information pertaining to the user, said first piece of information containing information sufficient to identify the user as a member of a multi-user group including less than all of said permitted users of the one of the service and the facility;
(b) accessing a database, based on said first piece of information, to identify said multi-user group; and
(c) determining a most likely member of said multi-user group to whom the user corresponds, based on speaker identification performed on a spoken utterance of the user.
45. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for evaluating a user of one of a service and a facility, said method steps comprising:
(a) receiving a first piece of information pertaining to the user, said first piece of information containing information sufficient to identify the user as a member of a multi-user group including less than all of said permitted users of the one of the service and the facility;
(b) accessing a database, based on said first piece of information, to identify said multi-user group;
(c) forming a ranked list of possible users to whom the user may correspond, based on speaker identification performed on a spoken utterance of the user; and
(d) searching for a match between said ranked list of possible users and members of said multi-user group.
Specification