User authentication for devices using voice input or audio signatures
First Claim
1. An apparatus comprising:
a speaker;
a microphone to generate one or more audio signals from sound captured within an environment;
a processor; and
computer-readable media storing computer-executable instructions that, when executed on the processor, cause the processor to perform acts comprising:
identifying, based at least in part on the one or more audio signals, a request from a user to initiate a transaction;
outputting, via the speaker, a request that the user utter a password associated with the user;
determining, from the one or more audio signals, whether a first utterance of the user includes a password that matches the password associated with the user and whether an audio signature of the first utterance has a similarity score to an audio signature associated with the user that is greater than a first pre-defined threshold, the audio signature of the first utterance being based at least partly on a pitch, a decibel level, and a tone associated with the one or more audio signals;
at least partly in response to determining that the passwords match and that the similarity score of the audio signature of the first utterance to the audio signature associated with the user is greater than the first pre-defined threshold, causing output of, via the speaker, a request that the user answer a pre-stored question having a previously selected answer;
determining, from the one or more audio signals, whether a second utterance of the user includes the previously selected answer and whether an audio signature of the second utterance has a similarity score to the audio signature associated with the user that is greater than a second pre-defined threshold; and
initiating the transaction at least partly in response to determining that the second utterance includes the previously selected answer and that the similarity score of the audio signature of the second utterance to the audio signature associated with the user is greater than the second pre-defined threshold.
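The control flow recited in claim 1 can be sketched in code. The following Python is a minimal illustration only, not the patented implementation: the names `Utterance`, `UserProfile`, `authenticate`, and `ask`, the substring-based matching, and the 0.8 threshold defaults are all assumptions introduced here. Speech recognition and similarity scoring are assumed to happen upstream, so each utterance arrives as recognized text plus a score.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Utterance:
    """One user response (both fields are assumed to be computed upstream)."""
    text: str          # transcript produced by a speech recognizer
    similarity: float  # score against the user's stored audio signature


@dataclass
class UserProfile:
    password: str
    question: str  # pre-stored question
    answer: str    # previously selected answer


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so comparisons ignore formatting."""
    return " ".join(text.lower().split())


def authenticate(
    profile: UserProfile,
    ask: Callable[[str], Utterance],
    first_threshold: float = 0.8,   # hypothetical value
    second_threshold: float = 0.8,  # hypothetical value
) -> bool:
    """Return True only if both conversational checks pass."""
    # Step 1: request the password and check content and voice together.
    first = ask("Please say your password.")
    if normalize(profile.password) not in normalize(first.text):
        return False
    if not first.similarity > first_threshold:  # strictly "greater than"
        return False

    # Step 2: reached only after step 1 succeeds.
    second = ask(profile.question)
    return (
        normalize(profile.answer) in normalize(second.text)
        and second.similarity > second_threshold
    )
```

In this sketch the pre-stored question is output only after the password check and the first signature check both succeed, and both comparisons are strict, mirroring the "greater than" wording of the claim.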
Abstract
Techniques for authenticating users at devices that interact with the users via voice input. For instance, the described techniques may allow a voice-input device to safely verify the identity of a user by engaging in a back-and-forth conversation. The device or another device coupled thereto may then verify the accuracy of the responses from the user during the conversation, as well as compare an audio signature associated with the user's responses to a pre-stored audio signature associated with the user. By utilizing multiple checks, the described techniques are able to accurately and safely authenticate the user based solely on an audible conversation between the user and the voice-input device.
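As an illustration of the signature comparison described in the abstract, the sketch below scores two signatures built from a pitch, a decibel level, and a tone value. The source does not state how the similarity score is computed; the `AudioSignature` fields, the `SCALES` constants, and the averaging formula are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioSignature:
    pitch_hz: float  # e.g. average fundamental frequency (assumed feature)
    level_db: float  # e.g. average loudness in decibels (assumed feature)
    tone: float      # hypothetical scalar tone descriptor in [0, 1]


# Hypothetical scale factors: the difference at which a feature is
# treated as completely dissimilar.
SCALES = AudioSignature(pitch_hz=100.0, level_db=30.0, tone=1.0)


def similarity(a: AudioSignature, b: AudioSignature) -> float:
    """Return a score in [0, 1]; 1.0 means the signatures are identical."""
    diffs = (
        abs(a.pitch_hz - b.pitch_hz) / SCALES.pitch_hz,
        abs(a.level_db - b.level_db) / SCALES.level_db,
        abs(a.tone - b.tone) / SCALES.tone,
    )
    # Cap each per-feature penalty at 1 so one outlier cannot push
    # the score below zero.
    penalties = [min(d, 1.0) for d in diffs]
    return 1.0 - sum(penalties) / len(penalties)
```

A score produced this way would then be compared against a pre-defined threshold, alongside the check that the spoken response itself is correct.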
20 Claims
1. An apparatus as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7
8. Non-transitory computer-readable media storing computer-executable instructions that, when executed on a processor, cause the processor to perform acts comprising:
receiving a request from a user;
causing output of a pre-stored question having a previously selected answer;
determining, based at least in part on an audio signal, whether an answer audibly provided by the user matches the previously selected answer;
determining whether a similarity score between an audio signature of the audio signal and an audio signature previously associated with the user meets or exceeds a pre-defined threshold, the audio signature of the audio signal being based at least partly on at least one of a pitch, a decibel level, or a tone associated with the audio signal; and
initiating the request at least partly in response to determining that the answer audibly provided by the user matches the previously selected answer and that the similarity score meets or exceeds the pre-defined threshold.
Dependent claims: 9, 10, 11, 12, 13, 14
15. Non-transitory computer-readable media storing computer-executable instructions that, when executed on a processor, cause the processor to perform acts comprising:
receiving a request from a user;
requesting that the user utter one or more particular words;
determining whether an audio signal includes a user utterance that includes the one or more particular words;
determining whether a similarity score between an audio signature of the audio signal and an audio signature previously associated with the user meets or exceeds a pre-defined threshold, the audio signature of the audio signal being based at least partly on at least one of a pitch, a decibel level, or a tone associated with the audio signal; and
initiating the request at least partly in response to determining that the audio signal includes a user utterance of the one or more particular words and that the similarity score meets or exceeds the pre-defined threshold.
Dependent claims: 16, 17, 18, 19, 20
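Independent claims 8 and 15 each recite a single challenge, and their threshold test is inclusive ("meets or exceeds") rather than strict. The sketch below illustrates that combined decision under stated assumptions: the transcript and similarity score are taken as already computed, and the helper names `words`, `contains_phrase`, and `should_initiate`, along with the word-level matching rule, are hypothetical and not drawn from the source.

```python
import re
from typing import List


def words(text: str) -> List[str]:
    """Split text into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_phrase(transcript: str, phrase: str) -> bool:
    """True if the words of `phrase` occur consecutively in `transcript`."""
    t, p = words(transcript), words(phrase)
    if not p:
        return False
    # Slide a window of len(p) words across the transcript.
    return any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1))


def should_initiate(transcript: str, expected: str,
                    score: float, threshold: float) -> bool:
    """Initiate the request only if the spoken content is correct AND the
    similarity score meets or exceeds the threshold (inclusive comparison)."""
    return contains_phrase(transcript, expected) and score >= threshold
```

For example, `should_initiate("my answer is Rex", "rex", 0.75, 0.75)` returns `True` because a score equal to the threshold satisfies "meets or exceeds", whereas a score of 0.74 would not.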
Specification