User authentication

US 10,733,996 B2
Filed: 03/30/2018
Issued: 08/04/2020
Est. Priority Date: 03/30/2018
Status: Active Grant


Abstract

A device includes a processor configured to extract parameters from an audio signal. The processor is configured to perform liveness verification by determining, based on first parameters and a liveness data model, whether the audio signal corresponds to a first audio type indicating spoken speech, to perform user verification by determining, based on second parameters and a user speech model, whether the audio signal corresponds to speech of a particular user, and to perform keyword verification by determining, based on third parameters and a keyword data model, whether the audio signal corresponds to a particular keyword. The processor is configured to generate an output indicating that user authentication is successful in response to determining that the audio signal corresponds to speech of the particular user, to the particular keyword, and to the first audio type.
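
The decision rule in the abstract can be sketched as a conjunction of three checks. This is a minimal illustration only: the function name, the toy threshold check, and the parameter values are assumptions, and the three callables stand in for the liveness data model, the user speech model, and the keyword data model, whose internals the abstract does not describe.

```python
from typing import Callable, Sequence

Params = Sequence[float]
Check = Callable[[Params], bool]


def authentication_successful(first: Params, second: Params, third: Params,
                              is_spoken: Check, is_user: Check,
                              is_keyword: Check) -> bool:
    """Return True only when liveness, user, and keyword checks all pass."""
    return is_spoken(first) and is_user(second) and is_keyword(third)


# Hypothetical extracted parameter set, split into three subsets.
params = [0.9, 0.8, 0.7]
first, second, third = params[:1], params[1:2], params[2:]


def threshold(p: Params) -> bool:
    # Toy stand-in for a model decision (assumption, not from the patent).
    return p[0] > 0.5


result = authentication_successful(first, second, third,
                                   threshold, threshold, threshold)
replayed = authentication_successful([0.1], second, third,
                                     threshold, threshold, threshold)
```

Here `result` is true because every check passes, while `replayed` is false because the liveness check alone fails.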

44 Citations

29 Claims

1. A device comprising:
- a processor configured to:
  
  extract a set of parameters from an audio signal;
  
  perform liveness verification by determining, based on a first plurality of parameters and a liveness data model, whether the audio signal corresponds to a first audio type indicating spoken speech or a second audio type indicating playback of recorded speech;
  
  perform user verification by determining, based on a second plurality of parameters and a user speech model, whether the audio signal corresponds to speech of a particular user associated with the user speech model and that the audio signal corresponds to the first audio type, and refrain from performing the user verification based on determining that the audio signal corresponds to the second audio type;
  
  perform keyword verification by determining, based on a third plurality of parameters and a keyword data model, whether the audio signal corresponds to a particular keyword, wherein the set of parameters includes the first plurality of parameters, the second plurality of parameters, and the third plurality of parameters; and
  
  generate an output indicating that user authentication is successful in response to determining that the audio signal corresponds to the speech of the particular user, to the particular keyword, and to the first audio type.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
  - 2. The device of claim 1, wherein the processor is further configured to generate a second output indicating that the user authentication failed in response to determining that the audio signal corresponds to the second audio type.
  - 3. The device of claim 2, wherein the processor is configured to generate the second output independently of performing the keyword verification and the user verification.
  - 4. The device of claim 1, wherein the liveness data model is user-independent.
  - 5. The device of claim 1, wherein the first plurality of parameters is the same as the second plurality of parameters, and wherein the second plurality of parameters is the same as the third plurality of parameters.
  - 6. The device of claim 1, wherein the liveness data model is trained based on a first set of recordings corresponding to spoken speech and a second set of recordings corresponding to played-back speech.
  - 7. The device of claim 6, wherein the processor is configured to determine that the audio signal corresponds to the first audio type based on determining that the liveness data model indicates that the first plurality of parameters corresponds more closely to a second plurality of parameters of the first set of recordings than to a third plurality of parameters of the second set of recordings.
  - 8. The device of claim 1, wherein the liveness data model includes a machine-learning data model, and wherein the processor is configured to train the liveness data model by:
    - extracting a first set of parameters from a first recording corresponding to spoken speech;
      
      extracting a second set of parameters from a second recording corresponding to playback of the first recording;
      
      updating the liveness data model based on the first set of parameters corresponding to the first audio type; and
      
      updating the liveness data model based on the second set of parameters corresponding to the second audio type.
  - 9. The device of claim 1, wherein the first plurality of parameters indicates characteristics of the audio signal.
  - 10. The device of claim 9, wherein the characteristics include a dynamic frequency range of the audio signal.
  - 11. The device of claim 1, further comprising a display, wherein the processor is further configured to:
    - receive user input indicating that the user authentication is to be performed;
      
      generate the particular keyword in response to receiving the user input; and
      
      provide a graphical user interface (GUI) indicating the particular keyword to the display, wherein the microphone is configured to receive the audio signal subsequent to the processor providing the GUI to the display.
  - 12. The device of claim 1, further comprising:
    - a microphone configured to generate the audio signal responsive to receiving an input audio signal;
      
      an antenna; and
      
      a transmitter coupled to the antenna and configured to transmit, via the antenna, authentication data to a second device based on determining that the user authentication is successful.
  - 13. The device of claim 12, wherein the microphone, the processor, the antenna, and the transmitter are integrated into a mobile device.
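
The training described in claims 6 to 8 (updating the liveness data model with parameters from spoken recordings and from played-back recordings, then testing which set an input corresponds to more closely) might look like the sketch below. The claims do not name a model type, so a nearest-centroid classifier is swapped in here purely for illustration; the class name, labels, and parameter values are all hypothetical.

```python
from typing import Dict, List, Sequence

SPOKEN, PLAYBACK = "spoken", "playback"


class LivenessModel:
    """Toy user-independent model: one running centroid per audio type."""

    def __init__(self) -> None:
        self.sums: Dict[str, List[float]] = {}
        self.counts: Dict[str, int] = {}

    def update(self, params: Sequence[float], audio_type: str) -> None:
        # Accumulate the parameters under the label of the recording.
        sums = self.sums.setdefault(audio_type, [0.0] * len(params))
        for i, value in enumerate(params):
            sums[i] += value
        self.counts[audio_type] = self.counts.get(audio_type, 0) + 1

    def centroid(self, audio_type: str) -> List[float]:
        n = self.counts[audio_type]
        return [s / n for s in self.sums[audio_type]]

    def classify(self, params: Sequence[float]) -> str:
        # Pick the audio type whose centroid is closest (squared distance).
        def distance(audio_type: str) -> float:
            return sum((p - c) ** 2
                       for p, c in zip(params, self.centroid(audio_type)))
        return min(self.counts, key=distance)


model = LivenessModel()
model.update([0.9, 0.8], SPOKEN)     # first recording: spoken speech
model.update([0.7, 0.9], SPOKEN)
model.update([0.2, 0.3], PLAYBACK)   # second recording: playback
model.update([0.3, 0.1], PLAYBACK)

live_label = model.classify([0.8, 0.7])
replay_label = model.classify([0.2, 0.2])
```

Because the model only stores per-type statistics and no user identity, it stays user-independent in the sense of claim 4.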

14. A method comprising:
- receiving an audio signal at a device;
  
  extracting, at the device, a set of parameters from the audio signal;
  
  performing, at the device, liveness verification by determining, based on a first plurality of parameters and a liveness data model, whether the audio signal corresponds to a first audio type indicating spoken speech or a second audio type indicating playback of recorded speech;
  
  performing, at the device, user verification by determining, based on a second plurality of parameters and a user speech model, whether the audio signal corresponds to speech of a particular user associated with the user speech model, wherein the user speech model is distinct from the liveness data model;
  
  performing, at the device, keyword verification by determining, based on a third plurality of parameters and a keyword data model, whether the audio signal corresponds to a particular keyword, wherein the set of parameters includes the first plurality of parameters, the second plurality of parameters, and the third plurality of parameters; and
  
  generating, at the device, an output indicating that user authentication is successful based on determining that the audio signal corresponds to the speech of the particular user, to the particular keyword, and to the first audio type.
- Dependent Claims (15, 16, 17, 18, 19)
  - 15. The method of claim 14, wherein the liveness data model is trained based on a first set of recordings corresponding to spoken speech and a second set of recordings corresponding to played-back speech.
  - 16. The method of claim 15, wherein the first plurality of parameters indicates a dynamic frequency range of the audio signal, wherein the audio signal is determined to correspond to the first audio type based on determining that the liveness data model indicates that the dynamic frequency range of the audio signal corresponds more closely to first dynamic frequency ranges of the first set of recordings than to second dynamic frequency ranges of the second set of recordings.
  - 17. The method of claim 14, wherein the user speech model is trained based on input audio signals associated with the particular user.
  - 18. The method of claim 14, wherein the set of parameters is extracted in response to receiving a user input indicating that the user authentication is to be performed.
  - 19. The method of claim 14, wherein the user verification is performed in response to determining that the audio signal corresponds to the first audio type.
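
Claim 19 makes user verification conditional on the liveness result. A minimal sketch of that gating follows; the function name, the `log` list used to record which steps ran, and the placement of keyword verification last are assumptions made for the example, since the claim fixes only the ordering between liveness and user verification.

```python
from typing import Callable, List, Sequence

Params = Sequence[float]
Check = Callable[[Params], bool]


def run_verifications(params: Params, is_spoken: Check, is_user: Check,
                      is_keyword: Check, log: List[str]) -> bool:
    """Run liveness first; skip user verification when playback is detected."""
    log.append("liveness")
    if not is_spoken(params):
        return False              # playback: user verification never runs
    log.append("user")
    if not is_user(params):
        return False
    log.append("keyword")         # ordering of this step is an assumption
    return is_keyword(params)


def toy_liveness(p: Params) -> bool:
    # Hypothetical threshold standing in for the liveness data model.
    return p[0] > 0.5


def always_true(p: Params) -> bool:
    return True


replay_log: List[str] = []
replay_ok = run_verifications([0.1], toy_liveness, always_true,
                              always_true, replay_log)

live_log: List[str] = []
live_ok = run_verifications([0.9], toy_liveness, always_true,
                            always_true, live_log)
```

For the replayed input only the liveness step appears in `replay_log`, which shows the failure being reported without invoking the other two verifications.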

20. A computer-readable storage device storing instructions that, when executed by a processor, cause the processor to perform operations comprising:
- extracting a set of parameters from an audio signal;
  
  performing liveness verification by determining, based on a first plurality of parameters and a liveness data model, whether the audio signal corresponds to a first audio type indicating spoken speech or a second audio type indicating playback of recorded speech;
  
  performing user verification by determining, based on a second plurality of parameters and a user speech model, whether the audio signal corresponds to speech of a particular user associated with the user speech model;
  
  performing keyword verification by determining, based on a third plurality of parameters and a keyword data model, whether the audio signal corresponds to a particular keyword, wherein the set of parameters includes the first plurality of parameters, the second plurality of parameters, and the third plurality of parameters;
  
  generating an output indicating that user authentication is successful in response to determining that the audio signal corresponds to the speech of the particular user, to the particular keyword, and to the first audio type; and
  
  updating the user speech model based on the set of parameters in response to determining that the user authentication is successful.
- Dependent Claims (21, 22, 23, 24, 25)
  - 21. The computer-readable storage device of claim 20, wherein the liveness data model is trained based on a first set of recordings corresponding to spoken speech and a second set of recordings corresponding to played-back speech.
  - 22. The computer-readable storage device of claim 20, wherein the user speech model is trained based on input audio signals associated with the particular user.
  - 23. The computer-readable storage device of claim 20, wherein the first plurality of parameters is the same as the second plurality of parameters, and wherein the second plurality of parameters is the same as the third plurality of parameters.
  - 24. The computer-readable storage device of claim 20, wherein the operations further comprise:
    - determining that the user authentication is to be performed in response to determining that the audio signal corresponds to the particular keyword; and
      
      in response to determining that the user authentication is to be performed, performing the liveness verification and performing the user verification.
  - 25. The computer-readable storage device of claim 20, wherein the user speech model is distinct from the liveness data model.
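
The last operation of claim 20, updating the user speech model only after a successful authentication, can be illustrated as follows. The claim does not say how the model is updated, so this sketch assumes a running mean of the parameters; the class, the function name, and the numbers are hypothetical.

```python
from typing import List, Sequence


class UserSpeechModel:
    """Toy user model: running mean of parameters from accepted utterances."""

    def __init__(self, enrolled: Sequence[float]) -> None:
        self.mean: List[float] = list(enrolled)
        self.count = 1

    def update(self, params: Sequence[float]) -> None:
        # Incremental mean: new_mean = mean + (value - mean) / count
        self.count += 1
        for i, value in enumerate(params):
            self.mean[i] += (value - self.mean[i]) / self.count


def finish_authentication(model: UserSpeechModel, params: Sequence[float],
                          success: bool) -> bool:
    """Adapt the user model only when authentication succeeded."""
    if success:
        model.update(params)
    return success


model = UserSpeechModel([1.0, 2.0])
finish_authentication(model, [3.0, 4.0], True)    # accepted: model adapts
finish_authentication(model, [9.0, 9.0], False)   # rejected: model unchanged
```

Gating the update on success keeps rejected audio, such as a replay or another speaker, from drifting the stored model.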

26. An apparatus comprising:
- means for generating an output signal responsive to receiving an audio signal;
  
  means for extracting a set of parameters from the output signal;
  
  means for performing liveness verification by determining, based on a first plurality of parameters and a liveness data model, whether the audio signal corresponds to a first audio type indicating spoken speech or a second audio type indicating playback of recorded speech;
  
  means for performing user verification by determining, based on a second plurality of parameters and a user speech model, whether the audio signal corresponds to speech of a particular user associated with the user speech model, wherein the user speech model is distinct from the liveness data model;
  
  means for performing keyword verification by determining, based on a third plurality of parameters and a keyword data model, whether the audio signal corresponds to a particular keyword, wherein the set of parameters includes the first plurality of parameters, the second plurality of parameters, and the third plurality of parameters; and
  
  means for generating an output indicating that user authentication is successful in response to determining that the audio signal corresponds to the speech of the particular user, to the particular keyword, and to the first audio type.
- Dependent Claims (27)
  - 27. The apparatus of claim 26, wherein the means for generating the output signal, the means for extracting the set of parameters, the means for performing the liveness verification, the means for performing the user verification, the means for performing the keyword verification, and the means for generating the output are integrated into a communication device, a personal digital assistant (PDA), a computer, a music player, a video player, an entertainment unit, a navigation device, a mobile device, a fixed location data unit, or a set top box.

28. A device comprising:
- a processor configured to:
  
  extract a set of parameters from an audio signal;
  
  perform liveness verification by determining, based on a first plurality of parameters and a liveness data model, whether the audio signal corresponds to a first audio type indicating spoken speech or a second audio type indicating playback of recorded speech;
  
  perform user verification by determining, based on a second plurality of parameters and a user speech model, whether the audio signal corresponds to speech of a particular user associated with the user speech model;
  
  perform keyword verification by determining, based on a third plurality of parameters and a keyword data model, whether the audio signal corresponds to a particular keyword, wherein the set of parameters includes the first plurality of parameters, the second plurality of parameters, and the third plurality of parameters;
  
  generate an output indicating that user authentication is successful in response to determining that the audio signal corresponds to the speech of the particular user, to the particular keyword, and to the first audio type; and
  
  generate a second output indicating that the user authentication failed in response to determining that the audio signal corresponds to the second audio type, wherein the second output is generated independently of performing the keyword verification and the user verification.

29. A device comprising:
- a processor configured to:
  
  extract a set of parameters from an audio signal;
  
  perform liveness verification by determining, based on a first plurality of parameters and a liveness data model, whether the audio signal corresponds to a first audio type indicating spoken speech or a second audio type indicating playback of recorded speech, wherein the liveness data model is user-independent;
  
  perform user verification by determining, based on a second plurality of parameters and a user speech model, whether the audio signal corresponds to speech of a particular user associated with the user speech model;
  
  perform keyword verification by determining, based on a third plurality of parameters and a keyword data model, whether the audio signal corresponds to a particular keyword, wherein the set of parameters includes the first plurality of parameters, the second plurality of parameters, and the third plurality of parameters; and
  
  generate an output indicating that user authentication is successful in response to determining that the audio signal corresponds to the speech of the particular user, to the particular keyword, and to the first audio type.

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: Pendyala, Bhaskara Ramudu; Kadiyala, Pavan Kumar
Primary Examiner(s): Islam, Mohammad K

Application Number: US15/942,196
Publication Number: US 20190304472A1
Time in Patent Office: 858 Days
CPC Class Codes

G06F 21/32   using biometric data, e.g. ...

G06N 20/00   Machine learning

G06N 7/01   Probabilistic graphical mod...

G06V 40/40   Spoof detection, e.g. liven...

G10L 17/00   Speaker identification or v...

G10L 17/24   the user being prompted to ...

G10L 17/26   Recognition of special voic...

G10L 25/51   for comparison or discrimin...
