Speaker verification using co-location information

US 10,460,735 B2
Filed: 10/26/2018
Issued: 10/29/2019
Est. Priority Date: 07/18/2014
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying a user in a multi-user environment. One of the methods includes receiving, by a first user device, an audio signal encoding an utterance, obtaining, by the first user device, a first speaker model for a first user of the first user device, obtaining, by the first user device for a second user of a second user device that is co-located with the first user device, a second speaker model for the second user or a second score that indicates a respective likelihood that the utterance was spoken by the second user, and determining, by the first user device, that the utterance was spoken by the first user using (i) the first speaker model and the second speaker model or (ii) the first speaker model and the second score.
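
The decision described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes speaker models and utterances are plain embedding vectors, uses cosine similarity as the score, and accepts the first user only when that user's score beats every score derived from co-located devices. The abstract does not say how the models or scores are combined, so the comparison rule and the names `cosine` and `spoken_by_first_user` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed scoring function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def spoken_by_first_user(utterance, first_model, second_models=(), second_scores=()):
    """Return True when the first user's score exceeds every co-located rival score.

    second_models: speaker models received from co-located devices, option (i).
    second_scores: scores already computed by co-located devices, option (ii).
    """
    first_score = cosine(utterance, first_model)
    # Score the utterance against each received model, then add the received scores.
    rivals = [cosine(utterance, model) for model in second_models]
    rivals.extend(second_scores)
    return all(first_score > rival for rival in rivals)
```

Passing models corresponds to option (i) in the abstract and passing precomputed scores corresponds to option (ii); with no co-located input the function trivially returns True, so a real system would presumably also apply an absolute threshold.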

22 Claims

1. A method comprising:
- receiving, at data processing hardware, audio data corresponding to an utterance of a voice command captured by a user device, the user device having a plurality of different users, each user of the plurality of different users having different corresponding user permissions to access a plurality of applications on the user device;
- for each user of the plurality of different users of the user device:
  - obtaining, by the data processing hardware, corresponding speaker verification data from memory hardware in communication with the data processing hardware; and
  - generating, by the data processing hardware, a corresponding speaker verification score by comparing the corresponding speaker verification data and the audio data, the corresponding speaker verification score indicating a likelihood that the utterance of the voice command was spoken by the corresponding user of the plurality of different users of the user device;
- identifying, by the data processing hardware, a speaker of the utterance of the voice command as the user of the plurality of different users of the user device associated with a highest corresponding speaker verification score; and
- processing, by the data processing hardware, the voice command using a speech recognition module to identify a particular action for the user device to execute, the particular action, when executed by the user device, launching a particular application of the plurality of applications on the user device based on the corresponding user permissions associated with the identified speaker to access the application.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. The method of claim 1, wherein obtaining the corresponding speaker verification data comprises obtaining a corresponding speaker verification model for each user of the plurality of different users of the user device.
  - 3. The method of claim 2, wherein at least one of the corresponding speaker verification models comprises an i-vector speaker verification model.
  - 4. The method of claim 2, wherein at least one of the corresponding speaker verification models comprises a d-vector speaker verification model.
  - 5. The method of claim 1, wherein receiving the audio data corresponding to the utterance of the voice command comprises receiving the audio data corresponding to the utterance of the voice command that is preceded by a particular, predefined hotword captured by the user device while in a locked state.
  - 6. The method of claim 5, wherein the user device is configured to respond to voice commands while in the locked state upon receipt of the particular, predefined hotword.
  - 7. The method of claim 1, further comprising, for each user of the plurality of different users of the user device:
    - receiving, at the data processing hardware, one or more sample utterances from the corresponding user of the plurality of different users of the user device during an enrollment process;
    - generating, by the data processing hardware, the corresponding speaker verification data for the corresponding user of the plurality of different users of the user device based on the one or more sample utterances received from the corresponding user of the plurality of different users of the user device; and
    - storing, by the data processing hardware, the corresponding speaker verification data in the memory hardware in communication with the data processing hardware.
  - 8. The method of claim 7, further comprising, associating, by the data processing hardware, the corresponding speaker verification data stored in the memory hardware with a corresponding user identifier associated with the corresponding user of the plurality of different users of the user device.
  - 9. The method of claim 1, further comprising, prior to identifying the speaker of the utterance of the voice command, determining, by the data processing hardware, that the highest corresponding speaker verification score satisfies an acceptance threshold.
  - 10. The method of claim 1, wherein the data processing hardware resides on a server in communication with the user device.
  - 11. The method of claim 1, wherein the data processing hardware resides on the user device.
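
The flow recited in claims 1 and 9 can be sketched as follows. This is a hedged illustration, not the claimed implementation: it assumes the audio has already been reduced to an embedding vector, that the speaker verification data is a stored vector per user, that cosine similarity serves as the verification score, and that a speech recognizer (not shown) has already mapped the command to a requested application name. The names `EnrolledUser`, `identify_speaker`, `handle_command`, and the 0.7 threshold are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class EnrolledUser:
    user_id: str
    voiceprint: list  # stand-in for the stored speaker verification data
    permitted_apps: set = field(default_factory=set)


def similarity(a, b):
    """Cosine similarity, used here as an assumed verification score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify_speaker(audio_embedding, users, acceptance_threshold=0.7):
    """Score every enrolled user and return the best one, or None if below threshold."""
    if not users:
        return None
    scores = {u.user_id: similarity(u.voiceprint, audio_embedding) for u in users}
    best = max(users, key=lambda u: scores[u.user_id])
    # Claim 9: the highest score must satisfy an acceptance threshold first.
    if scores[best.user_id] < acceptance_threshold:
        return None
    return best


def handle_command(audio_embedding, requested_app, users):
    """Launch the requested app only if the identified speaker is permitted to use it."""
    speaker = identify_speaker(audio_embedding, users)
    if speaker is None:
        return "rejected: no enrolled speaker matched"
    if requested_app not in speaker.permitted_apps:
        return f"denied: {speaker.user_id} may not open {requested_app}"
    return f"launch {requested_app} for {speaker.user_id}"
```

A short usage example with two hypothetical users:

```python
users = [
    EnrolledUser("alice", [1.0, 0.0], {"email", "camera"}),
    EnrolledUser("bob", [0.0, 1.0], {"camera"}),
]
print(handle_command([0.9, 0.1], "email", users))
print(handle_command([0.1, 0.9], "email", users))
```

Whether this logic runs on a server or on the device itself (claims 10 and 11) does not change the sketch.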

12. A system comprising:
- data processing hardware; and
- memory hardware in communication with the data processing hardware and storing instructions, that when executed by the data processing hardware, cause the data processing hardware to perform operations comprising:
  - receiving audio data corresponding to an utterance of a voice command captured by a user device, the user device having a plurality of different users, each user of the plurality of different users having different corresponding user permissions to access a plurality of applications on the user device;
  - for each user of the plurality of different users of the user device:
    - obtaining corresponding speaker verification data from the memory hardware; and
    - generating a corresponding speaker verification score by comparing the corresponding speaker verification data and the audio data, the corresponding speaker verification score indicating a likelihood that the utterance of the voice command was spoken by the corresponding user of the plurality of different users of the user device;
  - identifying a speaker of the utterance of the voice command as the user of the plurality of different users of the user device associated with a highest corresponding speaker verification score; and
  - processing the voice command using a speech recognition module to identify a particular action for the user device to execute, the particular action, when executed by the user device, launching a particular application of the plurality of applications on the corresponding user device based on user permissions associated with the identified speaker to access the application.
- Dependent claims (13, 14, 15, 16, 17, 18, 19, 20, 21, 22)
  - 13. The system of claim 12, wherein obtaining the corresponding speaker verification data comprises obtaining a corresponding speaker verification model for each user of the plurality of different users of the user device.
  - 14. The system of claim 13, wherein at least one of the corresponding speaker verification models comprises an i-vector speaker verification model.
  - 15. The system of claim 13, wherein at least one of the corresponding speaker verification models comprises a d-vector speaker verification model.
  - 16. The system of claim 12, wherein receiving the audio data corresponding to the utterance of the voice command comprises receiving the audio data corresponding to the utterance of the voice command that is preceded by a particular, predefined hotword captured by the user device while in a locked state.
  - 17. The system of claim 16, wherein the user device is configured to respond to voice commands while in the locked state upon receipt of the particular, predefined hotword.
  - 18. The system of claim 12, wherein the operations further comprise, for each user of the plurality of different users of the user device:
    - receiving one or more sample utterances from the corresponding user of the plurality of different users of the user device during an enrollment process;
    - generating the corresponding speaker verification data for the corresponding user of the plurality of different users of the user device based on the one or more sample utterances received from the corresponding user of the plurality of different users of the user device; and
    - storing the corresponding speaker verification data in the memory hardware.
  - 19. The system of claim 18, wherein the operations further comprise, associating the corresponding speaker verification data stored in the memory hardware with a corresponding user identifier associated with the corresponding user of the plurality of different users of the user device.
  - 20. The system of claim 12, wherein the operations further comprise, prior to identifying the speaker of the utterance of the voice command, determining that the highest corresponding speaker verification score satisfies an acceptance threshold.
  - 21. The system of claim 12, wherein the data processing hardware resides on a server in communication with the user device.
  - 22. The system of claim 12, wherein the data processing hardware resides on the user device.
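
The enrollment process of claims 7, 8, 18, and 19 can be sketched as follows. The claims do not specify how speaker verification data is derived from the sample utterances; this sketch assumes each sample has already been converted to a fixed-length embedding and simply averages them, then stores the result under a user identifier. The function name `enroll` and the dictionary used as memory are hypothetical.

```python
def enroll(store, user_id, sample_embeddings):
    """Average the sample embeddings and store the result keyed by user identifier.

    store: a dict standing in for the memory hardware.
    sample_embeddings: one equal-length vector per enrollment utterance (assumed).
    """
    if not sample_embeddings:
        raise ValueError("at least one sample utterance is required")
    dim = len(sample_embeddings[0])
    if any(len(sample) != dim for sample in sample_embeddings):
        raise ValueError("all sample embeddings must have the same length")
    count = len(sample_embeddings)
    # Element-wise mean across all enrollment samples.
    voiceprint = [sum(sample[i] for sample in sample_embeddings) / count for i in range(dim)]
    store[user_id] = voiceprint
    return voiceprint
```

Keying the store by user identifier mirrors the association step in claims 8 and 19; a production system would more likely train an i-vector or d-vector model (claims 3, 4, 14, 15) than average raw vectors.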

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google LLC (Alphabet Inc.)
Inventors: Guevara, Raziel Alvarez; Hansson, Othar
Primary Examiner(s): Vo, Huyen X
Application Number: US 16/172,221
Publication Number: US 20190074017A1
Time in Patent Office: 368 Days
Field of Search: 704/1-10, 704/230-257, 704/270-277
CPC Class Codes

G06F 21/32   using biometric data, e.g. ...

G06F 2221/2111   Location-sensitive, e.g. ge...

G10L 15/08   Speech classification or se...

G10L 15/18   using natural language mode...

G10L 17/00   Speaker identification or v...

G10L 17/20   Pattern transformations or ...

G10L 17/22   Interactive procedures; Man...

G10L 17/24   the user being prompted to ...

G10L 19/00   Speech or audio signals ana...

G10L 2015/088   Word spotting

G10L 2015/223   Execution procedure of a sp...

H04L 63/0861   using biometrical features,...

H04W 12/06   Authentication
