Speaker verification using co-location information

US 10,147,429 B2
Filed: 09/06/2017
Issued: 12/04/2018
Est. Priority Date: 07/18/2014
Status: Active Grant


Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying a user in a multi-user environment. One of the methods includes receiving, by a first user device, an audio signal encoding an utterance, obtaining, by the first user device, a first speaker model for a first user of the first user device, obtaining, by the first user device for a second user of a second user device that is co-located with the first user device, a second speaker model for the second user or a second score that indicates a respective likelihood that the utterance was spoken by the second user, and determining, by the first user device, that the utterance was spoken by the first user using (i) the first speaker model and the second speaker model or (ii) the first speaker model and the second score.
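As a rough illustration of option (ii) in the abstract, the sketch below decides whether the first user spoke the utterance by weighing the first user's score against scores reported for co-located users. The function name, the normalization by the sum of all scores, and the 0.5 threshold are assumptions made for illustration; the abstract does not specify how the scores are combined.

```python
from typing import Sequence


def spoken_by_first_user(
    first_score: float,
    colocated_scores: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the patent
) -> bool:
    """Return True if the first user is the most likely speaker.

    Scores are assumed to be non-negative, likelihood-like values. The first
    user's score is normalized against all candidate scores, so a strong
    match for a co-located user lowers the first user's share.
    """
    total = first_score + sum(colocated_scores)
    if total <= 0:
        return False  # no candidate matched the utterance at all
    share = first_score / total
    is_best = all(first_score > s for s in colocated_scores)
    return is_best and share >= threshold


if __name__ == "__main__":
    print(spoken_by_first_user(0.8, [0.1]))  # first user clearly dominates
    print(spoken_by_first_user(0.6, [0.9]))  # a co-located user matches better
```

Normalizing against co-located users is one way to let a neighbor's strong match veto a marginal match for the device owner; other combination rules would fit the abstract equally well.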

15 Claims

1. A system comprising:
- a first computing device configured to respond to voice commands while in a locked state upon receipt of a particular, predefined hotword and one or more storage devices storing instructions that are operable, when executed by the first computing device, to cause the first computing device to perform operations comprising:
  
  receiving, while the first computing device is in the locked state and is co-located with a second computing device configured to respond to voice commands that are preceded by the particular, predefined hotword, audio data that corresponds to an utterance of a voice command that is preceded by the particular, predefined hotword;
  
  transmitting, by the first computing device while in the locked state, a first message that includes speaker verification data to a server that receives speaker verification data from multiple co-located devices, including the first computing device and the second computing device, and uses the received speaker verification data to generate a first speaker verification score representing a likelihood a first user of the first computing device spoke the voice command;
  
  receiving, by the first computing device while in the locked state, a second message from the server indicating that the first user of the first computing device did not likely speak the voice command based on the first speaker verification score; and
  
  in response to receiving the second message, determining to remain in the locked state and not respond to the voice command despite receiving the audio data that corresponds to the utterance of the voice command that is preceded by the particular, predefined hotword.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The system of claim 1, wherein:
    - the server further uses the received speaker verification data to generate a second speaker verification score representing a likelihood that a second user of the second computing device spoke the voice command; and
      
      receiving the second message indicating that the first user of the first computing device did not likely speak the voice command is further based on the second speaker verification score.
  - 3. The system of claim 1, wherein the operations further comprise obtaining a value for a setting that indicates that the first computing device is permitted to provide speaker verification data to other computing devices, wherein transmitting the first message that includes speaker verification data to the server is based on the obtained value for the setting that indicates that the first computing device is permitted to share speaker verification data with other computing devices.
  - 4. The system of claim 1, wherein transmitting the first message that includes the speaker verification data to the server comprises transmitting the first message that includes a first speaker verification model for the first user of the first computing device.
  - 5. The system of claim 1, wherein transmitting the first message that includes the speaker verification data to the server is responsive to receiving the audio data that corresponds to the utterance.
  - 6. The system of claim 1, wherein the operations further comprise determining that the second computing device is co-located with the first computing device, wherein transmitting the first message that includes the speaker verification data to the server is responsive to determining that the second computing device is co-located with the first computing device.
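The device-side flow recited in claims 1, 3, and 6 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class names, the `send_to_server` callable, the string return values, and the behavior on the "handled_locally" and "responded" paths are all hypothetical, since claim 1 only recites the case in which the device stays locked and does not respond.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VerificationReply:
    """Hypothetical shape of the 'second message' returned by the server."""
    first_user_likely_spoke: bool


@dataclass
class Device:
    sharing_permitted: bool          # setting described in claim 3
    colocated_device_present: bool   # determination described in claim 6
    locked: bool = True

    def on_hotword_utterance(
        self,
        verification_data: bytes,
        send_to_server: Callable[[bytes], VerificationReply],
    ) -> str:
        """Handle an utterance preceded by the hotword; return the action taken."""
        if not (self.sharing_permitted and self.colocated_device_present):
            # Assumption: without permission or a co-located device, the
            # server round trip is skipped. The claims do not cover this path.
            return "handled_locally"
        reply = send_to_server(verification_data)  # the 'first message'
        if not reply.first_user_likely_spoke:
            # Remain locked and ignore the command despite hearing the hotword.
            return "ignored"
        # Assumption: the positive case is not recited in claim 1.
        return "responded"


if __name__ == "__main__":
    device = Device(sharing_permitted=True, colocated_device_present=True)
    action = device.on_hotword_utterance(
        b"sample", lambda data: VerificationReply(first_user_likely_spoke=False)
    )
    print(action, device.locked)
```

Passing the server call in as a callable keeps the sketch runnable without a network and makes the lock-preserving branch easy to exercise.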

7. A computer-implemented method comprising:
- receiving, by a first computing device configured to respond to voice commands while in a locked state upon receipt of a particular, predefined hotword and while the first computing device is in a locked state and is co-located with a second computing device configured to respond to voice commands that are preceded by the predefined hotword, audio data that corresponds to an utterance of a voice command that is preceded by the particular, predefined hotword;
  
  transmitting, by the first computing device while in the locked state, a first message that includes speaker verification data to a server that receives speaker verification data from multiple co-located devices, including the first computing device and the second computing device, and uses the received speaker verification data to generate a first speaker verification score representing a likelihood a first user of the first computing device spoke the voice command;
  
  receiving, by the first computing device while in the locked state, a second message from the server indicating that the first user of the first computing device did not likely speak the voice command based on the first speaker verification score; and
  
  in response to receiving the second message, determining to remain in the locked state and not respond to the voice command despite receiving the audio data that corresponds to the utterance of the voice command that is preceded by the particular, predefined hotword.
- Dependent claims (8, 9, 10, 11, 12):
  - 8. The method of claim 7, wherein:
    - the server further uses the received speaker verification data to generate a second speaker verification score representing a likelihood that a second user of the second computing device spoke the voice command; and
      
      receiving the second message indicating that the first user of the first computer device did not likely speak the voice command is further based on the second speaker verification score.
  - 9. The method of claim 7, further comprising obtaining a value for a setting that indicates that the first computing device is permitted to provide speaker verification data to other computing devices, wherein transmitting the first message that includes speaker verification data to the server is based on the obtained value for the setting that indicates that the first computing device is permitted to share speaker verification data with other computing devices.
  - 10. The method of claim 7, wherein transmitting the first message that includes the speaker verification data to the server comprises transmitting the first message that includes a first speaker verification model for the first user of the first computing device.
  - 11. The method of claim 7, wherein transmitting the first message that includes the speaker verification data to the server is responsive to receiving the audio data that corresponds to the utterance.
  - 12. The method of claim 7, further comprising determining that the second computing device is co-located with the first computing device, wherein transmitting the first message that includes the speaker verification data to the server is responsive to determining that the second computing device is co-located with the first computing device.
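On the server side, claims 7, 8, and 10 describe receiving speaker verification data from multiple co-located devices and generating a score per user. The sketch below assumes, purely for illustration, that each device submits a speaker model as a numeric vector and that the utterance is represented by a vector of the same length; cosine similarity and the 0.7 threshold are stand-ins chosen here, not techniques named in the claims.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify(
    utterance: Sequence[float],
    models: Dict[str, Sequence[float]],
    threshold: float = 0.7,  # hypothetical cutoff, not taken from the patent
) -> Dict[str, bool]:
    """Map each device id to whether its user likely spoke the utterance.

    A user is reported as the likely speaker only if that user's score is
    the highest among the co-located devices and meets the threshold.
    """
    if not models:
        return {}
    scores = {dev: cosine(utterance, model) for dev, model in models.items()}
    best = max(scores, key=scores.get)
    return {dev: dev == best and s >= threshold for dev, s in scores.items()}


if __name__ == "__main__":
    result = verify(
        utterance=[1.0, 0.0],
        models={"first": [0.0, 1.0], "second": [1.0, 0.1]},
    )
    print(result)  # the "first" device would be told to stay locked
```

Each boolean in the result corresponds to the content of the second message sent back to that device.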

13. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- receiving, by a first computing device configured to respond to voice commands while in a locked state upon receipt of a particular, predefined hotword and while the first computing device is in a locked state and is co-located with a second computing device configured to respond to voice commands that are preceded by the predefined hotword, audio data that corresponds to an utterance of a voice command that is preceded by the particular, predefined hotword;
  
  transmitting, by the first computing device while in the locked state, a first message that includes speaker verification data to a server that receives speaker verification data from multiple co-located devices, including the first computing device and the second computing device, and uses the received speaker verification data to generate a first speaker verification score representing a likelihood a first user of the first computing device spoke the voice command;
  
  receiving, by the first computing device while in the locked state, a second message from the server indicating that the first user of the first computing device did not likely speak the voice command based on the first speaker verification score; and
  
  in response to receiving the second message, determining to remain in the locked state and not respond to the voice command despite receiving the audio data that corresponds to the utterance of the voice command that is preceded by the particular, predefined hotword.
- Dependent claims (14, 15):
  - 14. The computer-readable medium of claim 13, wherein:
    - the server further uses the received speaker verification data to generate a second speaker verification score representing a likelihood that a second user of the second computing device spoke the voice command; and
      
      receiving the second message indicating that the first user of the first computer device did not likely speak the voice command is further based on the second speaker verification score.
  - 15. The computer-readable medium of claim 13, wherein the operations further comprise obtaining a value for a setting that indicates that the first computing device is permitted to provide speaker verification data to other computing devices, wherein transmitting the first message that includes speaker verification data to the server is based on the obtained value for the setting that indicates that the first computing device is permitted to share speaker verification data with other computing devices.


Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google LLC (Alphabet Inc.)
Inventors: Guevara, Raziel Alvarez; Hansson, Othar
Primary Examiner(s): Vo, Huyen X
Application Number: US 15/697,052
Publication Number: US 20180012604 A1
Time in Patent Office: 454 Days
Field of Search: 704/1-10, 704/230-257, 704/270-275
CPC Class Codes
- G06F 21/32: using biometric data, e.g. ...
- G06F 2221/2111: Location-sensitive, e.g. ge...
- G10L 15/08: Speech classification or se...
- G10L 15/18: using natural language mode...
- G10L 17/00: Speaker identification or v...
- G10L 17/20: Pattern transformations or ...
- G10L 17/22: Interactive procedures; Man...
- G10L 17/24: the user being prompted to ...
- G10L 19/00: Speech or audio signals ana...
- G10L 2015/088: Word spotting
- G10L 2015/223: Execution procedure of a sp...
- H04L 63/0861: using biometrical features,...
- H04W 12/06: Authentication
