INDIVIDUALIZED HOTWORD DETECTION MODELS

US 20170186433A1
Filed: 06/29/2016
Published: 06/29/2017
Est. Priority Date: 07/22/2015
Status: Active Grant

2 Assignments

0 Petitions

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.
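
The scoring and selection steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes each piece of acoustic data has already been reduced to a fixed-length feature vector, it uses cosine similarity as the similarity score (the abstract does not name a metric), and the names `cosine_similarity`, `select_similar_candidates`, and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length feature vectors.

    Assumption: cosine similarity stands in for the unspecified
    "similarity score" of the abstract.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def select_similar_candidates(
    enrollment: Vector, candidates: Sequence[Vector], top_k: int
) -> List[Tuple[int, float]]:
    """Score every candidate against the enrollment vector and keep the
    top_k most similar ones as (candidate_index, score) pairs.

    Assumption: a fixed-size top-k cut is one possible way to select a
    subset "based at least on the similarity scores".
    """
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    scored = [
        (index, cosine_similarity(enrollment, candidate))
        for index, candidate in enumerate(candidates)
    ]
    # Highest score first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

The selected subset would then be passed to whatever training procedure generates the detection model; that step is left out here because the abstract does not describe it.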

17 Citations

21 Claims

1. (canceled)

2. A computer-implemented method comprising:
- during an enrollment process, prompting, by a client device, a user to speak a particular hotword, and receiving, by the client device, audio data corresponding to only a single utterance of the particular hotword by the user;
  
  in response to receiving the audio data corresponding to only the single utterance of the particular hotword by the user, obtaining, by the client device, a personalized hotword detection model, wherein the personalized hotword detection model is trained to detect a likely utterance of the particular hotword by the user using the audio data corresponding to only the single utterance of the particular hotword by the user and without requiring the user to speak additional utterances of the particular hotword; and
  
  after obtaining the personalized hotword detection model that is trained to detect when the user speaks the particular hotword using the audio data corresponding to only the single utterance of the particular hotword by the user and without requiring the user to speak the additional utterances of the particular hotword, detecting, by the client device, the likely utterance of the particular hotword by the user in subsequently received audio data using the personalized hotword detection model.
- - 3. The method of claim 2, wherein during an enrollment process, prompting the user to speak the particular hotword, and receiving the audio data corresponding to only the single utterance of the particular hotword by the user comprises:
    - prompting the user to speak one or more terms that trigger semantic interpretation of the one or more terms or one or more terms that follow the particular hotword.
  - 4. The method of claim 2, wherein in response to receiving the audio data corresponding to only the single utterance of the particular hotword by the user, obtaining the personalized hotword detection model comprises:
    - generating the personalized hotword detection model after receiving the audio data corresponding to only the single utterance of the particular hotword by the user and without receiving additional audio data corresponding to another utterance of the particular hotword.
  - 5. The method of claim 2, comprising:
    - ending the enrollment process after obtaining the personalized hotword detection model.
  - 6. The method of claim 2, wherein the personalized hotword detection model is based at least on the single utterance and not based on another utterance of the particular hotword.
  - 7. The method of claim 2, wherein using the personalized hotword detection model to detect the likely utterance of the particular hotword in the subsequently received audio data comprises:
    - receiving audio data corresponding to a subsequent utterance; and
      
      determining whether the subsequent utterance likely includes the particular hotword based at least on the personalized hotword detection model.
  - 8. The method of claim 7, comprising:
    - in response to determining whether the subsequent utterance likely includes the particular hotword based at least on the personalized hotword detection model, performing semantic interpretation on at least a portion of the subsequent utterance.
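
The flow recited in claims 2, 7, and 8 (enroll with a single utterance, obtain a personalized model, detect the hotword in later audio, then interpret the utterance) can be sketched in Python as follows. This is only an illustration under stated assumptions: utterances are represented as fixed-length feature vectors, the "model" is a stored template compared by cosine similarity against a threshold, and every name here (`PersonalizedHotwordModel`, `build_template_model`, `enroll_with_single_utterance`, `handle_audio`, the 0.9 threshold) is hypothetical. The claims do not specify how the model is trained.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Vector = Sequence[float]


def _cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass(frozen=True)
class PersonalizedHotwordModel:
    """Hypothetical stand-in for the personalized hotword detection model."""
    template: Tuple[float, ...]
    threshold: float

    def likely_contains_hotword(self, features: Vector) -> bool:
        # Assumption: "likely" is modeled as similarity above a threshold.
        return _cosine(self.template, features) >= self.threshold


def build_template_model(features: Vector) -> PersonalizedHotwordModel:
    """Hypothetical model builder; a real system might train remotely."""
    return PersonalizedHotwordModel(template=tuple(features), threshold=0.9)


def enroll_with_single_utterance(
    utterance_features: Vector,
    build_model: Callable[[Vector], PersonalizedHotwordModel],
) -> PersonalizedHotwordModel:
    """Claim 2: exactly one enrollment utterance is used, and enrollment
    ends once the model is obtained (claim 5)."""
    return build_model(utterance_features)


def handle_audio(
    model: PersonalizedHotwordModel,
    features: Vector,
    interpret: Callable[[Vector], str],
) -> Optional[str]:
    """Claims 7 and 8: run detection on subsequent audio and perform
    semantic interpretation only when the hotword is likely present."""
    if model.likely_contains_hotword(features):
        return interpret(features)
    return None
```

A caller would enroll once, for example `model = enroll_with_single_utterance(features, build_template_model)`, and then pass each later utterance to `handle_audio` together with its own interpretation callback.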

9. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  during an enrollment process, prompting, by a client device, a user to speak a particular hotword, and receiving, by the client device, audio data corresponding to only a single utterance of the particular hotword by the user;
  
  in response to receiving the audio data corresponding to only the single utterance of the particular hotword by the user, obtaining, by the client device, a personalized hotword detection model, wherein the personalized hotword detection model is trained to detect a likely utterance of the particular hotword by the user using the audio data corresponding to only the single utterance of the particular hotword by the user and without requiring the user to speak additional utterances of the particular hotword; and
  
  after obtaining the personalized hotword detection model that is trained to detect when the user speaks the particular hotword using the audio data corresponding to only the single utterance of the particular hotword by the user and without requiring the user to speak the additional utterances of the particular hotword, detecting, by the client device, the likely utterance of the particular hotword by the user in subsequently received audio data using the hotword detection model.
- - 10. The system of claim 9, wherein during an enrollment process, prompting the user to speak the particular hotword, and receiving the audio data corresponding to only the single utterance of the particular hotword by the user comprises:
    - prompting the user to speak one or more terms that trigger semantic interpretation of the one or more terms or one or more terms that follow the particular hotword.
  - 11. The system of claim 9, wherein in response to receiving the audio data corresponding to only the single utterance of the particular hotword by the user, obtaining the personalized hotword detection model comprises:
    - generating the personalized hotword detection model after receiving the audio data corresponding to only the single utterance of the particular hotword by the user and without receiving additional audio data corresponding to another utterance of the particular hotword.
  - 12. The system of claim 9, the operations comprising:
    - ending the enrollment process after obtaining the personalized hotword detection model.
  - 13. The system of claim 9, wherein the personalized hotword detection model is based at least on the single utterance and not based on another utterance of the particular hotword.
  - 14. The system of claim 9, wherein using the personalized hotword detection model to detect the likely utterance of the particular hotword in the subsequently received audio data comprises:
    - receiving audio data corresponding to a subsequent utterance; and
      
      determining whether the subsequent utterance likely includes the particular hotword based at least on the personalized hotword detection model.
  - 15. The system of claim 14, comprising:
    - in response to determining whether the subsequent utterance likely includes the particular hotword based at least on the personalized hotword detection model, performing semantic interpretation on at least a portion of the subsequent utterance.

16. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- during an enrollment process, prompting, by a client device, a user to speak a particular hotword, and receiving, by the client device, audio data corresponding to only a single utterance of the particular hotword by the user;
  
  in response to receiving the audio data corresponding to only the single utterance of the particular hotword by the user, obtaining, by the client device, a personalized hotword detection model, wherein the personalized hotword detection model is trained to detect a likely utterance of the particular hotword by the user using the audio data corresponding to only the single utterance of the particular hotword by the user and without requiring the user to speak additional utterances of the particular hotword; and
  
  after obtaining the personalized hotword detection model that is trained to detect when the user speaks the particular hotword using the audio data corresponding to only the single utterance of the particular hotword by the user and without requiring the user to speak the additional utterances of the particular hotword, detecting, by the client device, the likely utterance of the particular hotword by the user in subsequently received audio data using the hotword detection model.
- - 17. The medium of claim 16, wherein during an enrollment process, prompting the user to speak the particular hotword, and receiving the audio data corresponding to only the single utterance of the particular hotword by the user comprises:
    - prompting the user to speak one or more terms that trigger semantic interpretation of the one or more terms or one or more terms that follow the particular hotword.
  - 18. The medium of claim 16, wherein in response to receiving the audio data corresponding to only the single utterance of the particular hotword by the user, obtaining the personalized hotword detection model comprises:
    - generating the personalized hotword detection model after receiving the audio data corresponding to only the single utterance of the particular hotword by the user and without receiving additional audio data corresponding to another utterance of the particular hotword.
  - 19. The medium of claim 16, the operations comprising:
    - ending the enrollment process after obtaining the personalized hotword detection model.
  - 20. The medium of claim 16, wherein the personalized hotword detection model is based at least on the single utterance and not based on another utterance of the particular hotword.
  - 21. The medium of claim 16, wherein using the personalized hotword detection model to detect a likely utterance of the particular hotword in subsequently received audio data comprises:
    - receiving audio data corresponding to a subsequent utterance; and
      
      determining whether the subsequent utterance likely includes the particular hotword based at least on the personalized hotword detection model.

Current Assignee
Google LLC (Alphabet Inc.)
Original Assignee
Google Inc. (Alphabet Inc.)
Inventors
Alvarez Guevara, Raziel

Granted Patent

US 10,535,354 B2
CPC Class Codes

G10L 15/02   Feature extraction for spee...

G10L 15/063   Training

G10L 15/07   to the speaker

G10L 15/075   supervised, i.e. under mach...

G10L 15/1815   Semantic context, e.g. disa...

G10L 17/04   Training, enrolment or mode...

G10L 17/06   Decision making techniques;...

G10L 17/08   Use of distortion metrics o...

G10L 17/18   Artificial neural networks;...

G10L 17/24   the user being prompted to ...

G10L 2015/0638   Interactive procedures

G10L 2015/088   Word spotting
