INDIVIDUALIZED HOTWORD DETECTION MODELS

US 20170025125A1
Filed: 07/22/2015
Published: 01/26/2017
Est. Priority Date: 07/22/2015
Status: Active Grant

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating individualized hotword detection models. In one aspect, a method includes actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.
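
The selection flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Candidate` record, the fixed-length feature vectors, and the use of cosine similarity are all assumptions, since the abstract does not name a feature representation or a similarity measure. The sketch stops at assembling training data and does not train a detection model.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Sequence


@dataclass
class Candidate:
    """Hypothetical record: one stored hotword utterance from another user."""
    speaker_id: str
    features: Sequence[float]  # assumed fixed-length acoustic feature vector


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_candidates(
    enrollment: Sequence[float],
    candidates: List[Candidate],
    threshold: float,
) -> List[Candidate]:
    """Keep the candidates whose similarity score meets the threshold."""
    return [
        c for c in candidates
        if cosine_similarity(enrollment, c.features) >= threshold
    ]


def build_training_set(
    enrollment: Sequence[float],
    candidates: List[Candidate],
    threshold: float,
) -> List[Sequence[float]]:
    """Combine the enrollment data with the selected subset of candidates."""
    selected = select_candidates(enrollment, candidates, threshold)
    return [enrollment] + [c.features for c in selected]
```

The list returned by `build_training_set` would then be the input to whatever training procedure produces the detection model.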

20 Claims

1. A computer-implemented method comprising:
- obtaining enrollment acoustic data representing an utterance of a particular, predefined hotword that was spoken by a user during an enrollment process associated with a mobile device;
  
  obtaining a set of candidate acoustic data representing utterances that were previously spoken by other users, wherein the utterances are of the same, particular, predefined hotword that was spoken by the user during the enrollment process associated with the mobile device;
  
  determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, wherein the similarity score is associated with the candidate acoustic data;
  
  determining, for each candidate acoustic data of the set of candidate acoustic data, whether the similarity score associated with the candidate acoustic data satisfies a threshold similarity score;
  
  selecting a subset of candidate acoustic data from the set of candidate acoustic data, in response to determining that the similarity score associated with the candidate acoustic data satisfies the threshold similarity score;
  
  generating a neural network-based, hotword detection model using the enrollment acoustic data and the selected subset of candidate acoustic data; and
  
  providing the neural network-based, hotword detection model for use in detecting an utterance of the particular, predefined hotword that is subsequently spoken by the user.
  - 2. The method of claim 1, wherein obtaining enrollment acoustic data representing an utterance of a particular, predefined hotword that was spoken by a user during an enrollment process associated with a mobile device comprises:
    - obtaining enrollment acoustic data for multiple utterances of the particular, predefined hotword spoken by the user.
  - 3. The method of claim 1, wherein obtaining a set of candidate acoustic data representing utterances of the same, particular, predefined hotword that was previously spoken by other users comprises:
    - determining the utterance is of the particular, predefined hotword; and
      
      identifying candidate acoustic data representing utterances of the particular, predefined hotword spoken by other users.
  - 4. The method of claim 1, wherein determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data comprises:
    - determining an acoustic distance between the enrollment acoustic data and the candidate acoustic data; and
      
      determining the similarity score based on the acoustic distance.
  - 5. The method of claim 1, wherein determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data comprises:
    - determining the similarity score based on demographic information of the other user that spoke the utterance represented by the candidate acoustic data and demographic information of the user that spoke the enrollment utterance.
  - 6. The method of claim 1, wherein selecting a subset of candidate acoustic data from the set of candidate acoustic data includes selecting a predetermined number of candidate acoustic data.
  - 8. The method of claim 1, comprising:
    - detecting an utterance of the particular, predefined hotword using the neural network-based, hotword detection model.
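
Claims 4 and 6 can be illustrated with a short sketch. It assumes Euclidean distance over fixed-length feature vectors as the acoustic distance and maps distance to similarity with 1 / (1 + d); the claims specify neither choice, and the function names are hypothetical. For brevity it shows only the "predetermined number" selection of claim 6, without the threshold test of claim 1.

```python
from math import sqrt
from typing import List, Sequence


def acoustic_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_from_distance(distance: float) -> float:
    """Map a non-negative distance to a score in (0, 1]; smaller distance, higher score."""
    return 1.0 / (1.0 + distance)


def select_top_n(
    enrollment: Sequence[float],
    candidates: List[Sequence[float]],
    n: int,
) -> List[int]:
    """Return the indices of the n most similar candidates, best first.

    Ties are broken by the candidate's original position.
    """
    scored = [
        (similarity_from_distance(acoustic_distance(enrollment, c)), i)
        for i, c in enumerate(candidates)
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:n]]
```

Fixing the number of selected candidates keeps the training set size predictable regardless of how many stored utterances happen to resemble the user's voice.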

7. (canceled)

9. A system comprising:
- one or more computers; and
  
  one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
  
  obtaining enrollment acoustic data representing an utterance of a particular, predefined hotword that was spoken by a user during an enrollment process associated with a mobile device;
  
  obtaining a set of candidate acoustic data representing utterances that were previously spoken by other users, wherein the utterances are of the same, particular, predefined hotword that was spoken by the user during the enrollment process associated with the mobile device;
  
  determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, wherein the similarity score is associated with the candidate acoustic data;
  
  determining, for each candidate acoustic data of the set of candidate acoustic data, whether the similarity score associated with the candidate acoustic data satisfies a threshold similarity score;
  
  selecting a subset of candidate acoustic data from the set of candidate acoustic data in response to determining that the similarity score associated with the candidate acoustic data satisfies the threshold similarity score;
  
  generating a neural network-based, hotword detection model using the enrollment acoustic data and the selected subset of candidate acoustic data; and
  
  providing the neural network-based, hotword detection model for use in detecting an utterance of the particular, predefined hotword that is subsequently spoken by the user.
  - 10. The system of claim 9, wherein obtaining enrollment acoustic data representing an utterance of a particular, predefined hotword that was spoken by a user during an enrollment process associated with a mobile device comprises:
    - obtaining enrollment acoustic data for multiple utterances of the particular, predefined hotword spoken by the user.
  - 11. The system of claim 9, wherein obtaining a set of candidate acoustic data representing utterances of the same, particular, predefined hotword that was previously spoken by other users comprises:
    - determining the utterance is of the particular, predefined hotword; and
      
      identifying candidate acoustic data representing utterances of the particular, predefined hotword spoken by other users.
  - 12. The system of claim 9, wherein determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data comprises:
    - determining an acoustic distance between the enrollment acoustic data and the candidate acoustic data; and
      
      determining the similarity score based on the acoustic distance.
  - 13. The system of claim 9, wherein determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data comprises:
    - determining the similarity score based on demographic information of the other user that spoke the utterance represented by the candidate acoustic data and demographic information of the user that spoke the enrollment utterance.
  - 14. The system of claim 9, wherein selecting a subset of candidate acoustic data from the set of candidate acoustic data includes selecting a predetermined number of candidate acoustic data.
  - 16. The system of claim 9, the operations comprising:
    - detecting an utterance of the particular, predefined hotword using the neural network-based, hotword detection model.
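
The demographic-based scoring of claims 5 and 13 might look something like the sketch below. The attribute keys (`age_range`, `native_language`), the matching rule, and the weighted blend with an acoustic score are all illustrative assumptions; the claims state only that the similarity score is based on demographic information of both speakers.

```python
from typing import Mapping


def demographic_similarity(
    user: Mapping[str, str],
    other: Mapping[str, str],
) -> float:
    """Fraction of shared demographic attributes on which the two speakers agree.

    Returns 0.0 when the two records have no attribute in common.
    """
    shared = set(user) & set(other)
    if not shared:
        return 0.0
    matches = sum(1 for key in shared if user[key] == other[key])
    return matches / len(shared)


def combined_score(acoustic: float, demographic: float, weight: float = 0.5) -> float:
    """Blend an acoustic score with a demographic score (hypothetical weighting)."""
    return (1.0 - weight) * acoustic + weight * demographic
```

For example, two speakers who share an age range but not a native language would score 0.5 under this rule.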

15. (canceled)

17. A non-transitory computer-readable medium storing instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
- obtaining enrollment acoustic data representing an utterance of a particular, predefined hotword that was spoken by a user during an enrollment process associated with a mobile device;
  
  obtaining a set of candidate acoustic data representing utterances that were previously spoken by other users, wherein the utterances are of the same, particular, predefined hotword that was spoken by the user during the enrollment process associated with the mobile device;
  
  determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, wherein the similarity score is associated with the candidate acoustic data;
  
  selecting a subset of candidate acoustic data from the set of candidate acoustic data in response to determining that the similarity score associated with the candidate acoustic data satisfies a threshold similarity score;
  
  generating a neural network-based, hotword detection model using the enrollment acoustic data and the selected subset of candidate acoustic data; and
  
  providing the neural network-based, hotword detection model for use in detecting an utterance of the particular, predefined hotword that is subsequently spoken by the user.
  - 18. The medium of claim 17, wherein obtaining enrollment acoustic data representing an utterance of a particular, predefined hotword that was spoken by a user during an enrollment process associated with a mobile device comprises:
    - obtaining enrollment acoustic data for multiple utterances of the particular, predefined hotword spoken by the user.
  - 19. The medium of claim 17, wherein obtaining a set of candidate acoustic data representing utterances of the same, particular, predefined hotword that was previously spoken by other users comprises:
    - determining the utterance is of the particular, predefined hotword; and
      
      identifying candidate acoustic data representing utterances of the particular, predefined hotword spoken by other users.
  - 20. The medium of claim 17, wherein determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data comprises:
    - determining an acoustic distance between the enrollment acoustic data and the candidate acoustic data; and
      
      determining the similarity score based on the acoustic distance.
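
When enrollment yields multiple utterances, as in claims 2, 10, and 18, one plausible way to score a candidate is to average its similarity across all of them. The sketch below assumes Euclidean distance and the 1 / (1 + d) mapping; averaging is an illustrative choice, not something the claims prescribe.

```python
from math import sqrt
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed acoustic distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_similarity(
    enrollment_utterances: List[Sequence[float]],
    candidate: Sequence[float],
) -> float:
    """Average distance-based similarity of one candidate to every enrollment utterance."""
    if not enrollment_utterances:
        raise ValueError("at least one enrollment utterance is required")
    scores = [1.0 / (1.0 + euclidean(e, candidate)) for e in enrollment_utterances]
    return sum(scores) / len(scores)
```

Averaging makes the score less sensitive to a single atypical enrollment recording.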

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google Inc. (Alphabet Inc.)
Inventors: Alvarez Guevara, Raziel
Granted Patent: US 10,438,593 B2
US Class Current: 1/1
CPC Class Codes

G10L 15/02   Feature extraction for spee...

G10L 15/063   Training

G10L 15/07   to the speaker

G10L 15/075   supervised, i.e. under mach...

G10L 15/1815   Semantic context, e.g. disa...

G10L 17/04   Training, enrolment or mode...

G10L 17/06   Decision making techniques;...

G10L 17/08   Use of distortion metrics o...

G10L 17/18   Artificial neural networks;...

G10L 17/24   the user being prompted to ...

G10L 2015/0638   Interactive procedures

G10L 2015/088   Word spotting
