Keyword spotting with competitor models

US 9,159,319 B1
Filed: 12/03/2012
Issued: 10/13/2015
Est. Priority Date: 12/03/2012
Status: Active Grant

Abstract

Keyword spotting may be improved by using a competitor model. In some embodiments, audio data is received by a device. At least a portion of the audio data may be compared with a keyword model to obtain a first score. The keyword model may model a keyword. The portion of the audio data may also be compared with a competitor model to obtain a second score. The competitor model may model a competitor word, which may be a word that is similar to the keyword. The device may compare the first score and the second score to determine if a keyword is spoken.
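
The comparison described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each model is a single diagonal-covariance Gaussian over feature vectors and that a score is an average per-frame log-likelihood; the function names and the toy model parameters are hypothetical and do not come from the patent.

```python
import math

def diag_gaussian_loglik(frame, mean, var):
    """Log-density of one feature vector under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )

def score(frames, model):
    """Average per-frame log-likelihood of the frames under a toy model."""
    mean, var = model
    return sum(diag_gaussian_loglik(f, mean, var) for f in frames) / len(frames)

def keyword_spoken(frames, keyword_model, competitor_model):
    """Compare the first (keyword) score with the second (competitor) score."""
    first = score(frames, keyword_model)
    second = score(frames, competitor_model)
    return first > second

# Hypothetical 2-D models: (mean vector, variance vector).
keyword_model = ([0.0, 0.0], [1.0, 1.0])
competitor_model = ([3.0, 3.0], [1.0, 1.0])

near_keyword = [[0.1, -0.2], [0.3, 0.1]]
near_competitor = [[2.9, 3.2], [3.1, 2.8]]
```

Audio that resembles the competitor word scores higher under the competitor model, so it is rejected even when its keyword score alone might have looked acceptable.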

125 Citations

32 Claims

1. A system comprising:
- an electronic data store configured to store a keyword model that models a keyword and a competitor model that models a competitor word; and
  
  a user device in communication with the electronic data store, the user device configured to:
  
  receive a voice signal corresponding to a first utterance of a user;
  
  compute feature vectors using the voice signal;
  
  obtain a first score using the feature vectors and the keyword model, wherein the first score indicates a likelihood that the voice signal comprises the keyword;
  
  obtain a second score using the feature vectors and the competitor model, wherein the second score indicates a likelihood that the voice signal comprises the competitor word;
  
  determine that the voice signal comprises the keyword using the first score and the second score; and
  
  transmit information identifying the keyword and a portion of the voice signal that corresponds to the keyword to a server device, wherein the server device is configured to:
  
  perform speech recognition on the portion to obtain speech recognition results;
  
  generate, using the speech recognition results, a model, wherein the model is one of an updated keyword model, an updated competitor model, or a second competitor model; and
  
  transmit the generated model to the user device; and
  
  store the generated model in the electronic data store.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. The system of claim 1, wherein the keyword model comprises a hidden Markov model and a Gaussian mixture model, and wherein the user device is further configured to obtain the first score using a Viterbi algorithm.
  - 3. The system of claim 1, wherein the user device is further configured to determine that the voice signal comprises the keyword using a support vector machine.
  - 4. The system of claim 1, wherein the speech recognition results do not comprise the keyword, wherein the speech recognition results do not comprise the competitor word, wherein the generated model is the second competitor model, and wherein the second competitor model models a word in the speech recognition results.
  - 5. The system of claim 1, wherein the speech recognition results comprise the keyword, and wherein the generated model is the updated keyword model.
  - 6. The system of claim 1, wherein the electronic data store is further configured to store a background model, and wherein the user device is further configured to:
    - obtain a third score using the feature vectors and the background model; and
      
      determine that the voice signal comprises the keyword using the third score.
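
Claims 1, 4, and 5 describe the server choosing which model to generate from the speech recognition results. A rough sketch of that choice follows. The function name, the return labels, and the example words are hypothetical. The claims do not say when an updated competitor model is produced or which recognized word a second competitor model should cover, so those two branches are assumptions.

```python
def choose_model_update(recognized_words, keyword, competitor_words):
    """Return (kind, word) describing which model the server might generate."""
    words = [w.lower() for w in recognized_words]
    if keyword in words:
        # Claim 5: results contain the keyword, so update the keyword model.
        return ("updated_keyword_model", keyword)
    for w in words:
        if w in competitor_words:
            # Assumption: a known competitor word was recognized, so refresh its model.
            return ("updated_competitor_model", w)
    if words:
        # Claim 4: neither keyword nor known competitor word was recognized, so a
        # second competitor model is built for a word in the results.
        # Assumption: the first recognized word is used.
        return ("second_competitor_model", words[0])
    return (None, None)
```

For example, with the hypothetical keyword "computer" and known competitor word "commuter", a transcript of `["election"]` would lead to a second competitor model for "election".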

7. A computer-implemented method, comprising:
- as implemented by one or more computing devices configured with specific computer-executable instructions,
  
  receiving a voice signal corresponding to a first utterance;
  
  obtaining a first score using the voice signal and a keyword model, wherein the first score indicates a degree of similarity between the voice signal and a keyword;
  
  obtaining a second score using the voice signal and a competitor model, wherein the second score indicates a degree of similarity between the voice signal and a competitor word;
  
  determining that the voice signal comprises the keyword using the first score and the second score; and
  
  transmitting the keyword and a portion of the voice signal that corresponds to the keyword to a second device configured to perform speech recognition on the portion and configured to provide, based on the performed speech recognition, one of an updated keyword model or a second competitor model.
- Dependent claims (8, 9, 10, 11, 12, 13):
  - 8. The computer-implemented method of claim 7, further comprising:
    - computing feature vectors using the voice signal;
      
      wherein the keyword model comprises a hidden Markov model (HMM) and a Gaussian mixture model; and
      
      wherein obtaining the first score comprises using a Viterbi algorithm to align the feature vectors with states of the HMM.
  - 9. The computer-implemented method of claim 7, wherein obtaining the first score using the voice signal comprises using a support vector machine.
  - 10. The computer-implemented method of claim 7, wherein the model is one of an updated keyword model or a second competitor model.
  - 11. The computer-implemented method of claim 7, wherein determining that the voice signal comprises the keyword using the first score and the second score comprises determining that the first score is greater than the second score.
  - 12. The computer-implemented method of claim 7, further comprising:
    - obtaining a third score using the voice signal and a background model; and
      
      wherein determining that the voice signal comprises the keyword further comprises using the third score.
  - 13. The computer-implemented method of claim 7, wherein the first score indicates a likelihood that the voice signal comprises the keyword, and wherein the second score indicates a likelihood that the voice signal comprises the competitor word.
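
Claim 8 obtains the first score by using a Viterbi algorithm to align feature vectors with HMM states. The sketch below shows one way such a score could be computed. It makes several simplifying assumptions that are not in the claims: a left-to-right HMM, a single diagonal Gaussian per state instead of a full Gaussian mixture, and fixed self-loop and advance probabilities. All names are hypothetical.

```python
import math

def log_gauss(frame, mean, var):
    """Log-density of a feature vector under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )

def viterbi_score(frames, states, log_self=math.log(0.5), log_next=math.log(0.5)):
    """Log-likelihood of the best alignment of frames to a left-to-right HMM.

    states is a list of (mean, var) pairs. The path must start in the first
    state and end in the last one; each step either stays or advances by one.
    """
    neg_inf = float("-inf")
    n = len(states)
    best = [neg_inf] * n
    best[0] = log_gauss(frames[0], *states[0])
    for frame in frames[1:]:
        new = [neg_inf] * n
        for s in range(n):
            stay = best[s] + log_self
            move = best[s - 1] + log_next if s > 0 else neg_inf
            prev = max(stay, move)
            if prev > neg_inf:
                new[s] = prev + log_gauss(frame, *states[s])
        best = new
    return best[-1]

# Hypothetical two-state keyword model over 1-D features.
keyword_states = [([0.0], [1.0]), ([5.0], [1.0])]
matching = [[0.1], [0.0], [5.1], [4.9]]
reversed_order = [[5.1], [4.9], [0.1], [0.0]]
```

Running the same routine against a competitor model's states would give the second score, so the two can be compared on equal terms.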

14. A non-transitory computer-readable medium comprising one or more modules configured to execute in one or more processors of a computing device, the one or more modules being further configured to:
- receive a voice signal corresponding to a first utterance;
  
  obtain a first score using the voice signal and a keyword model, wherein the first score indicates a degree of similarity between the voice signal and a keyword;
  
  obtain a second score using the voice signal and a competitor model, wherein the second score indicates a degree of similarity between the voice signal and a competitor word;
  
  determine that the voice signal comprises the keyword using the first score and the second score; and
  
  transmit the keyword and a portion of the voice signal that corresponds to the keyword to a second device configured to perform speech recognition on the portion and configured to provide a model based on the performed speech recognition.
- Dependent claims (15, 16, 17, 18, 19, 20):
  - 15. The non-transitory computer-readable medium of claim 14, wherein the one or more modules are further configured to:
    - compute feature vectors using the voice signal;
      
      obtain the first score using a Viterbi algorithm to align the feature vectors with states of a hidden Markov model corresponding to the keyword.
  - 16. The non-transitory computer-readable medium of claim 14, wherein the one or more modules are further configured to obtain the first score using a support vector machine.
  - 17. The non-transitory computer-readable medium of claim 14, wherein the model is one of an updated keyword model or a second competitor model.
  - 18. The non-transitory computer-readable medium of claim 14, wherein the one or more modules are configured to determine that the voice signal comprises the keyword by determining that the first score is greater than the second score.
  - 19. The non-transitory computer-readable medium of claim 14, wherein the one or more modules are further configured to:
    - obtain a third score using the voice signal and a background model; and
      
      wherein determining that the voice signal comprises the keyword further comprises using the third score.
  - 20. The non-transitory computer-readable medium of claim 14, wherein the first score indicates a likelihood that the voice signal comprises the keyword, and wherein the second score indicates a likelihood that the voice signal comprises the competitor word.
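
Claims 18 and 19 combine a first-versus-second score comparison with a third score from a background model. One plausible way to combine them, assuming all three scores are log-likelihoods, is sketched below. The `margin` parameter and the specific rule are illustrative assumptions; the claims only require that the third score be used.

```python
def detect_keyword(first, second, third, margin=0.0):
    """Decide whether the keyword was spoken from three log-likelihood scores.

    first:  score under the keyword model
    second: score under the competitor model
    third:  score under the background model
    margin: hypothetical minimum log-likelihood ratio over the background
    """
    beats_competitor = first > second            # claim 18 style comparison
    beats_background = (first - third) > margin  # assumed use of the third score
    return beats_competitor and beats_background
```

With these toy numbers, `detect_keyword(-10.0, -12.0, -15.0)` accepts the keyword, while a higher competitor score or a higher background score causes rejection.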

21. A system comprising:
- a memory storing specific computer-executable instructions; and
  
  a processor configured to execute the specific computer-executable instructions, wherein execution of the specific computer-executable instructions by the processor causes the system to:
  
  receive, from a user device, a voice signal and a keyword identified by a keyword model as corresponding to the voice signal;
  
  perform speech recognition on the voice signal to obtain speech recognition results;
  
  determine that the speech recognition results do not comprise the keyword; and
  
  determine a competitor model, wherein the competitor model models a competitor word, and wherein the speech recognition results comprise the competitor word.
- Dependent claims (22, 23, 24, 25, 26):
  - 22. The system of claim 21, wherein execution of the specific computer-executable instructions further causes the system to transmit the competitor model to the user device.
  - 23. The system of claim 21, wherein the competitor model comprises a hidden Markov model.
  - 24. The system of claim 21, further comprising an electronic data store, wherein execution of the specific computer-executable instructions further causes the system to store the competitor word and a number of times the competitor word is falsely identified as the keyword in the electronic data store.
  - 25. The system of claim 24, wherein execution of the specific computer-executable instructions further causes the system to:
    - determine that the competitor word is falsely identified as the keyword greater than a predetermined number of times.
  - 26. The system of claim 21, wherein execution of the specific computer-executable instructions further causes the system to:
    - receive a second voice signal;
      
      perform speech recognition on the voice signal to obtain second speech recognition results;
      
      determine that the second speech recognition results comprise the keyword;
      
      generate an updated keyword model; and
      
      transmit the updated keyword model to the user device.
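
Claims 24 and 25 have the server store how many times a competitor word is falsely identified as the keyword and check that count against a predetermined number. A minimal sketch of that bookkeeping is shown here. The class name, the choice to treat every recognized word as a candidate competitor word, and the example words are assumptions made for illustration.

```python
from collections import Counter

class FalseAlarmTracker:
    """Counts words that the device flagged as the keyword but ASR did not confirm."""

    def __init__(self, keyword, threshold):
        self.keyword = keyword
        self.threshold = threshold  # the "predetermined number of times"
        self.counts = Counter()

    def observe(self, recognized_words):
        """Record one flagged utterance; return words that now exceed the threshold."""
        words = {w.lower() for w in recognized_words}
        if self.keyword in words:
            return []  # the detection was correct, so nothing to count
        self.counts.update(words)
        return sorted(w for w in words if self.counts[w] > self.threshold)

tracker = FalseAlarmTracker(keyword="computer", threshold=2)
```

Words returned by `observe` are candidates for a competitor model that could then be sent to the user device, as in claim 22.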

27. A computer-implemented method, comprising:
- as implemented by one or more computing devices configured with specific computer-executable instructions,
  
  receiving, from a user device, a voice signal and a keyword identified by a keyword model as corresponding to the voice signal;
  
  performing speech recognition on the voice signal to obtain speech recognition results;
  
  determining that the speech recognition results do not comprise the keyword; and
  
  determining a competitor model, wherein the competitor model models a competitor word, and wherein the speech recognition results comprise the competitor word.
- Dependent claims (28, 29, 30, 31, 32):
  - 28. The computer-implemented method of claim 27, further comprising transmitting the competitor model to the user device.
  - 29. The computer-implemented method of claim 27, wherein the competitor model comprises a hidden Markov model.
  - 30. The computer-implemented method of claim 27, further comprising storing the competitor word and a number of times the competitor word is falsely identified as the keyword.
  - 31. The computer-implemented method of claim 27, further comprising:
    - determining that the competitor word is falsely identified as the keyword greater than a predetermined number of times.
  - 32. The computer-implemented method of claim 27, further comprising:
    - receiving a second voice signal from the user device;
      
      performing speech recognition on the voice signal to obtain second speech recognition results;
      
      determining that the second speech recognition results comprise the keyword;
      
      generating an updated keyword model; and
      
      transmitting the updated keyword model to the user device.

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Hoffmeister, Bjorn
Primary Examiner(s): Godbold, Douglas
Application Number: US13/692,775
Time in Patent Office: 1,044 Days
Field of Search: 704/231, 704/235, 704/244, 704/251-256.8
US Class Current: 1/1
CPC Class Codes:
G10L 15/08 Speech classification or se...
G10L 2015/088 Word spotting
