Dual mode speech recognition

US 10,410,635 B2
Filed: 06/09/2017
Issued: 09/10/2019
Est. Priority Date: 06/09/2017
Status: Active Grant

First Claim

1. A speech recognition method comprising:

sending speech to a first recognizer and a second recognizer;

receiving, from the first recognizer, a first result associated with a recognition score;

setting a value of a timeout duration as a function of a value of the recognition score, such that the value of the timeout duration is set, in dependence upon the value of the recognition score, from at least one of a maximum value, an intermediary value and a minimum value;

responsive to receiving no result from the second recognizer before the timeout duration expires, choosing the first result as a basis for creating a response; and

responsive to receiving a second result from the second recognizer, updating a speech recognition vocabulary of the first recognizer to include at least one of an updated vocabulary model, an updated language model and an updated acoustic model.

11 Assignments

0 Petitions

Abstract

A dual mode speech recognition system sends speech to two or more speech recognizers. If a first recognition result is received, whose recognition score exceeds a high threshold, the first result is selected without waiting for another result. If the score is below a low threshold, the first result is ignored. At intermediate values of recognition scores, a timeout duration is dynamically determined as a function of the recognition score. The timeout duration determines how long the system will wait for another result. Many functions of the recognition score are possible, but timeout durations generally decrease as scores increase. When receiving a second recognition score before the timeout occurs, a comparison based on recognition scores determines whether the first result or the second result is the basis for creating a response.
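The three score regimes described in the abstract can be illustrated with a minimal Python sketch. The threshold values, the maximum wait, the linear interpolation, and the names `HIGH_THRESHOLD`, `LOW_THRESHOLD`, `MAX_TIMEOUT_S` and `timeout_for_score` are all assumptions made for illustration; the abstract only requires that timeout durations generally decrease as scores increase. Waiting the maximum duration when the first result is ignored is likewise an assumption.

```python
# Hypothetical constants; the patent text does not give concrete values.
HIGH_THRESHOLD = 0.9   # at or above this, accept the first result without waiting
LOW_THRESHOLD = 0.3    # at or below this, the first result is ignored
MAX_TIMEOUT_S = 2.0    # longest wait for a second result, in seconds
MIN_TIMEOUT_S = 0.0    # no wait at all


def timeout_for_score(score: float) -> float:
    """Map the first recognizer's score to a wait time for the second result."""
    if score >= HIGH_THRESHOLD:
        # High confidence: select the first result immediately.
        return MIN_TIMEOUT_S
    if score <= LOW_THRESHOLD:
        # Low confidence: the first result is ignored, so wait as long as allowed
        # (an assumption; the abstract does not state the wait in this case).
        return MAX_TIMEOUT_S
    # Intermediate confidence: interpolate linearly, so that the timeout
    # shrinks as the score grows.
    fraction = (score - LOW_THRESHOLD) / (HIGH_THRESHOLD - LOW_THRESHOLD)
    return MAX_TIMEOUT_S - fraction * (MAX_TIMEOUT_S - MIN_TIMEOUT_S)


def first_result_is_usable(score: float) -> bool:
    """A first result scoring below the low threshold is ignored."""
    return score >= LOW_THRESHOLD
```

With these assumed constants, a score midway between the two thresholds yields a wait of about half the maximum.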

23 Claims

1. A speech recognition method comprising:
- sending speech to a first recognizer and a second recognizer;
  
  receiving, from the first recognizer, a first result associated with a recognition score;
  
  setting a value of a timeout duration as a function of a value of the recognition score, such that the value of the timeout duration is set, in dependence upon the value of the recognition score, from at least one of a maximum value, an intermediary value and a minimum value;
  
  responsive to receiving no result from the second recognizer before the timeout duration expires, choosing the first result as a basis for creating a response; and
  
  responsive to receiving a second result from the second recognizer, updating a speech recognition vocabulary of the first recognizer to include at least one of an updated vocabulary model, an updated language model and an updated acoustic model.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 13, 14)
  - 2. The method of claim 1, wherein the first recognizer and the second recognizer are local.
  - 3. The method of claim 1, wherein the first recognizer and the second recognizer are remote.
  - 4. The method of claim 1, wherein the speech is a continuous audio stream.
  - 5. The method of claim 1, wherein the speech is a delimited spoken query.
  - 6. The method of claim 1, wherein the recognition score is based on a phonetic sequence score.
  - 7. The method of claim 1, wherein the recognition score is based on a transcription score.
  - 8. The method of claim 1, wherein the recognition score is based on a grammar parse score.
  - 9. The method of claim 1, wherein the recognition score is based on an interpretation score.
  - 13. The method of claim 1, wherein the function is selected from a set of functions consisting of a linear function, a parabolic function and an s-shaped function.
  - 14. The method of claim 1, wherein the function is not a step function.
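Claim 13 names three function shapes (linear, parabolic, s-shaped) for mapping the recognition score to a timeout. The sketch below shows one possible form of each, under assumptions not found in the claims: scores are clamped to a hypothetical `[low, high]` range, timeouts run from `t_max` down to `t_min` seconds, and the s-shaped curve is a rescaled logistic with an arbitrary `steepness`. All function and parameter names are illustrative.

```python
import math


def _normalize(score: float, low: float, high: float) -> float:
    """Clamp the score into [low, high] and rescale it to [0, 1]."""
    clamped = min(max(score, low), high)
    return (clamped - low) / (high - low)


def linear_timeout(score, low=0.3, high=0.9, t_max=2.0, t_min=0.0):
    """Timeout falls at a constant rate as the score rises."""
    x = _normalize(score, low, high)
    return t_max - x * (t_max - t_min)


def parabolic_timeout(score, low=0.3, high=0.9, t_max=2.0, t_min=0.0):
    """Timeout falls quickly at low scores and flattens near the high end."""
    x = _normalize(score, low, high)
    return t_min + (t_max - t_min) * (1.0 - x) ** 2


def s_shaped_timeout(score, low=0.3, high=0.9, t_max=2.0, t_min=0.0,
                     steepness=10.0):
    """Timeout follows a logistic curve centered between low and high."""
    x = _normalize(score, low, high)

    def logistic(v: float) -> float:
        return 1.0 / (1.0 + math.exp(-steepness * (v - 0.5)))

    # Rescale so the curve reaches exactly t_max at low and t_min at high.
    lo, hi = logistic(0.0), logistic(1.0)
    s = (logistic(x) - lo) / (hi - lo)
    return t_max - s * (t_max - t_min)
```

Each of these is continuous and non-increasing in the score, so none is a step function in the sense of claim 14.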

10. A non-transitory computer readable medium storing code that, when executed by one or more computer processors, causes the one or more computer processors to:
- send speech to a first recognizer and a second recognizer;
  
  receive, from the first recognizer, a first result associated with a recognition score;
  
  set a value of a timeout duration as a function of the value of the recognition score, such that the value of the timeout duration is set, in dependence upon the value of the recognition score, from at least one of a maximum value, an intermediary value and a minimum value;
  
  responsive to receiving no result from the second recognizer before the timeout duration expires, choose the first result as a basis for creating a response; and
  
  responsive to receiving a second result from the second recognizer, updating a speech recognition vocabulary of the first recognizer to include at least one of an updated vocabulary model, an updated language model and an updated acoustic model.

11. A mobile device enabled to perform dual mode speech recognition, the device comprising:
- a module for receiving speech from a user;
  
  a module for sending speech to a first recognizer;
  
  a module for sending speech to a second recognizer;
  
  a module for receiving a recognition score corresponding to recognition by the first recognizer;
  
  a module for detecting a timeout based on a timeout duration, a value of the timeout duration being selected as a function of the value of the recognition score, such that the value of the timeout duration is set in dependence upon the value of the recognition score, from at least one of a maximum value, an intermediary value and a minimum value; and
  
  responsive to receiving a second result from the second recognizer, updating a speech recognition vocabulary of the first recognizer to include at least one of an updated vocabulary model, an updated language model and an updated acoustic model,
  
  wherein the mobile device chooses a first result from the first recognizer if it does not receive a result from the second recognizer before the timeout occurs.
- Dependent claims (12)
  - 12. The mobile device of claim 11 wherein the first recognizer is local to the device and the second recognizer is remote from the mobile device.
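The local-plus-remote arrangement of claims 11 and 12 can be sketched with Python's `asyncio`: speech goes to both recognizers at once, and the device waits for the remote result only as long as the score-dependent timeout allows. Everything concrete here is hypothetical: the stub recognizers, their transcripts, scores and delays, the simple `timeout_for_score` formula, and the score comparison applied when both results arrive (taken from the abstract rather than from claim 11). The model-update step recited in the claim is omitted.

```python
import asyncio
from typing import Tuple

Result = Tuple[str, float]  # (transcript, recognition score)


async def local_recognizer(speech: bytes) -> Result:
    """Stand-in for an on-device recognizer: fast, moderately confident."""
    await asyncio.sleep(0.01)
    return ("play music", 0.6)


async def remote_recognizer(speech: bytes, delay_s: float) -> Result:
    """Stand-in for a server recognizer: slower, here more confident."""
    await asyncio.sleep(delay_s)
    return ("play muse", 0.8)


def timeout_for_score(score: float) -> float:
    """Hypothetical mapping: higher scores give shorter waits (seconds)."""
    return max(0.0, 0.2 * (1.0 - score))


async def recognize(speech: bytes, remote_delay_s: float) -> Result:
    # Send the same speech to both recognizers concurrently.
    first_task = asyncio.create_task(local_recognizer(speech))
    second_task = asyncio.create_task(remote_recognizer(speech, remote_delay_s))

    first = await first_task
    try:
        # Wait for the second result only as long as the first score warrants.
        second = await asyncio.wait_for(
            second_task, timeout=timeout_for_score(first[1])
        )
    except asyncio.TimeoutError:
        # No second result in time (wait_for cancels the pending task):
        # the first result becomes the basis for the response.
        return first
    # Both results arrived: keep the higher-scoring one (first wins ties).
    return second if second[1] > first[1] else first
```

Calling `asyncio.run(recognize(b"...", remote_delay_s=0.02))` lets the remote stub answer within the wait window, while a delay of `0.5` seconds exceeds it and falls back to the local result.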

15. A speech recognition method comprising:
- continuously sending speech to both (i) a first recognizer for first recognition of the speech and (ii) a second recognizer for second recognition of the speech;
  
  receiving, from the first recognizer, a first speech recognition result and an associated first recognition score;
  
  responsive to the first recognition score being above a threshold, choosing the first speech recognition result from the first recognizer as a basis for creating a response to the speech and responsive to the first recognition score being below the threshold, waiting a predetermined period to receive a second speech recognition result and an associated second recognition score from the second recognizer;
  
  receiving, from the second recognizer, the second speech recognition result and the second recognition score; and
  
  responsive to receiving the second speech recognition result and the second recognition score, choosing one of the first speech recognition result and the second speech recognition result, in dependence upon the first recognition score and the second recognition score,
  
  wherein the first recognition of the speech by the first recognizer and the second recognition of the speech by the second recognizer are continuous, and
  
  wherein the first recognition score and the second recognition score are recomputed and adjusted on a continuing basis by the first recognizer and the second recognizer as the speech continues to be sent to both the first recognizer and the second recognizer, such that the first recognition score and the second recognition score are continuously updated as new and continuous speech is recognized and until an end of the first recognition of the speech and the second recognition of the speech.
- Dependent claims (16, 17, 18, 19, 20, 21, 22, 23)
  - 16. The method of claim 15, further comprising:
    - responsive to the first recognition score being below a low threshold, ignoring the first speech recognition result; and
      
      responsive to not receiving a second response before a timeout occurs, signaling an error.
  - 17. The method of claim 15, wherein the first recognizer and the second recognizer are local.
  - 18. The method of claim 15, wherein the first recognizer and the second recognizer are remote.
  - 19. The method of claim 15, wherein the speech is a delimited spoken query.
  - 20. The method of claim 15, wherein the first recognition score is based on a phonetic sequence score.
  - 21. The method of claim 15, wherein the first recognition score is based on a transcription score.
  - 22. The method of claim 15, wherein the first recognition score is based on a grammar parse score.
  - 23. The method of claim 15, wherein the first recognition score is based on an interpretation score.
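The selection logic of claims 15 and 16 can be written as a small pure function over the results available once the wait has ended. This is a sketch under stated assumptions: the threshold values, the `Recognition` and `NoUsableResult` names, the tie-breaking rule, and the fallback to the first result when it scores between the two thresholds and no second result arrives are illustrative choices, not taken from the claims. A `second` value of `None` stands for a timeout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recognition:
    text: str
    score: float


class NoUsableResult(Exception):
    """Raised when the first result is ignored and no second result arrived."""


def choose_result(first: Recognition,
                  second: Optional[Recognition],
                  threshold: float = 0.9,
                  low_threshold: float = 0.3) -> Recognition:
    """Pick the result to use as the basis for a response."""
    if first.score > threshold:
        # Confident first result: use it without regard to the second.
        return first
    if second is None:
        # The wait ended with no second result.
        if first.score < low_threshold:
            # Claim 16: the first result is ignored, so signal an error.
            raise NoUsableResult("first result ignored and no second result")
        return first  # assumed fallback for intermediate scores
    if first.score < low_threshold:
        # First result is ignored; only the second remains.
        return second
    # Both usable: choose in dependence upon the two scores (first wins ties).
    return second if second.score > first.score else first
```

Because claim 15 has both scores recomputed as speech continues, a caller would presumably invoke such a function again whenever either score is updated; that loop is not shown.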

Current Assignee: Soundhound AI IP Holding LLC (SoundHound AI, Inc. (f/k/a Archimedes Tech SPAC Partners Co.)), Soundhound AI IP LLC (SoundHound AI, Inc. (f/k/a Archimedes Tech SPAC Partners Co.))
Original Assignee: SoundHound, Inc. (SoundHound AI, Inc. (f/k/a Archimedes Tech SPAC Partners Co.))
Inventors: Mont-Reynaud, Bernard
Primary Examiner(s): Patel, Shreyans A
Application Number: US15/619,304
Publication Number: US 20180358019A1
Time in Patent Office: 823 Days
CPC Class Codes:
G10L 15/02   Feature extraction for spee...
G10L 15/063   Training
G10L 15/1822   Parsing for meaning underst...
G10L 15/30   Distributed recognition, e....
G10L 15/32   Multiple recognisers used i...
G10L 2015/0635   updating or merging of old ...