Multi-pass recognition of spoken dialogue

US 20030236664A1
Filed: 06/24/2002
Published: 12/25/2003
Est. Priority Date: 06/24/2002
Status: Active Grant

Abstract

A system and method of multi-pass recognition for conversational spoken dialogue systems includes two speech recognizers: a first recognizer that implements, for example, a statistical language model (SLM) and a second recognizer that implements, for example, a grammar-based model. A word-spotting speech recognizer may be included, as may confidence estimators for each speech recognizer. The system and method provide a multi-pass approach to speech recognition, which reevaluates speech inputs to improve recognition where confidence scores returned from confidence estimators are low.
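
The fallback logic described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Hypothesis` type, the stub recognizers, the function name `multi_pass_recognize`, and the threshold values are all assumptions introduced for the example, and each recognizer is assumed to return its hypothesis together with the score from its confidence estimator.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hypothesis:
    text: str
    confidence: float  # assumed to be a score in [0, 1] from a confidence estimator


# A recognizer is modelled as any callable from raw audio to a scored hypothesis.
Recognizer = Callable[[bytes], Hypothesis]


def multi_pass_recognize(
    utterance: bytes,
    first_pass: Recognizer,
    second_pass: Recognizer,
    first_threshold: float = 0.6,   # hypothetical value
    second_threshold: float = 0.5,  # hypothetical value
) -> Optional[Hypothesis]:
    """Return the accepted hypothesis, or None if the user should be re-prompted."""
    first = first_pass(utterance)
    if first.confidence >= first_threshold:
        return first  # first pass is trusted; no re-evaluation needed

    # Low confidence: re-evaluate the same utterance with the second model.
    second = second_pass(utterance)
    if second.confidence >= second_threshold:
        return second
    return None


# Stub recognizers standing in for an SLM pass and a grammar-based pass.
def slm_stub(utterance: bytes) -> Hypothesis:
    return Hypothesis("fly to austin", 0.42)


def grammar_stub(utterance: bytes) -> Hypothesis:
    return Hypothesis("fly to boston", 0.81)


result = multi_pass_recognize(b"", slm_stub, grammar_stub)
```

With these stubs the first pass scores below its threshold, so the second hypothesis determines the outcome.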


43 Claims

1. A recognition system, comprising:
- a first speech recognizer to implement a first language model with an utterance from a user and to generate a first hypothesis;
- a first confidence estimator to indicate a first confidence score based on the first hypothesis, the first confidence estimator being programmed with a first threshold level; and
- a second speech recognizer to implement a second language model with the utterance and to generate a second hypothesis, the second hypothesis being determinative of an outcome of the system if the first confidence score is less than the first confidence threshold level.
- - 2. The system of claim 1, wherein the first language model is selected from the group consisting of a statistical language model (SLM) and a grammar-based model.
  - 3. The system of claim 1, wherein the second language model is selected from the group consisting of a statistical language model (SLM) and a grammar-based model.
  - 4. The system of claim 1, wherein the first language model is a statistical language model (SLM) and the second language model is a grammar-based model.
  - 5. The system of claim 1, wherein if the first confidence score is greater than or equal to the first confidence threshold level, the system implements a desired action.
  - 6. The system of claim 1, further including a second confidence estimator to indicate a second confidence score based on the second hypothesis, the second confidence estimator being programmed with a second threshold level.
  - 7. The system of claim 6, wherein if the second confidence score is less than the second confidence threshold level, the system prompts the user to repeat the utterance.
  - 8. The system of claim 6, wherein if the second confidence score is greater than or equal to the second confidence threshold level, the system implements a desired action.
  - 9. The system of claim 1, further including a word-spotting speech recognizer to identify a keyword included in the utterance and to generate a keyword hypothesis.
  - 10. The system of claim 9, wherein the keyword hypothesis is determinative of the outcome of the system if the first confidence score is less than the first confidence threshold level.
  - 11. The system of claim 9, wherein the second language model is a grammar-based model that examines a number of grammar-based rules, and the keyword hypothesis is transmitted to the second speech recognizer to reduce the number of grammar-based rules that the second speech recognizer examines during an implementation of the second language model.
  - 12. The system of claim 9, further including a word-spotting confidence estimator to indicate a word-spotting confidence score based on the keyword hypothesis, the word-spotting confidence estimator being programmed with a word-spotting threshold level.
  - 13. The system of claim 12, wherein if the word-spotting confidence score is less than the word-spotting confidence threshold level, the system prompts the user to repeat the utterance.
  - 14. The system of claim 12, wherein the second hypothesis is determinative of the outcome of the system if the word-spotting confidence score is greater than or equal to the word-spotting confidence threshold level.
  - 15. The system of claim 9, wherein if the second confidence score is less than the second confidence threshold level, the system prompts the user to repeat information related to the keyword.
  - 16. The system of claim 9, wherein if the second confidence score is greater than or equal to the second confidence threshold level, the system implements a desired action.
  - 17. The system of claim 1, wherein the first speech recognizer and the second speech recognizer can operate in parallel.
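
Claim 11 describes using the keyword hypothesis to reduce the number of grammar rules the second recognizer examines. One way this could look is sketched below. The rule table `RULES_BY_KEYWORD`, the rule strings, and the helper names are hypothetical, and indexing rules by keyword is an assumption about how such a reduction might be organized, not something the claims specify.

```python
from typing import Dict, List, Optional

# Hypothetical grammar: each keyword maps to the rules that mention it.
RULES_BY_KEYWORD: Dict[str, List[str]] = {
    "flight": ["book a flight to <city>", "cancel my flight"],
    "hotel": ["book a hotel in <city>"],
    "weather": ["what is the weather in <city>"],
}


def all_rules() -> List[str]:
    """Every rule in the grammar, in keyword order."""
    return [rule for rules in RULES_BY_KEYWORD.values() for rule in rules]


def rules_to_examine(keyword: Optional[str]) -> List[str]:
    """Return the rules the second pass should examine.

    With a spotted keyword, only the rules indexed under it are returned.
    Without one (or with an unknown keyword), the full grammar is used.
    """
    if keyword is None or keyword not in RULES_BY_KEYWORD:
        return all_rules()
    return list(RULES_BY_KEYWORD[keyword])


pruned = rules_to_examine("hotel")
full = rules_to_examine(None)
```

Here the spotted keyword "hotel" narrows the search from four rules to one.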

18. A method of recognizing an utterance from a user, comprising:
- processing the utterance through a first recognition pass;
- generating a first sentence hypothesis by an implementation of a first language model during the first recognition pass;
- indicating a first confidence score based upon a perceived accuracy of the first sentence hypothesis;
- comparing the first confidence score to a first threshold level; and
- processing the utterance through a second recognition pass if the first confidence score is less than the first threshold level.
- - 19. The method of claim 18, further including taking a desired action if the first confidence score is greater than or equal to the first confidence threshold level.
  - 20. The method of claim 18, further including:
    - generating a second sentence hypothesis by an implementation of a second language model during the second recognition pass;
    - indicating a second confidence score based upon a perceived accuracy of the second sentence hypothesis; and
    - comparing the second confidence score to a second threshold level.
  - 21. The method of claim 20, further including taking a desired action if the second confidence score is greater than or equal to the second confidence threshold level.
  - 22. The method of claim 20, further including prompting the user to repeat the utterance if the second confidence score is less than the second confidence threshold level.
  - 23. The method of claim 18, wherein if the first confidence score is less than the first confidence threshold level, the method further includes:
    - generating a keyword hypothesis by implementing a word-spotting recognition pass with the utterance that recognizes a keyword;
    - indicating a word-spotting confidence score based upon a perceived accuracy of the keyword hypothesis; and
    - comparing the word-spotting confidence score to a word-spotting threshold level.
  - 24. The method of claim 23, further including prompting the user to repeat the utterance if the word-spotting confidence score is less than the word-spotting confidence threshold level.
  - 25. The method of claim 23, wherein processing the utterance through a second recognition pass occurs if the word-spotting confidence score is greater than the word-spotting confidence threshold level.
  - 26. The method of claim 23, further including prompting the user to repeat information related to the keyword if the second confidence score is less than the second confidence threshold level.
  - 27. The method of claim 18, wherein processing the utterance through the first recognition pass and processing the utterance through the second recognition pass can occur in parallel.
  - 28. The method of claim 18, wherein the first language model is a statistical language model.
  - 29. The method of claim 18, wherein the second language model is a grammar-based model.
  - 30. The method of claim 18, wherein the second language model is constrained by a keyword obtained from the first language model.
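
The decision points in claims 18 to 26 (accept, run word spotting, run the second pass, or re-prompt) can be laid out as a small decision function. This is a sketch under stated assumptions: the `Outcome` names, the `decide` function, the tuple-returning recognizer callables, and the threshold dictionary keys are all invented for illustration. It also treats a score equal to a threshold as passing, which is an assumption where claim 25 says only "greater than".

```python
from enum import Enum
from typing import Callable, Dict, Tuple


class Outcome(Enum):
    TAKE_ACTION = "take_action"
    REPEAT_UTTERANCE = "repeat_utterance"        # claim 24
    REPEAT_KEYWORD_INFO = "repeat_keyword_info"  # claim 26


Scored = Tuple[str, float]  # (hypothesis text or keyword, confidence score)


def decide(
    utterance: bytes,
    first_pass: Callable[[bytes], Scored],
    word_spotter: Callable[[bytes], Scored],
    second_pass: Callable[[bytes, str], Scored],
    thresholds: Dict[str, float],
) -> Tuple[Outcome, str]:
    """Walk the passes in order and return the outcome plus its payload."""
    text, score = first_pass(utterance)
    if score >= thresholds["first"]:
        return Outcome.TAKE_ACTION, text

    # First pass was not trusted: try to spot a keyword in the same utterance.
    keyword, ws_score = word_spotter(utterance)
    if ws_score < thresholds["word_spotting"]:
        return Outcome.REPEAT_UTTERANCE, ""

    # Keyword is trusted: run the second pass, passing the keyword along.
    second_text, second_score = second_pass(utterance, keyword)
    if second_score < thresholds["second"]:
        return Outcome.REPEAT_KEYWORD_INFO, keyword
    return Outcome.TAKE_ACTION, second_text


limits = {"first": 0.6, "word_spotting": 0.5, "second": 0.5}  # hypothetical values
outcome = decide(
    b"",
    lambda u: ("fly to austin", 0.3),
    lambda u: ("flight", 0.9),
    lambda u, k: ("book a flight to boston", 0.2),
    limits,
)
```

In this run the keyword is spotted confidently but the second pass is not, so the caller would ask the user to repeat only the information related to "flight".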

31. A program code storage device, comprising:
- a machine-readable storage medium; and
- machine-readable program code, stored on the machine-readable storage medium, the machine-readable program code having instructions to:
  - process an utterance by a user through a first recognition pass,
  - generate a first sentence hypothesis by an implementation of a first language model during the first recognition pass,
  - indicate a first confidence score based upon a perceived accuracy of the first sentence hypothesis,
  - compare the first confidence score to a first threshold level, and
  - process the utterance through a second recognition pass if the first confidence score is less than the first threshold level.
- - 32. The device of claim 31, wherein the machine-readable program code further includes instructions to take a desired action if the first confidence score is greater than or equal to the first confidence threshold level.
  - 33. The device of claim 31, wherein the machine-readable program code further includes instructions to:
    - generate a second sentence hypothesis by an implementation of a second language model during the second recognition pass, indicate a second confidence score based upon a perceived accuracy of the second sentence hypothesis, and compare the second confidence score to a second threshold level.
  - 34. The device of claim 33, wherein the machine-readable program code further includes instructions to take a desired action if the second confidence score is greater than or equal to the second confidence threshold level.
  - 35. The device of claim 33, wherein the machine-readable program code further includes instructions to prompt the user to repeat the utterance if the second confidence score is less than the second confidence threshold level.
  - 36. The device of claim 31, wherein if the first confidence score is less than the first confidence threshold level, the machine-readable program code further includes instructions to:
    - generate a keyword hypothesis by implementing a word-spotting recognition pass with the utterance that recognizes a keyword, indicate a word-spotting confidence score based upon a perceived accuracy of the keyword hypothesis, and compare the word-spotting confidence score to a word-spotting threshold level.
  - 37. The device of claim 36, wherein the machine-readable program code further includes instructions to prompt the user to repeat the utterance if the word-spotting confidence score is less than the word-spotting confidence threshold level.
  - 38. The device of claim 36, wherein the instruction to process the utterance through a second recognition pass issues if the word-spotting confidence score is greater than the word-spotting confidence threshold level.
  - 39. The device of claim 36, wherein the machine-readable program code further includes instructions to prompt the user to repeat information related to the keyword if the second confidence score is less than the second confidence threshold level.
  - 40. The device of claim 31, wherein the instructions to process the utterance through the first recognition pass and process the utterance through the second recognition pass issue in parallel.
  - 41. The device of claim 31, wherein the first language model is a statistical language model.
  - 42. The device of claim 31, wherein the second language model is a grammar-based model.
  - 43. The device of claim 31, wherein the second language model is constrained by a keyword obtained from the first language model.
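
Claims 17, 27, and 40 allow the two passes to run in parallel. A minimal sketch of that arrangement using Python's standard `concurrent.futures` module is shown below. The function name `recognize_in_parallel`, the stub recognizers, and the threshold value are assumptions for illustration; the claims do not say how parallelism is achieved.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple

Scored = Tuple[str, float]  # (hypothesis text, confidence score)


def recognize_in_parallel(
    utterance: bytes,
    first_pass: Callable[[bytes], Scored],
    second_pass: Callable[[bytes], Scored],
    first_threshold: float,
) -> Scored:
    """Start both passes at once; use the second only if the first is not trusted."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first_future = pool.submit(first_pass, utterance)
        second_future = pool.submit(second_pass, utterance)

        first = first_future.result()
        if first[1] >= first_threshold:
            # The second result is ignored here, although leaving the
            # `with` block still waits for that pass to finish.
            return first
        return second_future.result()


chosen = recognize_in_parallel(
    b"",
    lambda u: ("fly to austin", 0.4),
    lambda u: ("fly to boston", 0.8),
    first_threshold=0.6,  # hypothetical value
)
```

The trade-off in this design is extra computation on every utterance in exchange for lower latency when the first pass turns out to be unreliable.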

Current Assignee: Intel Corporation
Original Assignee: Intel Corporation
Inventors: Sharma, Sangita R.
Granted Patent: US 7,502,737 B2

Field of Search
US Class Current: 704/251
CPC Class Codes:
G10L 15/08   Speech classification or se...
G10L 15/18   using natural language mode...
G10L 15/22   Procedures used during a sp...
