Spoken dialog system using a best-fit language model and best-fit grammar

US 20030149561A1
Filed: 02/01/2002
Published: 08/07/2003
Est. Priority Date: 02/01/2002
Status: Active Grant

Abstract

A spoken dialog system using a best-fit language model and a spoken dialog system using best-fit grammar are disclosed. A spoken dialog system implementing both a best-fit language model and best-fit grammar is further disclosed. Regarding the language model, likelihood scores from a large vocabulary continuous speech recognition (“LVCSR”) module are used to select the best-fit language model among a general task language model and dialog-state dependent language models. Based on the chosen language model, a dialog manager can implement different strategies to improve general dialog performance and recognition accuracy. Regarding grammar, the best-fit grammar method improves performance and user experience of dialog systems by choosing the best-fit grammar among a general purpose grammar and dialog-state dependent sub-grammars. Based on the selected grammar pattern, the dialog system can choose from varying dialog strategies, resulting in an increase in user acceptance of spoken dialog systems.

39 Claims

1. A spoken dialog system using a best-fit language model, comprising:
- a dialog manager, coupled to a language model selector, that provides to the language model selector a current dialog state;
  
  the language model selector, coupled to a plurality of dialog-state dependent language models, that selects one of the plurality of dialog-state dependent language models;
  
  the plurality of dialog-state dependent language models that are interpolated from a general-task language model;
  
  a large vocabulary continuous speech recognizer, coupled to the dialog manager and the language model selector, that receives input speech and generates a first hypothesis result for the input speech with a likelihood score, based on the selected dialog-state dependent language model;
  
  the general-task language model, coupled to the large vocabulary continuous speech recognizer, that enables the large vocabulary continuous speech recognizer to generate a second hypothesis with a second likelihood score; and
  
  a plurality of dialog strategies based on the language model system, coupled to the dialog manager.
- - 2. The spoken dialog system of claim 1, wherein the language model selector selects one of the plurality of dialog-state dependent language models based on the current dialog state.
  - 3. The spoken dialog system of claim 1, wherein an end result is chosen from the higher value of the first likelihood score and the second likelihood score.
  - 4. The spoken dialog system of claim 1, wherein the dialog manager deploys a plurality of various dialog components to improve dialog performance.
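
Claim 1 recites dialog-state dependent language models that are "interpolated from" a general-task language model but does not give the formula. The sketch below assumes one common reading, linear interpolation of word probabilities, and uses toy unigram models. The function names, the training sentences, and the 0.7 weight are all illustrative assumptions, not taken from the patent.

```python
from collections import Counter


def unigram_model(sentences):
    """Maximum-likelihood unigram probabilities from a list of sentences."""
    counts = Counter(word for s in sentences for word in s.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def interpolate(state_model, general_model, weight=0.7):
    """Assumed scheme: weight * P_state(w) + (1 - weight) * P_general(w)."""
    vocab = set(state_model) | set(general_model)
    return {
        w: weight * state_model.get(w, 0.0)
        + (1.0 - weight) * general_model.get(w, 0.0)
        for w in vocab
    }


# Hypothetical training data: a broad task corpus and a small state-specific one.
general = unigram_model(["i want to fly to boston", "yes please", "no thanks"])
date_state = unigram_model(["on monday", "on friday please"])

# The state-dependent model favors date words but still covers the general vocabulary.
date_lm = interpolate(date_state, general, weight=0.7)
```

Because both inputs are probability distributions and the weights sum to one, the interpolated model is also a distribution. Words seen only in the general corpus keep a small nonzero probability in the state-dependent model.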

5. A spoken dialog system using best-fit grammar, comprising:
- a dialog manager, coupled to a grammar selector, that provides to the grammar selector a current dialog state;
  
  the grammar selector, coupled to a plurality of dialog-state dependent sub-grammars, that selects one of the plurality of dialog-state dependent sub-grammars;
  
  the plurality of dialog-state dependent sub-grammars, that contain a plurality of speech patterns;
  
  a speech recognition module, coupled to the dialog manager and the grammar selector, that receives input speech;
  
  a general-purpose grammar, coupled to the grammar selector, that contains patterns of general user responses; and
  
  a plurality of dialog strategies based on the selected grammar system, coupled to the dialog manager, to enhance dialog performance.
- - 6. The spoken dialog system of claim 5, wherein the grammar selector chooses one of the plurality of dialog-state dependent sub-grammars based on the current dialog state.
  - 7. The spoken dialog system of claim 5, wherein the dialog manager commands the grammar selector to use the general-purpose grammar, if the selected dialog-state dependent sub-grammar fails to provide a matching pattern to the input speech.
  - 8. The spoken dialog system of claim 5, wherein each of the plurality of dialog-state dependent sub-grammars is specific to at least one of a defined sub-task.
  - 9. The spoken dialog system of claim 5, wherein a speech understanding module, coupled to the dialog manager and the grammar selector, receives a word sequence generated by the speech recognition module.
  - 10. The spoken dialog system of claim 5, wherein the dialog manager deploys a plurality of various dialog components to improve dialog performance.
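
The fallback behavior in claims 5 to 7 can be sketched as follows: try the sub-grammar for the current dialog state first, and use the general-purpose grammar only if no sub-grammar pattern matches. Representing grammars as regular expressions is an assumption made for brevity; the state names, patterns, and the `parse` function are hypothetical.

```python
import re

# Hypothetical sub-grammars keyed by dialog state; each pattern is a regex.
SUB_GRAMMARS = {
    "ask_date": [r"(on )?(?P<day>monday|tuesday|friday)"],
    "ask_city": [r"(to )?(?P<city>boston|denver)"],
}

# Hypothetical general-purpose grammar covering broader user responses.
GENERAL_GRAMMAR = [
    r"i want to (fly|go) to (?P<city>boston|denver)"
    r"( on (?P<day>monday|tuesday|friday))?",
    r"(?P<confirm>yes|no)( please| thanks)?",
]


def match_any(patterns, words):
    """Return the named slots of the first fully matching pattern, else None."""
    for pattern in patterns:
        m = re.fullmatch(pattern, words)
        if m:
            return {k: v for k, v in m.groupdict().items() if v is not None}
    return None


def parse(dialog_state, words):
    """Try the state's sub-grammar first, then the general-purpose grammar."""
    slots = match_any(SUB_GRAMMARS.get(dialog_state, []), words)
    if slots is not None:
        return "sub-grammar", slots
    slots = match_any(GENERAL_GRAMMAR, words)
    if slots is not None:
        return "general-purpose", slots
    return "none", {}
```

For example, `parse("ask_date", "on friday")` is handled by the sub-grammar, while `parse("ask_date", "i want to fly to boston")` falls through to the general-purpose grammar. The first element of the returned pair tells the dialog manager which grammar matched.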

11. A spoken dialog system implementing a best-fit language model, comprising a computer readable medium and a computer readable program code stored on the computer readable medium having instructions to:
- receive a current dialog state from a dialog manager;
  
  select a dialog-state dependent language model from a plurality of dialog-state dependent language models based on the current dialog state;
  
  generate a first hypothesis result for input speech with a first likelihood score;
  
  generate a second hypothesis result for input speech with a second likelihood score;
  
  select a best-fit language model from the higher value of the first likelihood score and the second likelihood score; and
  
  implement dialog strategies, based on the best-fit language model, to improve dialog performance.
- - 12. The spoken dialog system of claim 11, wherein the instructions are provided to a language model selector to select the dialog-state dependent language model.
  - 13. The system of claim 12, wherein the instructions are provided to a large vocabulary continuous speech recognizer to receive input speech and generate the first hypothesis result based on the selected dialog-state dependent language model.
  - 14. The system of claim 13, wherein the instructions are provided to the large vocabulary continuous speech recognizer from a general-task language model to generate the second hypothesis result.
  - 15. The system of claim 11, wherein the instructions are provided to the dialog manager to implement at least one of a plurality of dialog strategies to further improve accuracy and enhance dialog performance.
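
The selection step in claims 11 to 14 can be sketched as two recognition passes over the same input, one per language model, keeping the hypothesis with the higher likelihood score. The recognizer here is a stand-in that returns canned results; `Hypothesis`, `best_fit`, `fake_recognize`, the model names, and the scores are hypothetical. Breaking ties in favor of the state-dependent model is also an assumption, since the claims do not address ties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    words: str
    log_likelihood: float
    model_name: str


def best_fit(recognize, audio, dialog_state, state_models,
             general_model="general-task"):
    """Decode with both models and keep the higher-scoring hypothesis."""
    state_model = state_models[dialog_state]  # chosen from the dialog state
    first = recognize(audio, state_model)     # first hypothesis and score
    second = recognize(audio, general_model)  # second hypothesis and score
    # Assumption: a tie goes to the state-dependent model.
    return first if first.log_likelihood >= second.log_likelihood else second


def fake_recognize(audio, model_name):
    """Stand-in for a real recognizer; returns canned, made-up results."""
    canned = {
        "date-lm": Hypothesis("on friday", -12.5, "date-lm"),
        "general-task": Hypothesis("i'm friday", -20.1, "general-task"),
    }
    return canned[model_name]


result = best_fit(fake_recognize, b"", "ask_date", {"ask_date": "date-lm"})
```

The `model_name` on the winning hypothesis records which language model fit best, which is the signal a dialog manager would need in order to pick a strategy.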

16. A spoken dialog system implementing best-fit grammar, comprising a computer readable medium and a computer readable program code stored on the computer readable medium having instructions to:
- receive a current dialog state from a dialog manager;
  
  select one of a plurality of dialog-state dependent sub-grammars based on the current dialog state;
  
  select a general-purpose grammar, if the chosen dialog-state dependent sub-grammar fails to provide a matching pattern of input speech; and
  
  implement dialog strategies, based on one of the dialog-state dependent sub-grammar and the general-purpose grammar, to improve dialog performance.
- - 17. The spoken dialog system of claim 16, wherein the instructions are provided to a grammar selector to select one of the plurality of dialog-state dependent sub-grammars.
  - 18. The spoken dialog system of claim 16, wherein the instructions are provided from the dialog manager to the grammar selector to select the general-purpose grammar that contains patterns of general user responses.
  - 19. The spoken dialog system of claim 16, wherein each of the plurality of dialog-state dependent sub-grammars is specific to at least one defined task.
  - 20. The system of claim 16, wherein the instructions are provided to the dialog manager to implement at least one of a plurality of dialog strategies to further improve accuracy and enhance dialog performance.

21. A method of implementing a best-fit language model in a spoken dialog system, comprising:
- receiving a current dialog state from a dialog manager;
  
  choosing a dialog-state dependent language model from a plurality of dialog-state dependent language models based on the current dialog state;
  
  calculating a first hypothesis result for input speech with a first likelihood score;
  
  calculating a second hypothesis result for input speech with a second likelihood score;
  
  choosing a best-fit language model from the higher value of the first likelihood score and the second likelihood score; and
  
  deploying dialog strategies, based on the best-fit language model, to improve dialog performance.
- - 22. The method of claim 21, wherein a language model selector chooses the dialog-state dependent language model.
  - 23. The method of claim 21, wherein a large vocabulary continuous speech recognizer receives input speech and calculates the first hypothesis result based on the selected dialog-state dependent language model and calculates the second hypothesis result from input speech from a general-task language model.
  - 24. The method of claim 23, wherein an end result is selected from the greater value of the first likelihood score and the second likelihood score.

25. A method of implementing a best-fit grammar model in a spoken dialog system, comprising:
- receiving a current dialog state from a dialog manager;
  
  choosing one of a plurality of dialog-state dependent sub-grammars based on the current dialog state;
  
  choosing a general-purpose grammar, if the chosen dialog-state dependent sub-grammar fails to provide a matching pattern of input speech; and
  
  employing dialog strategies, based on one of the dialog-state dependent sub-grammar and the general-purpose grammar, to improve dialog performance.
- - 26. The method of claim 25, wherein a grammar selector chooses one of the plurality of dialog-state dependent sub-grammars.
  - 27. The method of claim 25, wherein the grammar selector chooses the general-purpose grammar that contains patterns of general speech.
  - 28. The method of claim 25, wherein the dialog manager employs at least one of a plurality of dialog strategies to further enhance dialog performance and improve recognition accuracy.

29. A spoken dialog system using a best-fit language model and best-fit grammar, comprising:
- a dialog manager, coupled to a language model selector and a grammar selector, that provides to the language model selector and to the grammar selector a current dialog state;
  
  the language model selector, coupled to a plurality of dialog-state dependent language models, that selects one of the plurality of dialog-state dependent language models;
  
  the grammar selector, coupled to a plurality of dialog-state dependent sub-grammars, that contain a plurality of speech patterns;
  
  the plurality of dialog-state dependent language models that are interpolated from a general-task language model;
  
  the plurality of dialog-state dependent sub-grammars that contain a plurality of speech patterns;
  
  a large vocabulary continuous speech recognizer, coupled to the dialog manager and the language model selector and a language understanding module, that receives input speech and generates a first hypothesis result for the input speech with a likelihood score, based on the selected dialog-state dependent language model;
  
  the language understanding module, coupled to the dialog manager and to the grammar selector and to large vocabulary continuous speech recognizer, that extracts critical information from a word sequence generated by the large vocabulary continuous speech recognizer;
  
  the general-task language model, coupled to the large vocabulary continuous speech recognizer, that enables the large vocabulary continuous speech recognizer to generate a second hypothesis with a second likelihood score;
  
  a general-purpose grammar, coupled to the grammar selector, that contains patterns of general user responses; and
  
  a plurality of dialog strategies based on the language model system and a plurality of dialog strategies based on the selected grammar system, coupled to the dialog manager.
- - 30. The spoken dialog system of claim 29, wherein an end result is chosen from the higher value of the first likelihood score and the second likelihood score.
  - 31. The spoken dialog system of claim 29, wherein the dialog manager commands the grammar selector to use the general-purpose grammar, if the selected dialog-state dependent sub-grammar fails to provide a matching pattern to the input speech.

32. A spoken dialog system implementing a best-fit language model and best-fit grammar, comprising a computer readable medium and a computer readable program code stored on the computer readable medium having instructions to:
- receive a current dialog state from a dialog manager;
  
  select a dialog-state dependent language model from a plurality of dialog-state dependent language models based on the current dialog state;
  
  select a dialog-state dependent sub-grammar from a plurality of dialog-state dependent sub-grammars based on the current dialog state;
  
  select a general-purpose grammar, if the chosen dialog-state dependent sub-grammar fails to provide a matching pattern of input speech;
  
  generate a first hypothesis result for input speech with a first likelihood score;
  
  generate a second hypothesis result for input speech with a second likelihood score;
  
  select a best-fit language model from the higher value of the first likelihood score and the second likelihood score; and
  
  implement dialog strategies, based on the best-fit language model and based on one of the dialog-state dependent sub-grammar and the general-purpose grammar.
- - 33. The spoken dialog system of claim 32, wherein the instructions are provided to a language model selector to select the dialog-state dependent language model.
  - 34. The system of claim 33, wherein the instructions are provided to a large vocabulary continuous speech recognizer to receive input speech and generate the first hypothesis result based on the selected dialog-state dependent language model.
  - 35. The system of claim 34, wherein the instructions are provided to the large vocabulary continuous speech recognizer from a general-task language model to generate the second hypothesis result.
  - 36. The system of claim 32, wherein the instructions are provided to the dialog manager to implement at least one of a plurality of dialog strategies to further improve accuracy and enhance dialog performance.
  - 37. The spoken dialog system of claim 32, wherein the instructions are provided to a grammar selector to select one of the plurality of dialog-state dependent sub-grammars.
  - 38. The spoken dialog system of claim 32, wherein the instructions are provided from the dialog manager to the grammar selector to select the general-purpose grammar that contains patterns of general user responses.
  - 39. The spoken dialog system of claim 32, wherein each of the plurality of dialog-state dependent sub-grammars is specific to at least one defined task.
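
One way to read the combined system of claims 29 to 39 is as a single dialog turn that first picks the best-fit language model by score, then the best-fit grammar by pattern match, and finally chooses a dialog strategy from the two outcomes. The claims do not enumerate the strategies, so the labels returned by `choose_strategy` below are purely hypothetical, as are the regex grammars and the example scores.

```python
import re


def choose_strategy(lm_used, grammar_used):
    """Map the two best-fit outcomes to a hypothetical strategy label."""
    if grammar_used == "none":
        return "reprompt"              # nothing matched: ask again
    if lm_used == "state" and grammar_used == "sub-grammar":
        return "continue"              # the user answered the question asked
    return "confirm-and-redirect"      # off-script input: confirm what was heard


def turn(state_hyp, general_hyp, sub_grammar, general_grammar):
    """One dialog turn; each hypothesis is a (words, log_likelihood) pair."""
    # Best-fit language model: keep the hypothesis with the higher score.
    if state_hyp[1] >= general_hyp[1]:
        lm_used, words = "state", state_hyp[0]
    else:
        lm_used, words = "general", general_hyp[0]
    # Best-fit grammar: sub-grammar first, general-purpose grammar as fallback.
    if any(re.fullmatch(p, words) for p in sub_grammar):
        grammar_used = "sub-grammar"
    elif any(re.fullmatch(p, words) for p in general_grammar):
        grammar_used = "general-purpose"
    else:
        grammar_used = "none"
    return lm_used, grammar_used, choose_strategy(lm_used, grammar_used)


# Hypothetical grammars for a state that asks for a travel date.
sub = [r"(on )?(monday|friday)"]
gen = [r"i want to fly to (boston|denver)", r"yes|no"]

on_script = turn(("on friday", -12.5), ("i'm friday", -20.1), sub, gen)
off_script = turn(("on boston", -30.0), ("i want to fly to boston", -18.0), sub, gen)
```

In the first call the state-dependent model and sub-grammar both win, so the dialog proceeds normally. In the second, the general-task model scores higher and only the general-purpose grammar matches, which suggests the user did not answer the prompt directly.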

Current Assignee: Intel Corporation
Original Assignee: Intel Corporation
Inventors: Zhou, Guojun
Granted Patent: US 6,999,931 B2
US Class Current: 704/240
CPC Class Codes:
- G10L 15/18   using natural language mode...
- G10L 15/22   Procedures used during a sp...
- G10L 15/26   Speech to text systems G10L...
