Targeted clarification questions in speech recognition with concept presence score and concept correctness score

US 9,953,644 B2
Filed: 12/01/2014
Issued: 04/24/2018
Est. Priority Date: 12/01/2014
Status: Active Grant

Abstract

A system, method and computer-readable storage devices are disclosed for using targeted clarification (TC) questions in dialog systems in a multimodal virtual agent system (MVA) providing access to information about movies, restaurants, and musical events. In contrast with open-domain spoken systems, the MVA application covers a domain with a fixed set of concepts and uses a natural language understanding (NLU) component to mark concepts in automatically recognized speech. Instead of identifying an error segment, localized error detection (LED) identifies which of the concepts are likely to be present and correct using domain knowledge, automatic speech recognition (ASR), and NLU tags and scores. If at least one concept is identified to be present but not correct, the TC component uses this information to generate a targeted clarification question. This approach computes probability distributions of concept presence and correctness for each user utterance, which can apply to automatic learning for clarification policies.
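
The "present but not correct" decision described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `ConceptScores` class, the `pick_concept_to_clarify` function, the concept names, and the 0.5 threshold are all hypothetical and chosen only for illustration. Each concept carries two pairs of values, one for presence and one for correctness, and the sketch picks the concept that is likely present yet least likely to have been recognized correctly.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ConceptScores:
    """Scores for one tagged concept (hypothetical structure)."""
    concept: str                      # concept type, e.g. "LOCATION"
    text: str                         # recognized words tagged with the concept
    presence: Tuple[float, float]     # (P(present), P(absent))
    correctness: Tuple[float, float]  # (P(correct), P(incorrect))


def pick_concept_to_clarify(
    scores: List[ConceptScores], threshold: float = 0.5
) -> Optional[ConceptScores]:
    """Return the concept that is likely present but likely misrecognized.

    The threshold is an assumed value, not one taken from the patent.
    """
    candidates = [
        s for s in scores
        if s.presence[0] >= threshold and s.correctness[0] < threshold
    ]
    if not candidates:
        return None  # nothing needs a targeted clarification
    # Ask about the concept whose recognition is least trusted.
    return min(candidates, key=lambda s: s.correctness[0])


utterance_scores = [
    ConceptScores("CUISINE", "italian", (0.90, 0.10), (0.80, 0.20)),
    ConceptScores("LOCATION", "boston", (0.85, 0.15), (0.30, 0.70)),
]
chosen = pick_concept_to_clarify(utterance_scores)
```

In this example the location concept is selected, because it is probably present but its recognized value is probably wrong.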

20 Claims

1. A method comprising:
- processing, via a speech recognizer, an utterance from a speaker to produce speech recognition output;
  
  identifying speech segments in the speech recognition output;
  
  generating two pairs of values for each speech segment including a first pair indicating a concept presence score for a corresponding speech segment and a second pair indicating a concept correctness score for the corresponding speech segment using a context that is unavailable to the speech recognizer throughout a dialog;
  
  generating, for a chosen speech segment from the speech segments and based on the concept presence score and the concept correctness score, a targeted clarification question associated with the utterance, wherein the chosen speech segment is a recognizable speech segment that has a high recognition certainty in which the context indicates that a word in the chosen speech segment is unsuitable for the context; and
  
  presenting the targeted clarification question to the speaker in response to the utterance.
- Dependent Claims (2, 3, 4, 5, 6, 7)
- - 2. The method of claim 1, wherein the context that is unavailable to the speech recognizer comprises one of dialog history, a concept co-occurrence probability, domain history, speech recognition confidence scores, contextual features of the utterance, and tagging scores.
  - 3. The method of claim 1, wherein the concept presence score indicates a confidence that a concept type is present in a respective speech segment, and wherein the concept correctness score indicates a confidence that an identification of the concept type is correct.
  - 4. The method of claim 1, wherein the targeted clarification question is generated based on a question template associated with the speech segments.
  - 5. The method of claim 1, further comprising:
    - identifying multiple speech segments below a certainty threshold; and
      
      generating the targeted clarification question based on respective concept presence scores and concept correctness scores for the multiple speech segments.
  - 6. The method of claim 1, wherein the concept presence score and the concept correctness score are generated based on a domain of available concepts.
  - 7. The method of claim 1, wherein the speech recognizer identifies that at least one of the speech segments has a concept presence score above a certainty threshold.

8. A system comprising:
- a processor;
  
  a speech recognizer; and
  
  a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
  
  generating two pairs of values for each speech segment including a first pair indicating a concept presence score for a corresponding speech segment and a second pair indicating a concept correctness score for the corresponding speech segment using a context that is unavailable to the speech recognizer throughout a dialog;
  
  generating, for a chosen speech segment from the speech segments and based on the concept presence score and the concept correctness score, a targeted clarification question associated with an utterance, wherein the chosen speech segment is a recognizable speech segment that has a high recognition certainty in which the context indicates that a word in the chosen speech segment is unsuitable for the context; and
  
  presenting the targeted clarification question to a speaker in response to the utterance.
- Dependent Claims (9, 10, 11, 12, 13, 14)
- - 9. The system of claim 8, wherein the context that is unavailable to the speech recognizer comprises one of dialog history, a concept co-occurrence probability, a domain history, speech recognition confidence scores, contextual features of the utterance, and tagging scores.
  - 10. The system of claim 8, wherein the concept presence score indicates a confidence that a concept type is present in a respective speech segment, and wherein the concept correctness score indicates a confidence that an identification of the concept type is correct.
  - 11. The system of claim 8, wherein the targeted clarification question is generated based on a question template associated with the speech segments.
  - 12. The system of claim 8, the computer-readable storage medium having additional instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
    - identifying multiple speech segments below a certainty threshold; and
      
      generating the targeted clarification question based on respective concept presence scores and concept correctness scores for the multiple speech segments.
  - 13. The system of claim 8, wherein the concept presence score and the concept correctness score are generated based on a domain of available concepts.
  - 14. The system of claim 8, wherein the speech recognizer identifies that at least one of the speech segments has a concept presence score above a certainty threshold.

15. A non-transitory computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
- generating two pairs of values for each speech segment including a first pair indicating a concept presence score for a corresponding speech segment and a second pair indicating a concept correctness score for the corresponding speech segment using a context that is unavailable to the speech recognizer throughout a dialog;
  
  generating, for a chosen speech segment from the speech segments and based on the concept presence score and the concept correctness score, a targeted clarification question associated with an utterance, wherein the chosen speech segment is a recognizable speech segment that has a high recognition certainty in which the context indicates that a word in the chosen speech segment is unsuitable for the context; and
  
  presenting the targeted clarification question to a speaker in response to the utterance.
- Dependent Claims (16, 17, 18, 19, 20)
- - 16. The non-transitory computer-readable storage device of claim 15, wherein the context that is unavailable to the speech recognizer comprises one of dialog history, a concept co-occurrence probability, domain history, speech recognition confidence scores, contextual features of the utterance, and tagging scores.
  - 17. The non-transitory computer-readable storage device of claim 15, wherein the concept presence score indicates a confidence that a concept type is present in a respective speech segment, and wherein the concept correctness score indicates a confidence that an identification of the concept type is correct.
  - 18. The non-transitory computer-readable storage device of claim 15, wherein the targeted clarification question is generated based on a question template associated with the speech segments.
  - 19. The non-transitory computer-readable storage device of claim 15, having additional instructions stored which, when executed by the computing device, cause the computing device to perform operations comprising:
    - identifying multiple speech segments below a certainty threshold; and
      
      generating the targeted clarification question based on respective concept presence scores and concept correctness scores for the multiple speech segments.
  - 20. The non-transitory computer-readable storage device of claim 15, wherein the concept presence score and the concept correctness score are generated based on a domain of available concepts.
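
The template-based question generation recited in claims 4, 11, and 18, together with the multiple-segment case of claims 5, 12, and 19, could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the template wording, the concept names, the `build_question` function, the 0.5 threshold, and the reading of "below a certainty threshold" as a low correctness score are not taken from the patent.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical question templates keyed by concept type.
TEMPLATES: Dict[str, str] = {
    "LOCATION": "Sorry, which location did you mean?",
    "CUISINE": "Sorry, what kind of food did you ask for?",
    "MOVIE_TITLE": "Sorry, which movie did you mean?",
}
# Hypothetical spoken labels used when several concepts are uncertain.
LABELS: Dict[str, str] = {
    "LOCATION": "location",
    "CUISINE": "type of food",
    "MOVIE_TITLE": "movie title",
}
FALLBACK = "Sorry, could you please repeat that?"


def build_question(
    segments: List[Tuple[str, float, float]], threshold: float = 0.5
) -> Optional[str]:
    """Build a clarification question from (concept, presence, correctness).

    A segment is treated as uncertain when its concept is likely present
    but its correctness score falls below the assumed threshold.
    """
    uncertain = [
        concept for concept, presence, correctness in segments
        if presence >= threshold and correctness < threshold
    ]
    if not uncertain:
        return None  # every present concept looks correct
    if len(uncertain) == 1:
        return TEMPLATES.get(uncertain[0], FALLBACK)
    # Several uncertain concepts: ask about all of them in one question.
    labels = [LABELS.get(c, c.lower()) for c in uncertain]
    return "Sorry, could you repeat the " + " and the ".join(labels) + "?"


single = build_question([("LOCATION", 0.9, 0.2), ("CUISINE", 0.9, 0.9)])
multiple = build_question([("LOCATION", 0.9, 0.2), ("CUISINE", 0.8, 0.3)])
```

Keeping the templates in a dictionary keyed by concept type is one simple way to associate a question with a segment; a learned clarification policy, as the abstract suggests, could replace the fixed threshold.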

Current Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Selfridge, Ethan; Johnston, Michael J.; Stoyanchev, Svetlana
Primary Examiner(s): Lerner, Martin
Application Number: US14/557,030
Publication Number: US 20160155445A1
Time in Patent Office: 1,240 Days
Field of Search: 704236, 704257, 704275
US Class Current
CPC Class Codes:
G10L 15/01   Assessment or evaluation of...
G10L 15/1822   Parsing for meaning underst...
G10L 15/22   Procedures used during a sp...
H04M 2250/74   with voice recognition means
