Methods, apparatus and computer programs for automatic speech recognition

US 20060116877A1
Filed: 10/20/2005
Published: 06/01/2006
Est. Priority Date: 12/01/2004
Status: Active Grant

Abstract

An automatic speech recognition (ASR) system includes a speech-responsive application and a recognition engine. The ASR system generates user prompts to elicit certain spoken inputs, and the speech-responsive application performs operations when the spoken inputs are recognised. The recognition engine compares sounds within an input audio signal with phones within an acoustic model, to identify candidate matching phones. A recognition confidence score is calculated for each candidate matching phone, and the confidence scores are used to help identify one or more likely sequences of matching phones that appear to match a word within the grammar of the speech-responsive application. The per-phone confidence scores are evaluated against predefined confidence score criteria (for example, identifying scores below a ‘low confidence’ threshold) and the results of the evaluation are used to influence subsequent selection of user prompts. One such system uses confidence scores to select prompts for targeted recognition training, encouraging input of sounds identified as having low confidence scores. Another system selects prompts to discourage input of sounds that were not easily recognised.
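The evaluation and prompt-selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Prompt` type, the function names, the 0.5 threshold, and the idea of counting overlapping phones are all assumptions made for illustration, since the source gives no concrete values or data structures.

```python
from dataclasses import dataclass

LOW_CONFIDENCE = 0.5  # assumed threshold; the source does not give a value


@dataclass(frozen=True)
class Prompt:
    """A candidate user prompt and the phones its expected reply contains."""
    text: str
    expected_phones: frozenset


def low_confidence_phones(scores, threshold=LOW_CONFIDENCE):
    """Return the phones whose confidence score is below the threshold."""
    return {phone for phone, score in scores.items() if score < threshold}


def select_prompt(prompts, scores, mode="train", threshold=LOW_CONFIDENCE):
    """Pick a prompt based on how many low-confidence phones it would elicit.

    mode="train": prefer the prompt eliciting the most low-confidence phones
                  (targeted recognition training).
    mode="avoid": prefer the prompt eliciting the fewest of them.
    Ties are resolved in favour of the earlier prompt in the list.
    """
    weak = low_confidence_phones(scores, threshold)

    def overlap(prompt):
        return len(prompt.expected_phones & weak)

    if mode == "train":
        return max(prompts, key=overlap)
    if mode == "avoid":
        return min(prompts, key=overlap)
    raise ValueError(f"unknown mode: {mode!r}")
```

In this sketch a single flag switches between the two systems the abstract mentions; a real application might instead apply the two policies at different points in a dialogue.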

267 Citations

13 Claims

1. A method for controlling operation of an automatic speech recognition (ASR) system, comprising the steps of:
- comparing sounds within an input audio signal with phones within an acoustic model to identify candidate matching phones;
- calculating recognition confidence scores for individual candidate matching phones;
- evaluating the recognition confidence scores to identify at least one of the candidate matching phones having a predefined recognition confidence characteristic; and
- selecting a user prompt for eliciting a subsequent user input, wherein the selection is dependent on the identified at least one phone and the recognition confidence characteristic of the identified at least one phone.

2. The method of claim 1, for use in an ASR system in which a first user input is required for a first operation of the ASR system and a subsequent user input is required for a second operation of the ASR system, the method comprising the steps of:
- evaluating a recognition confidence score for a phone that is a candidate matching phone for a sound within the first user input; and
- selecting a user prompt for the subsequent user input required for the second operation of the ASR system, wherein the selection is dependent on the recognition confidence score evaluated for the candidate matching phone for the sound within the first user input.

3. The method of claim 2, wherein the selecting step comprises selecting at least one user prompt to encourage input of phones identified as having low confidence recognition scores.

4. The method of claim 3, further comprising the steps of:
- comparing sounds within a subsequently input audio signal with phones within the acoustic model to identify candidate matching phones;
- calculating recognition confidence scores for the candidate matching phones; and
- updating a recognition confidence score that relates the recognition confidence score for the first user input and the recognition confidence score for the subsequent user input.

5. The method of claim 3, wherein the selecting step comprises comparing phones identified as having low recognition confidence scores with a list of optional user prompts and expected input phones associated with the optional user prompts, to select an input prompt associated with an expected input phone that is identified as having a relatively high likelihood of confusion with other phones.

6. The method of claim 1, wherein the selecting step comprises selecting at least one user prompt to discourage input of phones identified as having low confidence recognition scores.

7. The method of claim 6, wherein the selecting step comprises selecting a user prompt that invites input of a synonym for a phone identified as having a low recognition confidence score.

8. The method of claim 1, further comprising the step of calculating an inherent likelihood of confusion between a phone and other phones, and wherein the step of evaluating confidence scores comprises combining the calculated recognition confidence scores with the calculated inherent likelihood of confusion and then comparing the combined result with a predefined recognition confidence characteristic.

9. The method of claim 8, wherein the step of calculating an inherent likelihood of confusion comprises calculating distances between a first state of an acoustic model and other states of the model, the first state corresponding to a first sound and the other states corresponding to a set of states nearest to the first state.

10. The method of claim 1, wherein an application grammar is modified in response to the calculated recognition confidence scores.

11. The method of claim 10, wherein said modification of the application grammar comprises the steps of:
- identifying words within the application grammar associated with confidence recognition scores below a predefined threshold score; and
- replacing said identified words within the application grammar with a synonym.

12. The method of claim 11, further comprising the step of checking that the inherent confusability between said synonym and other words in the grammar is below a threshold before carrying out said replacing step.
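Claims 8 and 9 combine a per-phone confidence score with an inherent likelihood of confusion derived from distances between acoustic-model states. The following Python sketch shows one way this could look. Everything concrete in it is an assumption not stated in the claims: each state is reduced to a single mean vector, distance is Euclidean, confusability is taken as `1 / (1 + mean distance to the k nearest states)`, and the two quantities are combined by multiplication.

```python
import math


def inherent_confusability(state, other_states, k=2):
    """Map the mean distance to the k nearest states into (0, 1].

    Closer neighbouring states give a value nearer 1 (more confusable).
    Returns 0.0 when there are no neighbours to compare against.
    """
    nearest = sorted(math.dist(state, other) for other in other_states)[:k]
    if not nearest:
        return 0.0
    mean_distance = sum(nearest) / len(nearest)
    return 1.0 / (1.0 + mean_distance)


def combined_score(confidence, confusability):
    """Discount a confidence score by the phone's inherent confusability."""
    return confidence * (1.0 - confusability)


def phones_below_threshold(confidences, state_means, threshold=0.5, k=2):
    """Return {phone: combined score} for phones whose score is too low."""
    flagged = {}
    for phone, confidence in confidences.items():
        others = [mean for p, mean in state_means.items() if p != phone]
        confusability = inherent_confusability(state_means[phone], others, k)
        score = combined_score(confidence, confusability)
        if score < threshold:
            flagged[phone] = score
    return flagged
```

With this combination rule, a phone with a moderate confidence score can still be flagged if its model state sits close to its neighbours, which is the effect claim 8 appears to aim for.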

13. An automatic speech recognition system comprising a speech-responsive application program and a speech recognition engine, the speech recognition system comprising:
- program code for comparing an input audio signal with phones within an acoustic model to identify candidate matching phones;
- program code for calculating recognition confidence scores for each of the candidate matching phones;
- program code for evaluating the recognition confidence scores for the candidate matching phones to identify at least one phone having a predefined recognition confidence characteristic; and
- program code, responsive to the identified at least one phone and responsive to the recognition confidence characteristic of the identified at least one phone, for selecting a user prompt to elicit a subsequent user input.
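The grammar modification of claims 10 to 12 (replace a poorly recognised word with a synonym, but only after checking the synonym is not itself confusable with other grammar words) can be sketched as follows. This is a hedged illustration, not the patent's method: `modify_grammar`, the synonym table, and both thresholds are hypothetical, and spelling similarity from Python's `difflib.SequenceMatcher` is used as a crude stand-in for an acoustic confusability measure.

```python
from difflib import SequenceMatcher


def spelling_similarity(a, b):
    """Crude stand-in for acoustic confusability: string similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def modify_grammar(grammar, word_scores, synonyms,
                   score_threshold=0.5, confusion_threshold=0.6,
                   similarity=spelling_similarity):
    """Return a new grammar with low-confidence words replaced by synonyms.

    A word is replaced only if (a) its confidence score is below
    score_threshold, (b) a synonym is available, and (c) the synonym's
    similarity to every other grammar word is below confusion_threshold.
    Words without a recorded score are left unchanged.
    """
    result = []
    for word in grammar:
        replacement = word
        if word_scores.get(word, 1.0) < score_threshold and word in synonyms:
            candidate = synonyms[word]
            others = [w for w in grammar if w != word]
            if all(similarity(candidate, w) < confusion_threshold for w in others):
                replacement = candidate
        result.append(replacement)
    return result
```

Passing a different `similarity` function, for example one that compares phone sequences, would bring the check closer to the inherent confusability the claims refer to.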

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Staniford, Benjamin Terrick; Pickering, John Brian; Poultney, Timothy David; Whitbourne, Matthew
Granted Patent: US 8,694,316 B2
US Class Current: 704/231
CPC Class Codes:
- G10L 15/08: Speech classification or se...
- G10L 2015/025: Phonemes, fenemes or fenone...
- G10L 2015/085: Methods for reducing search...
