Speech recognition error identification method and system

US 20050049868A1
Filed: 08/25/2003
Published: 03/03/2005
Est. Priority Date: 08/25/2003
Status: Abandoned Application

Abstract

Methods and systems are provided for testing and improving the performance of a speech recognition system. Words, phrases or utterances are assembled for recognition by one or more speech recognition engines. At a text-to-speech application, an audio pronunciation of each word, phrase or utterance is created. Each audio pronunciation is passed to one or more speech recognition engines. The speech recognition engine analyzes the audio pronunciations and derives one or more words, phrases or utterances from the audio pronunciations. A confidence score is assigned to each of the one or more words, phrases or utterances derived from the audio pronunciations. If the confidence score for any derived word, phrase or utterance is below an acceptable threshold, the results of the speech recognition engine for the word, phrase or utterance are passed to a developer to allow the developer to take corrective action with respect to the speech recognition engine.
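
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the application's implementation: `run_test`, `Result`, `fake_tts` and `fake_asr` are hypothetical names, the stand-in engines do no real audio processing, and the 0.6 threshold is an arbitrary value chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Result:
    source: str        # phrase handed to the text-to-speech step
    recognized: str    # phrase the recognizer derived from the audio
    confidence: float  # recognizer's score, assumed to lie in [0, 1]


def run_test(
    phrases: Iterable[str],
    synthesize: Callable[[str], bytes],
    recognize: Callable[[bytes], Tuple[str, float]],
    threshold: float,
) -> List[Result]:
    """Return the results whose confidence falls below the threshold."""
    flagged = []
    for phrase in phrases:
        audio = synthesize(phrase)                 # text-to-speech step
        recognized, confidence = recognize(audio)  # speech recognition step
        if confidence < threshold:
            flagged.append(Result(phrase, recognized, confidence))
    return flagged


# Hypothetical stand-ins so the sketch runs without real engines.
def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def fake_asr(audio: bytes) -> Tuple[str, float]:
    text = audio.decode("utf-8")
    # Assumption for illustration only: longer phrases score lower.
    return (text, 0.95) if len(text.split()) <= 2 else (text, 0.40)


flagged = run_test(["call home", "check my voice mail"], fake_tts, fake_asr, 0.6)
for result in flagged:
    # In the abstract, these results go to a developer for corrective action.
    print(f"low confidence {result.confidence:.2f}: {result.source!r}")
```

Passing the two engines in as callables keeps the test loop independent of any particular text-to-speech or recognition product.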

24 Claims

1. A method for testing and improving the performance of a speech recognition engine, comprising:
- identifying one or more words, phrases or utterances for recognition by a speech recognition engine;
  
  passing the one or more identified words, phrases or utterances to a text-to-speech conversion module;
  
  passing an audio pronunciation of each of the identified one or more words, phrases or utterances from the text-to-speech conversion module to the speech recognition engine;
  
  creating a recognized word, phrase or utterance for each audio pronunciation passed to the speech recognition engine; and
  
  analyzing each recognized word, phrase or utterance to determine how closely each recognized word, phrase or utterance approximates the respective audio pronunciation from which each recognized word, phrase or utterance is derived.
  - 2. The method of claim 1, further comprising assigning a confidence score to each recognized word, phrase or utterance.
  - 3. The method of claim 2, whereby assigning the confidence score to each recognized word, phrase or utterance is based on a confidence level associated with the each recognized word, phrase or utterance based on prior speech recognition engine training.
  - 4. The method of claim 3, whereby assigning the confidence score to each recognized word, phrase or utterance is based on a confidence with which the speech recognition engine determines that each recognized word, phrase or utterance is the same as each respective word, phrase or utterance from which each recognized word, phrase or utterance is derived by the speech recognition engine based on prior speech recognition engine training.
  - 5. The method of claim 2, whereby if the confidence score exceeds an acceptable confidence score threshold level, designating the recognized word, phrase or utterance associated with the confidence score as being accurately recognized by the speech recognition engine.
  - 6. The method of claim 5, whereby if the confidence score is less than an acceptable threshold, modifying the speech recognition engine to recognize the word, phrase or utterance from which the recognized word, phrase or utterance is derived with higher accuracy.
  - 7. The method of claim 5, whereby if the confidence score is less than an acceptable confidence score threshold level, notifying a speech recognition engine developer.
  - 8. The method of claim 6, whereby modifying the speech recognition engine includes altering the audio pronunciation of the word, phrase or utterance associated with the confidence score that is less than an acceptable confidence score threshold level such that the altered audio pronunciation obtains an acceptable confidence score upon a next pass through the speech recognition engine.
  - 9. The method of claim 6, whereby modifying the speech recognition engine includes reducing the acceptable confidence score threshold level.
  - 10. The method of claim 1, after analyzing each recognized word, phrase or utterance, determining whether each recognized word, phrase or utterance is the same as a respective word, phrase or utterance from which the recognized word, phrase or utterance is derived.
  - 11. The method of claim 10, whereby if any recognized word, phrase or utterance is the same as the respective word, phrase or utterance from which the any recognized word, phrase or utterance is derived, designating the any recognized word, phrase or utterance as being accurately recognized by the speech recognition engine.
  - 12. The method of claim 1, prior to identifying one or more words, phrases or utterances for recognition by a speech recognition engine, loading into a memory location the one or more words, phrases or utterances.
  - 13. The method of claim 12, further comprising extracting the one or more words, phrases or utterances via a vocabulary extractor module.
  - 14. The method of claim 12, further comprising categorizing the one or more words, phrases or utterances by grammar type whereby all words, phrases or utterances of a same grammar type are grouped together in a grammar sub-tree.
  - 15. The method of claim 12, whereby a plurality of grammar sub-trees are grouped together to form a grammar tree containing all of the one or more words, phrases or utterances.
  - 16. The method of claim 14, whereby identifying one or more words, phrases or utterances for recognition by the speech recognition engine includes identifying a grammar sub-tree containing the one or more words, phrases or utterances.
  - 17. The method of claim 1, whereby creating a recognized word, phrase or utterance for each respective audio pronunciation includes converting each respective audio pronunciation from an audio format to a digital format by the speech recognition engine; and analyzing phonetically each audio pronunciation of each of the one or more words, phrases or utterances to create the recognized word, phrase or utterance for each respective audio pronunciation.
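
The grouping in claims 14 to 16 and the match check in claims 10 and 11 might look something like the sketch below. The names `build_grammar_tree` and `is_accurate` are hypothetical, the grammar types and phrases are made-up sample data, and ignoring case and surrounding whitespace in the comparison is an assumption, since the claims do not say how sameness is decided.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_grammar_tree(entries: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group phrases by grammar type; each key holds one grammar sub-tree."""
    tree: Dict[str, List[str]] = defaultdict(list)
    for grammar_type, phrase in entries:
        tree[grammar_type].append(phrase)
    return dict(tree)


def is_accurate(source: str, recognized: str) -> bool:
    """Treat a result as accurate when it equals its source phrase.

    Assumption: comparison ignores case and surrounding whitespace.
    """
    return source.strip().lower() == recognized.strip().lower()


# Made-up vocabulary for illustration.
entries = [
    ("names", "John Smith"),
    ("commands", "call"),
    ("names", "Mary Jones"),
    ("commands", "hang up"),
]
tree = build_grammar_tree(entries)

# Selecting one sub-tree identifies the phrases to send for recognition.
for phrase in tree["names"]:
    print(phrase)
```

Keeping each grammar type in its own sub-tree lets a test run target one category, such as names, without re-testing the whole vocabulary.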

18. A system for testing and improving the performance of a speech recognition engine, comprising:
- a text-to-speech conversion module operative to receive one or more identified words, phrases or utterances;
  
  to create and to pass an audio pronunciation of each of the identified one or more words, phrases or utterances to the speech recognition engine;
  
  the speech recognition engine operative to create a recognized word, phrase or utterance for each audio pronunciation; and
  
  to analyze each recognized word, phrase or utterance to determine how closely each recognized word, phrase or utterance approximates the respective audio pronunciation from which each recognized word, phrase or utterance is derived.
  - 19. The system of claim 18, whereby the speech recognition engine is further operative to assign a confidence score to each recognized word, phrase or utterance by analyzing each recognized word, phrase or utterance to determine how closely each recognized word, phrase or utterance approximates the respective audio pronunciation of each of one or more words, phrases or utterances.
  - 20. The system of claim 19, whereby the speech recognition engine is further operative to send a notification to a speech recognition engine developer if the confidence score is less than an acceptable confidence score threshold level.
  - 21. The system of claim 20, further comprising:
    - a vocabulary extractor module operative to extract the identified one or more words, phrases or utterances from a memory location; and
      
      to pass each extracted word, phrase or utterance to the text-to-speech conversion module.

22. A method for testing and improving the performance of a speech recognition engine, comprising:
- identifying one or more words, phrases or utterances for recognition by a speech recognition engine;
  
  creating and passing an audio pronunciation of each of the identified one or more words, phrases or utterances from a text-to-speech conversion module to the speech recognition engine;
  
  deriving a recognized word, phrase or utterance for each audio pronunciation passed to the speech recognition engine;
  
  assigning a confidence score to each recognized word, phrase or utterance based on the speech recognition engine's confidence in each recognized word, phrase or utterance based on prior training of the speech recognition engine to recognize similar or same words, phrases or utterances as the each recognized word, phrase or utterance; and
  
  if the confidence score is less than an acceptable threshold, modifying the speech recognition engine to recognize the word, phrase or utterance from which the recognized word, phrase or utterance is derived with higher accuracy.
  - 23. The method of claim 22, whereby modifying the speech recognition engine includes altering the audio pronunciation of the word, phrase or utterance associated with the confidence score that is less than an acceptable confidence score threshold level such that the altered audio pronunciation obtains an acceptable confidence score upon a next pass through the speech recognition engine.
  - 24. The method of claim 22, whereby modifying the speech recognition engine includes reducing the acceptable confidence score threshold level.
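
The two corrective actions in claims 23 and 24 could be sketched as below. Everything here is illustrative: `retest_with_variants` and `lowered_threshold` are hypothetical helpers, the pronunciation strings and scores are invented sample data standing in for a real recognizer, and the step and floor values are arbitrary assumptions.

```python
from typing import Callable, Iterable, Optional, Tuple


def retest_with_variants(
    variants: Iterable[str],
    score: Callable[[str], float],
    threshold: float,
) -> Optional[Tuple[str, float]]:
    """Claim 23 style: try altered pronunciations until one scores acceptably.

    Returns the first (variant, confidence) meeting the threshold, else None.
    """
    for variant in variants:
        confidence = score(variant)
        if confidence >= threshold:
            return variant, confidence
    return None


def lowered_threshold(threshold: float, step: float = 0.05, floor: float = 0.5) -> float:
    """Claim 24 style: reduce the acceptable threshold, but never below a floor.

    The step and floor are assumptions; the claims give no values.
    """
    return max(floor, threshold - step)


# Invented scores standing in for a real recognition pass.
sample_scores = {"tuh-MAY-toh": 0.45, "tuh-MAH-toh": 0.72}

accepted = retest_with_variants(
    ["tuh-MAY-toh", "tuh-MAH-toh"], sample_scores.__getitem__, threshold=0.6
)
print(accepted)
```

The floor in `lowered_threshold` is a design choice added for the sketch: without some lower bound, repeatedly reducing the threshold would eventually accept any result.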

Current Assignee
AT&T Intellectual Property I LP (AT&T, Inc.)
Original Assignee
Bellsouth Intellectual Property Corporation (AT&T, Inc.)
Inventors
Busayapongchai, Senis

Application Number

US10/647,709
Publication Number

US 20050049868A1
Field of Search
US Class Current

704/251
CPC Class Codes

G10L 15/01 Assessment or evaluation of...

G10L 15/063 Training
