Speech recognition employing a permissive recognition criterion for a repeated phrase utterance

US 5,737,724 A
Filed: 08/08/1996
Issued: 04/07/1998
Est. Priority Date: 11/24/1993
Status: Expired due to Term

6 Assignments

0 Petitions

Abstract

The invention relates to a method and apparatus for speech recognition, the speech to be recognized including one or more words. Recognition is based on an analysis of a first and a second utterance. In accordance with the invention, the first utterance is compared to one or more models of speech to determine a similarity metric for each such comparison. The model of speech which most closely matches the first utterance is determined based on the one or more similarity metrics. The similarity metric corresponding to the most closely matching model of speech is analyzed to determine whether it satisfies a first recognition criterion. The second utterance is compared to one or more models of speech associated with the most closely matching model (which may include the most closely matching model itself) to determine a second utterance similarity metric for each such comparison. The one or more second utterance similarity metrics are analyzed to determine whether they satisfy a second recognition criterion. The second utterance is recognized as the phrase corresponding to the most closely matching model of speech when the first and second recognition criteria are satisfied. The present invention has application to many problems in speech recognition, including isolated word recognition and command spotting. An illustrative embodiment of the invention in the context of a cellular telephone is provided. Other embodiments are also discussed.
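
The flow described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the similarity metrics are taken as precomputed numbers, both criteria are modeled as simple thresholds (in the manner of claims 7 and 9), and the names `best_match`, `recognize_pair`, and `associated` are hypothetical.

```python
from typing import Dict, List, Optional


def best_match(scores: Dict[str, float]) -> str:
    """Return the name of the model with the highest similarity metric."""
    return max(scores, key=scores.get)


def recognize_pair(
    first_scores: Dict[str, float],
    second_scores: Dict[str, float],
    associated: Dict[str, List[str]],
    first_threshold: float,
    second_threshold: float,
) -> Optional[str]:
    """Recognize a phrase from two utterances, following the abstract.

    first_scores / second_scores map model names to similarity metrics
    (assumed precomputed). associated maps a model name to the models
    that the second utterance should be compared against.
    """
    # Find the model that most closely matches the first utterance.
    best = best_match(first_scores)

    # Assumed form of the first criterion: the best metric exceeds a threshold.
    if not first_scores[best] > first_threshold:
        return None

    # Compare the second utterance only to models associated with the best
    # match; fall back to the best model itself if none are listed.
    candidates = associated.get(best, [best])
    second = [second_scores[m] for m in candidates if m in second_scores]

    # Assumed form of the second criterion: any metric exceeds a threshold.
    if any(s > second_threshold for s in second):
        return best
    return None
```

For example, with `first_scores = {"call": 0.8, "dial": 0.4}` and `second_scores = {"call": 0.7}`, thresholds of 0.6 and 0.5 yield `"call"`, while raising the first threshold to 0.9 yields `None`.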

104 Citations

36 Claims

1. A method of recognizing a spoken phrase, the phrase including one or more words, the method comprising the steps of:
- performing a first speech recognition process in an attempt to recognize a first utterance of the phrase, said first speech recognition process employing a first speech recognition criterion;
  
  if said first speech recognition process does not result in recognition of said first utterance in accordance with said first recognition criterion, establishing a time interval in which to receive a second utterance of the phrase; and
  
  if said first speech recognition process does not result in recognition of said first utterance in accordance with said first recognition criterion, and if said second utterance is received during said time interval, performing a second speech recognition process in an attempt to recognize said second utterance, said second speech recognition process employing a second speech recognition criterion, wherein said second speech recognition criterion is more likely to be satisfied than said first speech recognition criterion.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
  - 2. The method of claim 1 further comprising the step of comparing the first utterance to a model reflecting acoustic background.
  - 3. The method of claim 1 wherein the phrase corresponds to an isolated word.
  - 4. The method of claim 1 wherein the first speech recognition process comprises the steps of:
    - for each of one or more models of speech, determining one or more corresponding first similarity measures based on a comparison of said corresponding model and the first utterance; and
      
      determining whether any of said first similarity measures satisfies the first recognition criterion.
  - 5. The method of claim 4 wherein one or more of said models of speech reflects one or more predetermined words.
  - 6. The method of claim 5 wherein one or more of said models of speech reflects an acoustic background.
  - 7. The method of claim 4 wherein the first recognition criterion comprises a determination as to whether any of said first similarity measures exceeds a first threshold.
  - 8. The method of claim 1 wherein the second speech recognition process comprises the steps of:
    - for each of one or more models of speech, determining one or more corresponding second similarity measures based on a comparison of said corresponding model and the second utterance;
      
      determining which of said second similarity measures satisfies said second speech recognition criterion; and
      
      recognizing the second utterance as the phrase, based on a particular one of said second similarity measures which satisfies said second recognition criterion.
  - 9. The method of claim 8 wherein the first recognition criterion comprises a determination as to whether any of said first similarity measures exceeds a first threshold and wherein the second recognition criterion comprises a determination as to whether any of said second similarity measures exceeds a second threshold less than said first threshold.
  - 10. The method of claim 1 wherein the phrase comprises a command phrase for a utilization device.
  - 11. The method of claim 1 further comprising the step of receiving the first utterance and wherein the time interval begins at a predetermined time after receipt of the first utterance.
  - 12. The method of claim 1 further comprising the step of issuing a prompt for the second utterance and wherein the time interval begins at a predetermined time after issuance of the prompt.
  - 13. The method of claim 1 wherein the first recognition criterion depends on the phrase to be recognized.
  - 14. The method of claim 1 wherein the second speech recognition criterion depends on the phrase to be recognized.
  - 15. The method of claim 1 wherein the first speech recognition criterion depends on a state of a utilization device.
  - 16. The method of claim 1 wherein the second speech recognition criterion depends on a state of a utilization device.
  - 17. The method of claim 1 wherein the steps are implemented in a telecommunications network to facilitate operation of a network service.
  - 18. The method of claim 1 wherein the steps are implemented in a computer to facilitate operation of the computer.
  - 19. The method of claim 1 wherein said first speech recognition process does not result in recognition of said first utterance in accordance with said first recognition criterion, but nonetheless results in a determination of a most likely-spoken phrase, and wherein said second speech recognition process employs one or more models of speech which correspond to said most likely-spoken phrase.
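
As a rough illustration of the method claims, the sketch below combines claim 1 with the threshold form of claim 9 and the timing of claim 11. It is a minimal sketch under assumptions, not the patentee's implementation: the `Utterance` type, the function `recognize`, the default values of `t1`, `t2`, `delay`, and `window`, and the use of precomputed similarity measures are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Utterance:
    received_at: float        # receipt time in seconds, on any consistent clock
    scores: Dict[str, float]  # model name -> similarity measure (precomputed)


def recognize(
    first: Utterance,
    second: Optional[Utterance],
    t1: float = 0.8,      # first threshold (strict criterion)
    t2: float = 0.6,      # second threshold, lower and so easier to satisfy
    delay: float = 0.5,   # interval starts this long after the first utterance
    window: float = 3.0,  # length of the interval for the repeated utterance
) -> Optional[str]:
    if t2 >= t1:
        raise ValueError("second threshold must be less than the first")

    # First recognition process: strict criterion on the best measure.
    phrase = max(first.scores, key=first.scores.get)
    if first.scores[phrase] > t1:
        return phrase

    # First attempt failed: establish the time interval for a repeat.
    start = first.received_at + delay
    if second is None or not (start <= second.received_at <= start + window):
        return None

    # Second recognition process: permissive criterion on the repeat.
    repeat = max(second.scores, key=second.scores.get)
    return repeat if second.scores[repeat] > t2 else None
```

With the defaults above, a first utterance scoring 0.7 for "call" is rejected, but a repeat received two seconds later with the same score is accepted, because only the lower threshold applies to it.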

20. An apparatus for recognizing a spoken phrase, the phrase including one or more words, the apparatus comprising:
- means for performing a first speech recognition process in an attempt to recognize a first utterance of the phrase, said first speech recognition process employing a first speech recognition criterion;
  
  means for establishing a time interval in which to receive a second utterance of the phrase when said first speech recognition process does not result in recognition of said first utterance in accordance with said first recognition criterion; and
  
  means for performing a second speech recognition process in an attempt to recognize said second utterance if said first speech recognition process does not result in recognition of said first utterance in accordance with said first recognition criterion and if said second utterance is received during said time interval, said second speech recognition process employing a second speech recognition criterion, wherein said second speech recognition criterion is more likely to be satisfied than said first speech recognition criterion.
- Dependent Claims (21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36)
  - 21. The apparatus of claim 20 wherein the means for performing the first speech recognition process comprises:
    - means for determining, for each of one or more models of speech, one or more corresponding first similarity measures based on a comparison of said corresponding model and the first utterance; and
      
      means for determining whether any of said first similarity measures satisfies the first recognition criterion.
  - 22. The apparatus of claim 21 wherein the first recognition criterion comprises a determination as to whether any of said first similarity measures exceeds a first threshold.
  - 23. The apparatus of claim 21 wherein one or more of said models of speech reflects one or more predetermined words.
  - 24. The apparatus of claim 23 wherein one or more of said models of speech reflects an acoustic background.
  - 25. The apparatus of claim 20 further comprising means for comparing the first utterance to a model reflecting acoustic background.
  - 26. The apparatus of claim 20 wherein the phrase comprises a command phrase for a utilization device.
  - 27. The apparatus of claim 20 further comprising means for receiving the first utterance and wherein the time interval begins at a predetermined time after receipt of the first utterance.
  - 28. The apparatus of claim 20 further comprising means for issuing a prompt for the second utterance and wherein the time interval begins at a predetermined time after issuance of the prompt.
  - 29. The apparatus of claim 20 wherein the first speech recognition criterion depends on the phrase to be recognized.
  - 30. The apparatus of claim 20 wherein the second speech recognition criterion depends on the phrase to be recognized.
  - 31. The apparatus of claim 20 wherein the first speech recognition criterion depends on a state of a utilization device.
  - 32. The apparatus of claim 20 wherein the second speech recognition criterion depends on a state of a utilization device.
  - 33. The apparatus of claim 20 wherein the means for performing the second speech recognition process comprises:
    - means for determining, for each of one or more models of speech, one or more corresponding second similarity measures based on a comparison of said corresponding model and the second utterance;
      
      means for determining which of said second similarity measures satisfies said second speech recognition criterion; and
      
      means for recognizing the second utterance as the phrase, based on a particular one of said second similarity measures which satisfies said second recognition criterion.
  - 34. The apparatus of claim 33 wherein the first recognition criterion comprises a determination as to whether any of said first similarity measures exceeds a first threshold and wherein the second speech recognition criterion comprises a determination as to whether any of said second similarity measures exceeds a second threshold less than said first threshold.
  - 35. The apparatus of claim 20 wherein the phrase corresponds to an isolated word.
  - 36. The apparatus of claim 20 wherein said means for performing a first speech recognition process comprises means for determining a most likely-spoken phrase when said first speech recognition process does not result in recognition of said first utterance in accordance with said first recognition criterion, and wherein said means for performing a second speech recognition process employs one or more models of speech which correspond to said most likely-spoken phrase.
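
Claims 13 to 16 and 29 to 32 allow each criterion to depend on the phrase to be recognized or on the state of a utilization device. One way this could look, assuming threshold-style criteria, is a small lookup. The table contents, the state names, and the function `thresholds` below are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# (first threshold, second threshold) used when a phrase has no entry.
DEFAULT: Tuple[float, float] = (0.80, 0.60)

# Hypothetical per-phrase overrides: a costly command gets stricter thresholds.
PER_PHRASE: Dict[str, Tuple[float, float]] = {
    "end call": (0.90, 0.75),
}

# Hypothetical per-state adjustment added to both thresholds.
STATE_OFFSET: Dict[str, float] = {
    "idle": 0.0,
    "in_call": 0.05,
}


def thresholds(phrase: str, device_state: str) -> Tuple[float, float]:
    """Return (first, second) thresholds for a phrase and device state.

    The same offset is added to both values, so the second threshold
    stays below the first, keeping the second criterion more permissive.
    """
    first, second = PER_PHRASE.get(phrase, DEFAULT)
    offset = STATE_OFFSET.get(device_state, 0.0)
    return first + offset, second + offset
```

Adding one shared offset is a deliberate simplification: it preserves the ordering between the two thresholds without further checks.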

Current Assignee: IPR 1 Pty Limited
Original Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Inventors: Roe, David Bjorn; Haimi-Cohen, Raziel; Atal, Bishnu Saroop
Primary Examiner(s): Knepper, David D.
Application Number: US08/695,140
Time in Patent Office: 607 Days
Field of Search: 395/2.4, 395/2.6, 395/2.64, 395/2.45-2.48, 395/2.53, 395/2.61, 395/2.84
US Class Current: 704/251
CPC Class Codes:
G10L 15/10 using distance or distortio...
G10L 2015/088 Word spotting
