Processing speech recognition errors in an embedded speech recognition system

US 20020123893A1
Filed: 03/01/2001
Published: 09/05/2002
Est. Priority Date: 03/01/2001
Status: Active Grant

Abstract

A method and system for processing speech misrecognitions. The system can include an embedded speech recognition system having at least one acoustic model and at least one active grammar, wherein the embedded speech recognition system is configured to convert speech audio to text using the at least one acoustic model and the at least one active grammar; a remote training system for modifying the at least one acoustic model based on corrections to speech misrecognitions detected in the embedded speech recognition system; and, a communications link for communicatively linking the embedded speech recognition system to the remote training system. The embedded speech recognition system can further include a user interface for presenting a dialog for correcting the speech misrecognitions detected in the embedded speech recognition system. Notably, the user interface can be a visual display. Alternatively, the user interface can be an audio user interface. Finally, the user interface can include both a visual display and an audio user interface.

17 Claims

1. In a remote training system, a method for processing a speech misrecognition generated when converting speech audio to text in an embedded speech recognition system comprising:
- receiving from an embedded speech recognition system speech audio and an active acoustic model both associated with a detected speech misrecognition in said embedded speech recognition system;
- first presenting a list of valid phrases which were contextually valid when the speech misrecognition occurred, and second presenting a list of words forming a selected one of said first presented contextually valid phrases;
- modifying said active acoustic model based on selected ones of said words in said list and said received speech audio; and
- transmitting said modified acoustic model to said embedded speech recognition system.

2. The method of claim 1, further comprising receiving an active grammar from said embedded speech recognition system, wherein said active acoustic model is modified based on said active grammar in addition to said selected words and said received speech audio.

3. The method of claim 1, wherein said first presenting step comprises visually presenting said list of contextually valid phrases in a user interface.

4. The method of claim 1, wherein said first presenting step comprises audibly presenting said list of contextually valid phrases.

5. The method of claim 3, wherein said first presenting step further comprises audibly presenting said list of contextually valid phrases.

6. The method of claim 4, wherein said step of audibly presenting said list comprises:
- text-to-speech (TTS) converting said list of contextually valid phrases; and
- audibly presenting said TTS converted list.
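
The steps of the method of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: `AcousticModel`, `MisrecognitionReport`, `process_misrecognition`, and the callback parameters are hypothetical names, the acoustic model is reduced to a word-to-audio dictionary, and the list of contextually valid phrases is assumed to arrive with the report (for example, derived from the active grammar mentioned in claim 2).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class AcousticModel:
    # Hypothetical stand-in: maps each word to the audio samples used to train it.
    training_audio: Dict[str, List[bytes]] = field(default_factory=dict)


@dataclass
class MisrecognitionReport:
    # Data received from the embedded system for one detected misrecognition.
    speech_audio: bytes
    active_model: AcousticModel
    valid_phrases: List[str]  # assumption: phrases valid in context at the time


# A chooser is shown a list and returns the indices the user selected.
Chooser = Callable[[Sequence[str]], List[int]]


def process_misrecognition(
    report: MisrecognitionReport,
    choose_phrase: Chooser,
    choose_words: Chooser,
    transmit: Callable[[AcousticModel], None],
) -> AcousticModel:
    # First presenting: show the contextually valid phrases, take one selection.
    phrase = report.valid_phrases[choose_phrase(report.valid_phrases)[0]]

    # Second presenting: show the words forming the selected phrase.
    words = phrase.split()
    selected = [words[i] for i in choose_words(words)]

    # Modifying: associate the received audio with each selected word.
    for word in selected:
        report.active_model.training_audio.setdefault(word, []).append(
            report.speech_audio
        )

    # Transmitting: send the modified model back to the embedded system.
    transmit(report.active_model)
    return report.active_model
```

The two chooser callbacks stand in for the visual or audible presentation described in claims 3 to 6; a real user interface would replace them.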

7. A machine readable storage, having stored thereon a computer program for processing a speech misrecognition generated when converting speech audio to text in an embedded speech recognition system, said computer program having a plurality of code sections executable by a machine for causing the machine to perform the steps of:
- receiving from an embedded speech recognition system speech audio and an active acoustic model both associated with a detected speech misrecognition in said embedded speech recognition system;
- first presenting a list of valid phrases which were contextually valid when the speech misrecognition occurred, and second presenting a list of words forming a selected one of said first presented contextually valid phrases;
- modifying said active acoustic model based on selected ones of said words in said list and said received speech audio; and
- transmitting said modified acoustic model to said embedded speech recognition system.

8. The machine readable storage of claim 7, further comprising receiving an active grammar from said embedded speech recognition system, wherein said active acoustic model is modified based on said active grammar in addition to said selected words and said received speech audio.

9. The machine readable storage of claim 7, wherein said first presenting step comprises visually presenting said list of contextually valid phrases in a user interface.

10. The machine readable storage of claim 7, wherein said first presenting step comprises audibly presenting said list of contextually valid phrases.

11. The machine readable storage of claim 9, wherein said first presenting step further comprises audibly presenting said list of contextually valid phrases.

12. The machine readable storage of claim 10, wherein said step of audibly presenting said list comprises:
- text-to-speech (TTS) converting said list of contextually valid phrases; and
- audibly presenting said TTS converted list.

13. A system for processing speech misrecognitions comprising:
- an embedded speech recognition system comprising at least one acoustic model and at least one active grammar, said embedded speech recognition system configured to convert speech audio to text using said at least one acoustic model and said at least one active grammar;
- a remote training system for modifying said at least one acoustic model based on corrections to speech misrecognitions detected in said embedded speech recognition system; and
- a communications link for communicatively linking said embedded speech recognition system to said remote training system.

14. The system of claim 13, wherein said remote training system further comprises a user interface for presenting a dialog for correcting said speech misrecognitions detected in said embedded speech recognition system.

15. The system of claim 14, wherein said user interface is a visual display.

16. The system of claim 14, wherein said user interface is an audio user interface.

17. The system of claim 15, wherein said user interface further comprises an audio user interface.
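
The three components of the system of claim 13 can be pictured with a toy Python sketch. Every name here (`EmbeddedRecognizer`, `RemoteTrainer`, `CommunicationsLink`) is hypothetical, the acoustic model is simplified to a lookup from an audio key to text, and the communications link is an in-process object rather than a real network connection.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class EmbeddedRecognizer:
    # Toy acoustic model: audio key -> recognized text (an assumption for brevity).
    acoustic_model: Dict[str, str]
    # Active grammar: the phrases that are valid in the current context.
    active_grammar: Set[str]

    def recognize(self, audio: str) -> str:
        # Convert audio to text using both the acoustic model and the grammar.
        text = self.acoustic_model.get(audio, "")
        return text if text in self.active_grammar else ""


class RemoteTrainer:
    def apply_correction(
        self, model: Dict[str, str], audio: str, corrected_text: str
    ) -> Dict[str, str]:
        # Modify a copy of the model so the audio maps to the corrected text.
        updated = dict(model)
        updated[audio] = corrected_text
        return updated


@dataclass
class CommunicationsLink:
    embedded: EmbeddedRecognizer
    trainer: RemoteTrainer

    def report_misrecognition(self, audio: str, corrected_text: str) -> None:
        # Send the model and audio to the trainer, then install the result.
        self.embedded.acoustic_model = self.trainer.apply_correction(
            self.embedded.acoustic_model, audio, corrected_text
        )
```

A short usage example under the same assumptions:

```python
device = EmbeddedRecognizer({"audio-1": "call Tom"}, {"call Tom", "call Tim"})
link = CommunicationsLink(device, RemoteTrainer())
link.report_misrecognition("audio-1", "call Tim")
print(device.recognize("audio-1"))  # prints "call Tim"
```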

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Woodward, Steven G.
Granted Patent: US 6,934,682 B2
US Class Current: 704/250
CPC Class Codes:
- G10L 15/22   Procedures used during a sp...
- G10L 2015/0631   Creating reference template...
- G10L 2015/221   Announcement of recognition...
