Error correction in speech recognition

US 7,315,818 B2
Filed: 05/11/2005
Issued: 01/01/2008
Est. Priority Date: 05/02/2000
Status: Expired due to Term

Abstract

New techniques and systems may be implemented to improve error correction in speech recognition. These techniques and systems, which may be implemented to correct errors in speech recognition systems, may be used in a standard desktop environment, in a mobile environment, or in any other type of environment that can receive and/or present recognized speech.

317 Citations

21 Claims

1. A computer-implemented method for speech recognition, the method comprising:
- receiving dictated text;
- generating recognized speech based on the received dictated text, the generating comprising determining acoustic models for the dictated text that best match acoustic data for the dictated text;
- receiving an edited text of the recognized speech, the edited text indicating a replacement for a portion of the dictated text;
- determining an acoustic model for the edited text;
- determining whether to adapt acoustic models for the edited text based on the acoustic model for the edited text and the acoustic model for the dictated text portion.

2. The method of claim 1 further comprising calculating an acoustic model score based on a comparison between the acoustic model for the edited text and the acoustic data for the dictated text portion.

3. The method of claim 2 in which determining whether to adapt acoustic models for the edited text is based on the calculated acoustic model score.

4. The method of claim 3 in which determining whether to adapt acoustic models for the edited text comprises calculating an original acoustic model score based on a comparison between the acoustic model for the dictated text portion and the acoustic data for the dictated text portion.

5. The method of claim 4 in which determining whether to adapt acoustic models for the edited text comprises calculating a difference between the acoustic model score and the original acoustic model score.

6. The method of claim 5 in which determining whether to adapt acoustic models for the edited text comprises determining whether the difference is less than a predetermined value.

7. The method of claim 6 in which determining whether to adapt acoustic models for the edited text comprises adapting acoustic models for the edited text if the difference is less than a predetermined value.

8. The method of claim 6 in which determining whether to adapt acoustic models for the edited text comprises bypassing adapting acoustic models for the edited text if the difference is greater than or equal to a predetermined value.

9. The method of claim 1 in which receiving the edited text of the recognized speech occurs during a recognition session in which the recognized speech is generated.

10. The method of claim 1 in which receiving the edited text of the recognized speech occurs after a recognition session in which the recognized speech is generated.

11. The method of claim 1 in which receiving the edited text of the recognized speech comprises receiving a selection of the portion of the dictated text.

12. The method of claim 1 in which determining an acoustic model for the edited text comprises searching for the edited text in a vocabulary or a backup dictionary used to generate the recognized speech.

13. The method of claim 1 in which determining an acoustic model for the edited text comprises selecting an acoustic model that best matches the edited text.
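
Claims 2 through 8 describe a decision rule: score the acoustic model for the edited text against the acoustic data for the dictated portion, score the original acoustic model against the same data, and adapt only if the difference between the two scores is less than a predetermined value. The following is a minimal Python sketch of that rule, not the patented implementation. The names `squared_distance` and `should_adapt`, the representation of models and data as plain lists of numbers, the convention that a lower score means a better match, the direction of the subtraction, and the threshold value are all assumptions made for illustration; the claims do not specify them.

```python
from typing import Callable, Sequence

# Hypothetical scoring signature: (acoustic model, acoustic data) -> score.
# By assumption, a LOWER score means the model matches the data better.
ScoreFn = Callable[[Sequence[float], Sequence[float]], float]


def squared_distance(model: Sequence[float], data: Sequence[float]) -> float:
    """Toy stand-in for an acoustic model score: sum of squared differences."""
    return sum((m - d) ** 2 for m, d in zip(model, data))


def should_adapt(
    edited_model: Sequence[float],
    original_model: Sequence[float],
    acoustic_data: Sequence[float],
    threshold: float,
    score: ScoreFn = squared_distance,
) -> bool:
    """Return True if acoustic models for the edited text should be adapted."""
    # Claim 2: score the edited text's model against the dictated audio.
    edited_score = score(edited_model, acoustic_data)
    # Claim 4: score the originally recognized model against the same audio.
    original_score = score(original_model, acoustic_data)
    # Claim 5: difference between the two scores (direction is an assumption).
    difference = edited_score - original_score
    # Claims 6-8: adapt if the difference is less than the predetermined
    # value; otherwise bypass adaptation.
    return difference < threshold


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    original = [1.0, 2.0, 3.5]
    plausible_edit = [1.0, 2.5, 3.0]    # fits the audio about as well
    implausible_edit = [5.0, 5.0, 5.0]  # fits the audio much worse
    print(should_adapt(plausible_edit, original, data, threshold=1.0))
    print(should_adapt(implausible_edit, original, data, threshold=1.0))
```

Under these assumptions, an edit whose model fits the original audio about as well as the recognized text is treated as a genuine correction and triggers adaptation, while an edit that fits the audio far worse (for example, the user rewriting the sentence rather than correcting a misrecognition) does not.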

14. A computer-implemented method of speech recognition, the method comprising:
- performing speech recognition on an utterance to produce a recognition result for the utterance;
- receiving a selection of the recognition result;
- receiving a correction of the recognition result;
- performing speech recognition on the correction using a constraint grammar that permits spelling and pronunciation in parallel; and
- identifying whether the correction comprises a spelling or a pronunciation using the constraint grammar.

15. The method of claim 14 further comprising generating a replacement result for the recognition result based on the correction.

16. The method of claim 14 in which the constraint grammar includes a spelling portion and a dictation vocabulary portion.

17. The method of claim 16 in which the spelling portion indicates that the first utterance from the user is a letter in an alphabet.

18. The method of claim 16 in which the vocabulary portion indicates that the first utterance from the user is a word from the dictation vocabulary.

19. The method of claim 16 in which the spelling portion indicates a frequency with which letters occur in a language model.

20. The method of claim 16 in which the dictation vocabulary portion indicates a frequency with which words occur in a language model.

21. The method of claim 16 further comprising introducing a biasing value between the spelling and the dictation vocabulary portions of the constraint grammar.
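
Claims 14 through 21 describe a constraint grammar with two parallel portions, a spelling portion and a dictation vocabulary portion, each weighted by language-model frequencies and optionally offset by a biasing value. The following is a toy Python sketch of that idea, under heavy assumptions: it operates on already-tokenized text rather than audio, the frequency tables `LETTER_FREQ` and `WORD_FREQ` are invented values, and the function names and the additive log-domain bias are hypothetical choices, not taken from the patent.

```python
import math
from typing import Dict, List

# Hypothetical language-model frequencies (claims 19 and 20).
LETTER_FREQ: Dict[str, float] = {"a": 0.08, "b": 0.015, "c": 0.03, "t": 0.09}
WORD_FREQ: Dict[str, float] = {"cat": 0.001, "bat": 0.0005, "a": 0.02}


def branch_log_prob(tokens: List[str], freq: Dict[str, float]) -> float:
    """Log-probability of the tokens under one grammar portion.

    Returns -inf if any token is not permitted by that portion.
    """
    total = 0.0
    for token in tokens:
        p = freq.get(token)
        if p is None:
            return -math.inf
        total += math.log(p)
    return total


def classify_correction(tokens: List[str], spelling_bias: float = 0.0) -> str:
    """Label a correction as 'spelling', 'pronunciation', or 'unknown'.

    Both portions are evaluated in parallel (claim 14); spelling_bias is
    added to the spelling portion's log score (claim 21, as an assumption).
    """
    spelling = branch_log_prob(tokens, LETTER_FREQ) + spelling_bias
    pronunciation = branch_log_prob(tokens, WORD_FREQ)
    if spelling == -math.inf and pronunciation == -math.inf:
        return "unknown"
    return "spelling" if spelling > pronunciation else "pronunciation"


if __name__ == "__main__":
    print(classify_correction(["c", "a", "t"]))
    print(classify_correction(["cat"]))
    # "a" is both a letter and a word; the bias decides which portion wins.
    print(classify_correction(["a"]))
    print(classify_correction(["a"], spelling_bias=-2.0))
```

In this sketch the token "a" is ambiguous because it is valid in both portions. With no bias the higher letter frequency favors the spelling portion, and a negative bias shifts the decision toward the dictation vocabulary, which is one plausible reading of why a biasing value between the two portions is useful.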

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Inventors: Stevens, Daniell; Roth, Robert; Sturtevant, Dean; Abrahams, David; Gould, Joel M.; Ingold, Charles E.; Newman, Michael J.; Gold, Allan
Primary Examiner(s): CHAWAN, VIJAY B
Application Number: US11/126,271
Publication Number: US 20050203751A1
Time in Patent Office: 965 Days
Field of Search: 704/260, 704/235, 704/256, 704/258, 704/254, 704/251
US Class Current: 704/235
CPC Class Codes:
G10L 15/22 Procedures used during a sp...
G10L 2015/0631 Creating reference template...
