Error correction in speech recognition
Abstract
New techniques and systems may be implemented to improve error correction in speech recognition. These techniques and systems correct errors in speech recognition systems and may be used in a standard desktop environment, in a mobile environment, or in any other type of environment that can receive and/or present recognized speech.
317 Citations
21 Claims
-
1. A computer-implemented method for speech recognition, the method comprising:
-
receiving dictated text;
generating recognized speech based on the received dictated text, the generating comprising determining acoustic models for the dictated text that best match acoustic data for the dictated text;
receiving an edited text of the recognized speech, the edited text indicating a replacement for a portion of the dictated text;
determining an acoustic model for the edited text;
determining whether to adapt acoustic models for the edited text based on the acoustic model for the edited text and the acoustic model for the dictated text portion.
-
2. The method of claim 1 further comprising calculating an acoustic model score based on a comparison between the acoustic model for the edited text and the acoustic data for the dictated text portion.
-
3. The method of claim 2 in which determining whether to adapt acoustic models for the edited text is based on the calculated acoustic model score.
-
4. The method of claim 3 in which determining whether to adapt acoustic models for the edited text comprises calculating an original acoustic model score based on a comparison between the acoustic model for the dictated text portion and the acoustic data for the dictated text portion.
-
5. The method of claim 4 in which determining whether to adapt acoustic models for the edited text comprises calculating a difference between the acoustic model score and the original acoustic model score.
-
6. The method of claim 5 in which determining whether to adapt acoustic models for the edited text comprises determining whether the difference is less than a predetermined value.
-
7. The method of claim 6 in which determining whether to adapt acoustic models for the edited text comprises adapting acoustic models for the edited text if the difference is less than a predetermined value.
-
8. The method of claim 6 in which determining whether to adapt acoustic models for the edited text comprises bypassing adapting acoustic models for the edited text if the difference is greater than or equal to a predetermined value.
-
9. The method of claim 1 in which receiving the edited text of the recognized speech occurs during a recognition session in which the recognized speech is generated.
-
10. The method of claim 1 in which receiving the edited text of the recognized speech occurs after a recognition session in which the recognized speech is generated.
-
11. The method of claim 1 in which receiving the edited text of the recognized speech comprises receiving a selection of the portion of the dictated text.
-
12. The method of claim 1 in which determining an acoustic model for the edited text comprises searching for the edited text in a vocabulary or a backup dictionary used to generate the recognized speech.
-
13. The method of claim 1 in which determining an acoustic model for the edited text comprises selecting an acoustic model that best matches the edited text.
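The adaptation decision recited in claims 2 through 8 can be illustrated with a minimal Python sketch. It is not taken from the patent. The names toy_score and decide_adaptation, the mean-squared-distance cost, and the representation of a model as a list of per-frame values are all hypothetical. The sketch also assumes that a lower score means a better match and that the difference is the edited score minus the original score; the claims do not state either convention.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AdaptationDecision:
    edited_score: float    # claim 2: edited-text model vs. acoustic data
    original_score: float  # claim 4: dictated-text model vs. acoustic data
    difference: float      # claim 5: difference between the two scores
    adapt: bool            # claims 6-8: outcome of the threshold test


def toy_score(model: Sequence[float], frames: Sequence[float]) -> float:
    """Hypothetical cost: mean squared distance between a model and the
    acoustic frames. Lower means a better match (an assumption)."""
    if not frames or len(model) != len(frames):
        raise ValueError("model and frames must be non-empty and equal length")
    return sum((m - f) ** 2 for m, f in zip(model, frames)) / len(frames)


def decide_adaptation(
    edited_model: Sequence[float],
    original_model: Sequence[float],
    frames: Sequence[float],
    threshold: float,
) -> AdaptationDecision:
    """Adapt only if the edited text fits the audio nearly as well as
    (or better than) the originally recognized text."""
    edited_score = toy_score(edited_model, frames)
    original_score = toy_score(original_model, frames)
    difference = edited_score - original_score
    # Claim 7: adapt when the difference is below the predetermined value.
    # Claim 8: bypass adaptation when it is greater than or equal to it.
    return AdaptationDecision(
        edited_score, original_score, difference, difference < threshold
    )


if __name__ == "__main__":
    frames = [1.0, 2.0, 3.0]            # stand-in for acoustic data
    original = [1.0, 2.0, 3.5]          # model of the misrecognized text
    plausible_edit = [1.0, 2.0, 3.0]    # edit that matches the audio
    unrelated_edit = [5.0, 5.0, 5.0]    # edit that rewrites the content
    print(decide_adaptation(plausible_edit, original, frames, 0.5))
    print(decide_adaptation(unrelated_edit, original, frames, 0.5))
```

Under these assumptions, an edit that merely fixes a misrecognition scores close to the original against the same audio and triggers adaptation, while an edit that changes what was said scores far worse and is bypassed.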
-
14. A computer-implemented method of speech recognition, the method comprising:
-
performing speech recognition on an utterance to produce a recognition result for the utterance;
receiving a selection of the recognition result;
receiving a correction of the recognition result;
performing speech recognition on the correction using a constraint grammar that permits spelling and pronunciation in parallel; and
identifying whether the correction comprises a spelling or a pronunciation using the constraint grammar.
-
15. The method of claim 14 further comprising generating a replacement result for the recognition result based on the correction.
-
16. The method of claim 14 in which the constraint grammar includes a spelling portion and a dictation vocabulary portion.
-
17. The method of claim 16 in which the spelling portion indicates that the first utterance from the user is a letter in an alphabet.
-
18. The method of claim 16 in which the vocabulary portion indicates that the first utterance from the user is a word from the dictation vocabulary.
-
19. The method of claim 16 in which the spelling portion indicates a frequency with which letters occur in a language model.
-
20. The method of claim 16 in which the dictation vocabulary portion indicates a frequency with which words occur in a language model.
-
21. The method of claim 16 further comprising introducing a biasing value between the spelling and the dictation vocabulary portions of the constraint grammar.
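The parallel constraint grammar of claims 14 through 21 can be sketched in Python as two branches that score the same correction side by side. This is a simplified illustration, not the patented recognizer: it works on tokens that are assumed to be already decoded rather than on audio, and the names branch_log_prob and classify_correction, the frequency tables, and the additive spelling_bias are hypothetical.

```python
import math
from typing import Mapping, Sequence


def branch_log_prob(tokens: Sequence[str], freqs: Mapping[str, int]) -> float:
    """Sum of log relative frequencies of the tokens within one grammar
    branch; -inf if any token is not permitted by that branch."""
    total = sum(freqs.values())
    log_prob = 0.0
    for token in tokens:
        count = freqs.get(token.lower(), 0)
        if count == 0:
            return -math.inf
        log_prob += math.log(count / total)
    return log_prob


def classify_correction(
    tokens: Sequence[str],
    letter_freqs: Mapping[str, int],  # spelling portion (claims 17, 19)
    word_freqs: Mapping[str, int],    # dictation vocabulary portion (claims 18, 20)
    spelling_bias: float = 0.0,       # biasing value between portions (claim 21)
) -> str:
    """Return 'spelling', 'pronunciation', or 'rejected' for a correction."""
    if not tokens:
        return "rejected"
    spelling = branch_log_prob(tokens, letter_freqs) + spelling_bias
    pronunciation = branch_log_prob(tokens, word_freqs)
    if spelling == -math.inf and pronunciation == -math.inf:
        return "rejected"  # neither branch of the grammar accepts the input
    return "spelling" if spelling > pronunciation else "pronunciation"


if __name__ == "__main__":
    letters = {"a": 8, "c": 3, "t": 9}          # toy letter counts
    words = {"cat": 5, "a": 20, "the": 75}      # toy word counts
    print(classify_correction(["c", "a", "t"], letters, words))
    print(classify_correction(["cat"], letters, words))
    # "a" is both a letter and a word; the bias can tip the decision.
    print(classify_correction(["a"], letters, words))
    print(classify_correction(["a"], letters, words, spelling_bias=-1.0))
```

Scoring both branches on the same input is what lets the user either spell or pronounce the correction without first announcing which mode is intended. The bias term is one possible way to favor one branch when an input such as "a" is valid in both.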
Specification