Method and apparatus for the recognition of spelled spoken words
Abstract
The speech recognizer includes a dictation language model providing a dictation model output indicative of a likely word sequence recognized based on an input utterance. A spelling language model provides a spelling model output indicative of a likely letter sequence recognized based on the input utterance. An acoustic model provides an acoustic model output indicative of a likely speech unit recognized based on the input utterances. A speech recognition component is configured to access the dictation language model, the spelling language model and the acoustic model. The speech recognition component weights the dictation model output and the spelling model output in calculating likely recognized speech based on the input utterance. The speech recognizer can also be configured to confine spelled speech to an active lexicon.
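One way to picture the weighting described in the abstract is as a log-domain score in which each hypothesis is penalized or favored according to the weight of the language model that proposed it. The sketch below is an illustration under stated assumptions, not the patented implementation: the `Hypothesis` type, the scoring rule (acoustic log-likelihood plus language model log-probability plus the log of the model weight), and all numeric values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str             # candidate recognition result
    acoustic_logp: float  # log-likelihood from the acoustic model
    lm_logp: float        # log-probability from the proposing language model
    source: str           # "dictation" or "spelling"


def weighted_score(hyp, weights):
    """Assumed scoring rule: acoustic + LM log-probability + log(model weight)."""
    w = weights[hyp.source]
    if w <= 0:
        # A zero weight removes that model's output from consideration.
        return float("-inf")
    return hyp.acoustic_logp + hyp.lm_logp + math.log(w)


def pick_hypothesis(hyps, weights):
    """Return the hypothesis with the highest weighted score."""
    return max(hyps, key=lambda h: weighted_score(h, weights))


# The same utterance scored as a word ("see") and as a spelled letter ("C").
hyps = [
    Hypothesis("see", acoustic_logp=-4.0, lm_logp=-2.0, source="dictation"),
    Hypothesis("C", acoustic_logp=-4.0, lm_logp=-3.0, source="spelling"),
]
normal = {"dictation": 0.5, "spelling": 0.5}
spelling_biased = {"dictation": 0.1, "spelling": 0.9}
```

With equal weights the dictation hypothesis wins in this toy example; shifting weight toward the spelling model makes the spelled letter win, which mirrors the idea that the weights determine which model output is used.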
17 Claims
-
1. A speech recognizer recognizing speech based on an input utterance, comprising:
-
a dictation language model accessible to provide a dictation model output indicative of a likely word recognized based on an input utterance, given one or more preceding words;
a letter-based spelling language model accessible to provide a spelling model output indicative of a likely letter recognized based on the input utterance, given one or more preceding letters;
an acoustic model accessible to provide an acoustic model output indicative of a likely speech unit recognized based on the input utterance; and
a speech recognition component configured to access the dictation language model, the spelling language model and the acoustic model and to weight the dictation model output and the spelling model output and calculate likely recognized speech based on the input utterance and one of the weighted dictation model output and the weighted spelling model output, the weight of the dictation model output and the weight of the spelling model output determining which output is used to recognize the speech in the input utterance.
-
2. The speech recognizer of claim 1 and further comprising:
an active lexicon, coupled to at least one of the speech recognition component and the spelling language model, containing entries indicative of currently used words.
-
-
3. The speech recognizer of claim 2 and further comprising:
a user interface providing a user input change signal indicative of a user attempting to change a previously recognized word, and wherein the speech recognition component is configured to adjust the weight of the spelling model output based on the user input change signal.
-
4. The speech recognizer of claim 3 wherein the speech recognition component is configured to increase the weight of the spelling model output based on the user input change signal and correspondingly decrease the weight of the dictation model output.
-
5. The speech recognizer of claim 3 wherein the spelling language model is configured to provide the spelling model output based on the entries in the active lexicon.
-
6. The speech recognizer of claim 5 wherein the spelling model output is limited to sequences of letters that form the entries in the active lexicon.
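The restriction in claims 5 and 6, where spelled output is limited to letter sequences that form active lexicon entries, can be sketched with a prefix set. This is a minimal illustration, assuming lowercase entries and per-letter scores supplied by some spelling model; the function names and the example lexicon are hypothetical.

```python
def build_prefixes(lexicon):
    """Collect every prefix of every lexicon entry, including the full entry."""
    prefixes = {""}
    for word in lexicon:
        for i in range(1, len(word) + 1):
            prefixes.add(word[:i])
    return prefixes


def allowed_next_letters(spelled_so_far, prefixes):
    """Letters that keep the spelled sequence on a path to some lexicon entry."""
    n = len(spelled_so_far)
    return sorted(
        p[n]
        for p in prefixes
        if len(p) == n + 1 and p.startswith(spelled_so_far)
    )


def constrain(letter_scores, spelled_so_far, prefixes):
    """Drop candidate letters that cannot extend to a lexicon entry."""
    allowed = set(allowed_next_letters(spelled_so_far, prefixes))
    return {
        letter: score
        for letter, score in letter_scores.items()
        if letter in allowed
    }


prefixes = build_prefixes(["cat", "car", "dog"])
# After "c", "a" has been spelled, only "r" and "t" lead to lexicon entries,
# so a candidate such as "p" is discarded regardless of its score.
pruned = constrain({"t": -1.0, "p": -0.5, "r": -2.0}, "ca", prefixes)
```

A trie would serve the same purpose with less memory for a large lexicon; the flat prefix set is used here only to keep the sketch short.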
-
7. The speech recognizer of claim 2 and further comprising:
a user interface providing a user input add signal indicating a user request to add a word to the active lexicon, and wherein the speech recognition component is configured to reduce the weight of the dictation model output, and increase the weight of the spelling model output based on the user input add signal.
-
8. The speech recognizer of claim 7 wherein the speech recognition component is configured to reduce the weight of the dictation model output to substantially zero, and increase the weight of the spelling model output to a substantial maximum value, based on the user input add signal.
-
9. The speech recognizer of claim 7 wherein the user interface is configured to provide a restore signal indicative of a user request to return to a normal speech recognition mode and wherein the speech recognition component is configured to restore the weights on the spelling model output and the dictation model output to values prior to receiving the user input add signal.
-
10. The speech recognizer of claim 9 wherein the speech recognition component is configured to restore the weights on the language model output and the spelling model output to substantially equal weights.
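The weight adjustments in claims 3, 4 and 7 through 10 can be read as a small piece of state that reacts to user interface signals. The following sketch assumes weights on a 0 to 1 scale that sum to 1, equal weights in the normal mode, and a fixed shift on a change signal; the class name, method names and the `boost` value are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class ModelWeights:
    dictation: float = 0.5  # assumed normal-mode weights: substantially equal
    spelling: float = 0.5
    _saved: list = field(default_factory=list)  # weights saved before add mode

    def on_change_signal(self, boost=0.3):
        """User is correcting a word: shift weight toward the spelling model."""
        shift = min(boost, self.dictation)
        self.dictation -= shift
        self.spelling += shift

    def on_add_signal(self):
        """User is adding a word: dictation weight to zero, spelling to maximum."""
        self._saved.append((self.dictation, self.spelling))
        self.dictation, self.spelling = 0.0, 1.0

    def on_restore_signal(self):
        """Return to the weights in effect before the add signal."""
        if self._saved:
            self.dictation, self.spelling = self._saved.pop()


weights = ModelWeights()
weights.on_add_signal()      # dictation 0.0, spelling 1.0
weights.on_restore_signal()  # back to 0.5 and 0.5
```

Saving the prior values on a stack, rather than resetting to fixed defaults, keeps the restore step faithful to "values prior to receiving the user input add signal" even if the weights had already been biased.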
-
11. A method of recognizing speech with a speech recognizer that includes at least a dictation language model accessible to provide a dictation model output indicative of a likely word sequence recognized based on an input utterance and a spelling language model accessible to provide a spelling model output indicative of a likely letter sequence recognized based on the input utterance, the method comprising:
-
receiving the input utterance;
accessing at least the dictation language model and the spelling language model;
biasing weights on the dictation model output and the spelling model output based on a likelihood that the user is spelling spoken words; and
calculating likely recognized speech based on the weighted spelling model output and the weighted dictation model output.
-
12. The method of claim 11 wherein biasing weights comprises:
biasing the weights based on whether the user has selected a word for correction.
-
-
13. The method of claim 12 wherein biasing the weights comprises:
-
if the user has selected a word for correction, increasing the weight on the spelling model output; and
decreasing the weight on the dictation language model output.
-
-
14. The method of claim 11 wherein the speech recognizer includes a lexicon and further comprising:
biasing recognition of spelled spoken words to words found in the lexicon.
-
15. The method of claim 11 wherein the speech recognizer includes a lexicon, and further comprising:
-
receiving a user input signal indicative of a user request to add a word to the lexicon;
adjusting the weights on the dictation model output and the spelling model output based on the user input signal;
receiving utterances indicative of spoken letters forming the word to be added;
accessing the spelling language model to recognize letters represented by the utterances; and
adding the word to the lexicon.
-
-
16. The method of claim 15 wherein adjusting the weights comprises:
-
reducing the weight on the dictation model output to substantially a minimum value; and
increasing the weight on the spelling model output to substantially a maximum value.
-
-
17. The method of claim 15 wherein adding the word to the lexicon comprises:
-
receiving a user input signal indicating that the word to be added is complete;
storing the word in the lexicon; and
restoring the weights on the dictation model output and the spelling model output to previous values.
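The sequence of steps in claims 15 through 17 can be sketched as a single routine: adjust the weights, recognize spelled letters until a completion signal arrives, store the word, then restore the previous weights. Everything named here is an assumption for illustration: the `DONE` sentinel stands in for the user's completion signal, `recognize_letter` stands in for the spelling language model lookup, and weights are held in a plain dictionary.

```python
from contextlib import contextmanager

DONE = object()  # hypothetical stand-in for the "word is complete" signal


@contextmanager
def spelling_mode(weights):
    """Temporarily set dictation to a minimum and spelling to a maximum."""
    previous = dict(weights)
    weights["dictation"], weights["spelling"] = 0.0, 1.0
    try:
        yield weights
    finally:
        # Restore the weights to their previous values on exit.
        weights.clear()
        weights.update(previous)


def add_word(lexicon, weights, events, recognize_letter):
    """Spell a new word letter by letter and store it in the lexicon."""
    letters = []
    with spelling_mode(weights):
        for event in events:
            if event is DONE:
                break
            letters.append(recognize_letter(event))
        word = "".join(letters)
        if word:
            lexicon.add(word)  # store before the weights are restored
    return word


lexicon = {"cat"}
weights = {"dictation": 0.5, "spelling": 0.5}
# Utterances are represented by strings here; a real recognizer would pass audio.
new_word = add_word(lexicon, weights, ["Z", "O", "E", DONE], lambda u: u.lower())
```

Using a context manager means the weights return to their earlier values even if letter recognition raises an error partway through the word.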
-
Specification