Method and apparatus for the recognition of spelled spoken words

US 6,694,296 B1
Filed: 11/03/2000
Issued: 02/17/2004
Est. Priority Date: 07/20/2000
Status: Expired due to Term

Abstract

The speech recognizer includes a dictation language model providing a dictation model output indicative of a likely word sequence recognized based on an input utterance. A spelling language model provides a spelling model output indicative of a likely letter sequence recognized based on the input utterance. An acoustic model provides an acoustic model output indicative of a likely speech unit recognized based on the input utterances. A speech recognition component is configured to access the dictation language model, the spelling language model and the acoustic model. The speech recognition component weights the dictation model output and the spelling model output in calculating likely recognized speech based on the input utterance. The speech recognizer can also be configured to confine spelled speech to an active lexicon.
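
As a rough illustration of the weighting described in the abstract, the sketch below scores each candidate by adding an acoustic log-probability, a language-model log-probability, and the log of the weight assigned to the model that produced the candidate. The names `Hypothesis`, `combined_score`, and `pick_hypothesis`, and the log-linear combination itself, are assumptions made for illustration; the text here does not state the actual scoring formula.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    acoustic_logp: float  # log-probability from the acoustic model (assumed input)
    lm_logp: float        # log-probability from the dictation or spelling model
    source: str           # "dictation" or "spelling"


def combined_score(h, w_dictation, w_spelling):
    """Hypothetical log-linear score: acoustic + LM + log(model weight)."""
    weight = w_dictation if h.source == "dictation" else w_spelling
    if weight <= 0.0:
        # A zero weight rules the corresponding model out entirely.
        return -math.inf
    return h.acoustic_logp + h.lm_logp + math.log(weight)


def pick_hypothesis(hyps, w_dictation, w_spelling):
    """Return the candidate with the highest combined score."""
    return max(hyps, key=lambda h: combined_score(h, w_dictation, w_spelling))


candidates = [
    Hypothesis("see", acoustic_logp=-5.0, lm_logp=-2.0, source="dictation"),
    Hypothesis("C", acoustic_logp=-5.0, lm_logp=-3.0, source="spelling"),
]
normal = pick_hypothesis(candidates, w_dictation=0.5, w_spelling=0.5)
spelling_biased = pick_hypothesis(candidates, w_dictation=0.1, w_spelling=0.9)
```

With equal weights the dictation candidate wins in this toy data; shifting weight toward the spelling model flips the choice, which is the behavior the weights are meant to control.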

17 Claims

1. A speech recognizer recognizing speech based on an input utterance, comprising:
- a dictation language model accessible to provide a dictation model output indicative of a likely word recognized based on an input utterance, given one or more preceding words;
  
  a letter-based spelling language model accessible to provide a spelling model output indicative of a likely letter recognized based on the input utterance, given one or more preceding letters;
  
  an acoustic model accessible to provide an acoustic model output indicative of a likely speech unit recognized based on the input utterance; and
  
  a speech recognition component configured to access the dictation language model, the spelling language model and the acoustic model and to weight the dictation model output and the spelling model output and calculate likely recognized speech based on the input utterance and one of the weighted dictation model output and the weighted spelling model output, the weight of the dictation model output and the weight of the spelling model output determining which output is used to recognize the speech in the input utterance.
2. The speech recognizer of claim 1 and further comprising:
3. The speech recognizer of claim 2 and further comprising:
- a user interface providing a user input change signal indicative of a user attempting to change a previously recognized word, and wherein the speech recognition component is configured to adjust the weight of the spelling model output based on the user input change signal.
4. The speech recognizer of claim 3 wherein the speech recognition component is configured to increase the weight of the spelling model output based on the user input change signal and correspondingly decrease the weight of the dictation model output.
5. The speech recognizer of claim 3 wherein the spelling language model is configured to provide the spelling model output based on the entries in the active lexicon.
6. The speech recognizer of claim 5 wherein the spelling model output is limited to sequences of letters that form the entries in the active lexicon.
7. The speech recognizer of claim 2 and further comprising:
- a user interface providing a user input add signal indicating a user request to add a word to the active lexicon, and wherein the speech recognition component is configured to reduce the weight of the dictation model output, and increase the weight of the spelling model output based on the user input add signal.
8. The speech recognizer of claim 7 wherein the speech recognition component is configured to reduce the weight of the dictation model output to substantially zero, and increase the weight of the spelling model output to a substantial maximum value, based on the user input add signal.
9. The speech recognizer of claim 7 wherein the user interface is configured to provide a restore signal indicative of a user request to return to a normal speech recognition mode and wherein the speech recognition component is configured to restore the weights on the spelling model output and the dictation model output to values prior to receiving the user input add signal.
10. The speech recognizer of claim 9 wherein the speech recognition component is configured to restore the weights on the language model output and the spelling model output to substantially equal weights.
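
The weight adjustments recited in claims 3, 4 and 7 through 10 can be pictured as a small state holder that reacts to user interface signals. The sketch below is only one possible reading: the class names `ModelWeights` and `WeightController`, the method names, the default equal weights of 0.5, and the 0.25 boost are all assumptions, not values taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class ModelWeights:
    dictation: float = 0.5  # assumed "substantially equal" starting weights
    spelling: float = 0.5


class WeightController:
    """Hypothetical holder for the two language-model weights."""

    def __init__(self):
        self.weights = ModelWeights()
        self._saved = None  # weights to return to after an add-word session

    def on_change_signal(self, boost=0.25):
        # User is correcting a recognized word: raise the spelling weight
        # and lower the dictation weight by the same amount (claims 3-4).
        shift = min(boost, self.weights.dictation)
        self.weights.spelling += shift
        self.weights.dictation -= shift

    def on_add_signal(self):
        # User asked to add a word: dictation weight goes to zero and the
        # spelling weight to its maximum (claims 7-8).
        if self._saved is None:
            self._saved = ModelWeights(self.weights.dictation,
                                       self.weights.spelling)
        self.weights = ModelWeights(dictation=0.0, spelling=1.0)

    def on_restore_signal(self):
        # Return to the weights in effect before the add signal (claim 9).
        if self._saved is not None:
            self.weights = self._saved
            self._saved = None


controller = WeightController()
controller.on_change_signal()
after_change = ModelWeights(controller.weights.dictation,
                            controller.weights.spelling)
controller.on_add_signal()
during_add = ModelWeights(controller.weights.dictation,
                          controller.weights.spelling)
controller.on_restore_signal()
```

Saving the prior weights rather than resetting to fixed defaults follows claim 9; a variant that always resets to equal weights would correspond more closely to claim 10.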

11. A method of recognizing speech with a speech recognizer that includes at least a dictation language model accessible to provide a dictation model output indicative of a likely word sequence recognized based on an input utterance and a spelling language model accessible to provide a spelling model output indicative of a likely letter sequence recognized based on the input utterance, the method comprising:
- receiving the input utterance;
  
  accessing at least the dictation language model and the spelling language model;
  
  biasing weights on the dictation model output and the spelling model output based on a likelihood that the user is spelling spoken words; and
  
  calculating likely recognized speech based on the weighted spelling model output and the weighted dictation model output.
12. The method of claim 11 wherein biasing weights, comprises:
13. The method of claim 12 wherein biasing the weights comprises:
- if the user has selected a word for correction, increasing the weight on the spelling model output; and
  
  decreasing the weight on the dictation language model output.
14. The method of claim 11 wherein the speech recognizer includes a lexicon and further comprising:
- biasing recognition of spelled spoken words to words found in the lexicon.
15. The method of claim 11 wherein the speech recognizer includes a lexicon, and further comprising:
- receiving a user input signal indicative of a user request to add a word to the lexicon;
  
  adjusting the weights on the dictation model output and the spelling model output based on the user input signal;
  
  receiving utterances indicative of spoken letters forming the word to be added;
  
  accessing the spelling language model to recognize letters represented by the utterances; and
  
  adding the word to the lexicon.
16. The method of claim 15 wherein adjusting the weights comprises:
- reducing the weight on the dictation model output to substantially a minimum value; and
  
  increasing the weight on the spelling model output to substantially a maximum value.
17. The method of claim 15 wherein adding the word to the lexicon comprises:
- receiving a user input signal indicating that the word to be added is complete;
  
  storing the word in the lexicon; and
  
  restoring the weights on the dictation model output and the spelling model output to previous values.
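
The lexicon-related steps in claims 14 through 17 can be sketched as follows. The `Lexicon` class, its `allowed_next_letters` helper, the `add_spelled_word` function, and the `decode_letters` callback standing in for the spelling-model decoder are all hypothetical names introduced for illustration; the weights are kept in a plain dictionary purely for brevity.

```python
class Lexicon:
    """Hypothetical case-insensitive word store."""

    def __init__(self, words=()):
        self._words = {w.lower() for w in words}

    def __contains__(self, word):
        return word.lower() in self._words

    def add(self, word):
        self._words.add(word.lower())

    def allowed_next_letters(self, prefix):
        # Letters that keep a spelled prefix on a path to some lexicon
        # entry; one simple way to bias spelling toward known words.
        prefix = prefix.lower()
        n = len(prefix)
        return {w[n] for w in self._words
                if w.startswith(prefix) and len(w) > n}


def add_spelled_word(lexicon, weights, decode_letters):
    """Add a spelled word, switching weights only for the session.

    decode_letters is an assumed callback that runs recognition with the
    given weights and returns the recognized letters.
    """
    previous = dict(weights)
    # Dictation weight to a minimum, spelling weight to a maximum.
    weights["dictation"], weights["spelling"] = 0.0, 1.0
    try:
        letters = decode_letters(weights)
        word = "".join(letters).lower()
        lexicon.add(word)  # store the completed word
    finally:
        weights.update(previous)  # restore the earlier weights
    return word


lexicon = Lexicon(["cat", "car", "dog"])
weights = {"dictation": 0.5, "spelling": 0.5}
seen_weights = {}


def fake_decoder(current_weights):
    # Stand-in for the recognizer; records the weights it was called with.
    seen_weights.update(current_weights)
    return ["J", "u"]


added = add_spelled_word(lexicon, weights, fake_decoder)
```

Restoring the weights in a `finally` block means a failed decode still leaves the recognizer in its normal mode, which is a design choice of this sketch rather than something the claims require.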

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Ju, Yun-Cheng; Hwang, Mei-Yuh; Alleva, Fileno A.
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Han, Qi
Application Number: US09/706,375
Time in Patent Office: 1,201 Days
Field of Search: 704/251, 704/257, 704/255, 704/231
US Class Current: 704/255
CPC Class Codes:
G10L 15/197 Probabilistic grammars, e.g...
G10L 2015/086 Recognition of spelled words
