SYSTEM AND METHOD FOR SPELLING RECOGNITION USING SPEECH AND NON-SPEECH INPUT

US 20090281806A1
Filed: 07/22/2009
Published: 11/12/2009
Est. Priority Date: 07/19/2004
Status: Active Grant

Abstract

A system and method for non-speech input or keypad-aided word and spelling recognition is disclosed. The method includes generating an unweighted grammar, selecting a database of words, generating a weighted grammar using the unweighted grammar and a statistical letter model trained on the database of words, receiving speech from a user after receiving the non-speech input and after generating the weighted grammar, and performing automatic speech recognition on the speech and non-speech input using the weighted grammar. If a confidence is below a predetermined level, then the method includes receiving non-speech input from the user, disambiguating possible spellings by generating a letter lattice based on a user input modality, and constraining the letter lattice and generating a new letter string of possible word spellings until a letter string is correctly recognized.
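As a rough illustration of the first steps in the abstract, the following sketch enumerates every letter sequence consistent with a keyed digit string (the unweighted grammar) and then scores each sequence with an unsmoothed letter bigram model trained on a word list (the weighted grammar). The keypad table, the function names, the `^`/`$` boundary markers, and the three-word training list are assumptions made for the example; the source does not specify a data representation, and a practical system would likely use a finite-state network rather than brute-force enumeration.

```python
import math
from collections import Counter
from itertools import product

# Standard telephone keypad layout (assumed non-speech input device).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def unweighted_grammar(digits):
    """Enumerate every letter sequence that maps to the keyed digits."""
    return ["".join(letters) for letters in product(*(KEYPAD[d] for d in digits))]

def train_letter_bigrams(words):
    """Unsmoothed letter bigram model: P(cur | prev) from raw counts.

    '^' and '$' are hypothetical start and end markers.
    """
    pair_counts, context_counts = Counter(), Counter()
    for word in words:
        padded = "^" + word + "$"
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return {pair: n / context_counts[pair[0]] for pair, n in pair_counts.items()}

def weighted_grammar(candidates, model):
    """Attach a log-probability to each candidate letter sequence.

    The input may be only a word prefix, so no end marker is scored.
    Because the model is unsmoothed, any unseen bigram has probability
    zero and the candidate is dropped.
    """
    weighted = {}
    for cand in candidates:
        padded = "^" + cand
        probs = [model.get(pair, 0.0) for pair in zip(padded, padded[1:])]
        if all(p > 0.0 for p in probs):
            weighted[cand] = sum(math.log(p) for p in probs)
    return weighted

# Toy database of words, chosen only for illustration.
model = train_letter_bigrams(["bob", "cod", "ann"])
grammar = weighted_grammar(unweighted_grammar("26"), model)
for cand, logp in sorted(grammar.items(), key=lambda kv: -kv[1]):
    print(cand, round(math.exp(logp), 3))
```

With this toy word list, the nine sequences allowed by keys 2 and 6 shrink to the three that the letter model has seen, which is the kind of pruning the weighted grammar is meant to give the recognizer before speech arrives.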

20 Claims

1. A system for recognizing a combination of speech and alternate input, the method comprising:
- a processor;
  
  a module configured to control the processor to generate an unweighted grammar permitting all letter sequences that map to a received non-speech input;
  
  a module configured to control the processor to select a database of words;
  
  a module configured to control the processor to generate a weighted grammar using the unweighted grammar and a statistical letter model trained on the database of words;
  
  a module configured to control the processor to receive speech from a user associated with the non-speech input after receiving the non-speech input and after generating the weighted grammar; and
  
  a module configured to control the processor to process the received speech and non-speech input using the weighted grammar.

2. The system of claim 1, wherein the database of words is a domain of words related to the non-speech input.

3. The system of claim 1, further comprising a module configured to control the processor to perform speech recognition based on the received speech and non-speech input using the weighted grammar.

4. The system of claim 1, wherein the statistical letter model is an N-gram letter model.

5. The system of claim 4, wherein the N-gram letter model is unsmoothed.

6. The system of claim 1, further comprising a module configured to control the processor to generate a final letter string based on a database lookup.

7. The system of claim 1, wherein the non-speech input comprises a portion of a word.
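
The claims do not say how the database lookup of claim 6 is carried out. One possible sketch, assuming the recognizer returns candidate letter strings ranked best first, is to store the valid words in a trie that acts as a small finite-state acceptor and to return the first candidate it accepts. The function names and the sample surnames are hypothetical.

```python
def build_acceptor(words):
    """Build a trie of nested dicts; the key '$' marks the end of a valid word.

    Assumes words contain only letters, so '$' never clashes with input.
    """
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def accepts(acceptor, letters):
    """Return True only if `letters` spells a complete word in the database."""
    node = acceptor
    for ch in letters:
        node = node.get(ch)
        if node is None:
            return False
    return "$" in node

def final_string(ranked_candidates, acceptor):
    """Return the highest-ranked candidate that the acceptor recognises, or None."""
    for cand in ranked_candidates:
        if accepts(acceptor, cand):
            return cand
    return None

# Hypothetical database and recognizer output (best hypothesis first).
acceptor = build_acceptor(["smith", "smyth", "jones"])
print(final_string(["snith", "smith", "smyth"], acceptor))
```

Here the top hypothesis is rejected because it is not in the database, so the second-ranked string is returned instead.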

8. A method of recognizing input from a user, the method comprising:
- receiving input from a user;
  
  performing spelling recognition via an automatic speech recognition (ASR) system on the input, the speech recognition being performed using a statistical letter model trained on a database of words;
  
  disambiguating possible spellings by generating a letter lattice based on a user input modality; and
  
  performing, with each letter received, until a letter string is correctly recognized;
  
  constraining the letter lattice; and
  
  generating a new letter string of possible word spellings.

9. The method of claim 8, wherein constraining the letter lattice further comprises locating the most probably path through the lattice.

10. The method of claim 8, wherein the user input modality comprises speech input devices and non-speech input devices.

11. The method of claim 8, wherein the statistical letter model is an N-gram letter model.

12. The method of claim 11, wherein the statistical letter model is unsmoothed.

13. The method of claim 8, further comprising generating a final letter string based on a database lookup.

14. The method of claim 13, wherein generating the final letter string based on a database lookup further comprises using a finite state network that accepts only valid letter strings.

15. The method of claim 13, wherein receiving input comprises receiving a portion of a word.

16. The method of claim 8, further comprising, if an ASR confidence is below a predetermined level, prompting the user to enter the first three or less letters of the input by using a keypad.
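
The constrain-and-regenerate loop of claims 8 and 9 can be illustrated with a deliberately simplified lattice: one slot per letter position, each slot holding independent letter scores, so the most probable path is just the best letter in every slot. This representation, the scores, the keypad table, and the `is_correct` callback (a stand-in for user confirmation) are all assumptions for the sketch; a real recognizer lattice would carry arc dependencies and need a proper best-path search.

```python
# Standard telephone keypad layout (assumed non-speech input device).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def constrain(lattice, position, digit):
    """Return a new lattice whose slot at `position` keeps only letters on the pressed key.

    Scores are not renormalised, which does not change the best path.
    If no hypothesised letter survives, fall back to a uniform slot over the key's letters.
    """
    allowed = KEYPAD[digit]
    kept = {l: p for l, p in lattice[position].items() if l in allowed}
    if not kept:
        kept = {l: 1.0 / len(allowed) for l in allowed}
    return lattice[:position] + [kept] + lattice[position + 1:]

def best_path(lattice):
    """Most probable letter string; slots are independent, so take the best letter in each."""
    return "".join(max(slot, key=slot.get) for slot in lattice)

def spell_with_keypad(lattice, digits, is_correct):
    """Constrain the lattice one key press at a time until the string is accepted."""
    hypothesis = best_path(lattice)
    for position, digit in enumerate(digits):
        if is_correct(hypothesis):
            break
        lattice = constrain(lattice, position, digit)
        hypothesis = best_path(lattice)
    return hypothesis

# Hypothetical recognizer output for the spoken letters "B-E-D",
# which are easily confused with other letters that rhyme with them.
lattice = [
    {"d": 0.40, "b": 0.35, "p": 0.25},
    {"e": 0.70, "c": 0.30},
    {"t": 0.50, "d": 0.30, "b": 0.20},
]
print(best_path(lattice))
print(spell_with_keypad(lattice, "233", lambda s: s == "bed"))
```

Each key press removes acoustically confusable letters that do not sit on the pressed key, so the hypothesis moves toward the intended spelling as more letters are entered.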

17. A computer-readable storage medium storing instructions for controlling a computing device having a processor to recognize input from a user, the instructions comprising controlling the processor to perform the steps of:
- generating an unweighted grammar permitting all letter sequences that map to a received non-speech input;
  
  selecting a database of words;
  
  generating a weighted grammar using the unweighted grammar and a statistical letter model trained on the database of words;
  
  receiving speech from a user associated with the non-speech input after receiving the non-speech input and after generating the weighted grammar;
  
  performing recognition via automatic speech recognition (ASR) on the received speech and non-speech input using the weighted grammar; and
  
  if an ASR confidence is below a predetermined level;
  
  disambiguating possible spellings by generating a letter lattice based on a user input modality; and
  
  constraining the letter lattice and generating a new letter string of possible word spellings, with each letter received, until a letter string is correctly recognized;

18. The computer-readable storage medium of claim 17, wherein the user input modality comprises speech input devices and non-speech input devices.

19. The computer-readable storage medium of claim 17, wherein the statistical letter model is an N-gram letter model.

20. The computer-readable storage medium of claim 19, wherein the statistical letter model is unsmoothed.

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: AT&T Corporation (AT&T, Inc.)
Inventors: Parthasarathy, Sarangarajan
Granted Patent: US 7,949,528 B2
US Class Current: 704/235
CPC Class Codes:
- G06F 2203/0381   Multimodal input, i.e. inte...
- G06F 3/0237   using prediction or retriev...
- G06F 3/038   Control and interface arran...
- G10L 15/187   Phonemic context, e.g. pron...
- G10L 15/197   Probabilistic grammars, e.g...
- G10L 15/22   Procedures used during a sp...
- G10L 2015/086   Recognition of spelled words
