PREDICTIVE SPEECH-TO-TEXT INPUT

US 20080120102A1
Filed: 11/16/2007
Published: 05/22/2008
Est. Priority Date: 11/17/2006
Status: Active Grant

Abstract

This disclosure describes a practical system and method for predicting spoken text (a spoken word or a spoken sentence or phrase) given that text's partial spelling (for example, the initial characters of a word or sentence). The partial spelling may be given using “Speech”, entered using a keyboard or keypad, or obtained using other input methods. The disclosed system is an alternative method for inputting text into devices; it is faster (especially for long words or phrases) than existing predictive-text-input and/or word-completion methods.

11 Claims

1. A method of multi-modal text recognition, comprising:
- (a) receiving a speech waveform corresponding to text to be recognized;
  
  (b) receiving at least one letter corresponding to a portion of the text, the at least one letter being received from an unambiguous data source;
  
  (c) dynamically reducing a base lexicon into a subset search lexicon network based on the at least one letter;
  
  (d) searching the search lexicon network for a best-matching text using speech recognition techniques;
  
  (e) returning the best-matching text to a user interface for a determination whether the best-matching text is the text to be recognized; and
  
  (f) reiterating steps (b) to (e) until the best-matching text is the text to be recognized.
  - 2. The method recited in claim 1, wherein the text to be recognized is a single word.
  - 3. The method recited in claim 1, wherein the text to be recognized is a phrase.
  - 4. The method recited in claim 1, wherein the unambiguous data source comprises a mechanical input device.
  - 5. The method recited in claim 4, wherein the mechanical input device comprises a keypad, a keyboard, a pen, a stylus, and/or a touch screen.
  - 6. The method recited in claim 1, wherein the speech recognition techniques comprise a dual-pass technique.
  - 7. The method recited in claim 1, wherein the speech recognition techniques comprise acoustic pattern matching techniques.
  - 8. The method recited in claim 1, wherein dynamically reducing the active lexicon further comprises evaluating the active lexicon to eliminate entries that are inconsistent with the at least one letter.
  - 9. The method recited in claim 1, wherein the step of receiving the speech waveform occurs prior to the step of receiving the at least one letter.
  - 10. The method recited in claim 1, wherein the step of receiving the speech waveform occurs after the step of receiving the at least one letter.
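
Steps (a) through (f) of claim 1 can be read as a refinement loop: each typed letter shrinks the lexicon, the recognizer searches only what remains, and the user accepts or rejects the result. The following is a minimal sketch under stated assumptions, not the patent's implementation. All names (`reduce_lexicon`, `acoustic_score`, `recognize`, `LEXICON`) are hypothetical, the base lexicon is assumed to be a plain list of spellings, and the acoustic search is replaced by a string-similarity placeholder in which the "waveform" is a rough phonetic spelling of what was said.

```python
from difflib import SequenceMatcher

# Hypothetical base lexicon; a real system would hold far more entries.
LEXICON = ["meeting", "meaning", "medicine", "message", "morning"]


def reduce_lexicon(base_lexicon, letters):
    """Step (c): keep only entries consistent with the letters typed so far."""
    prefix = letters.lower()
    return [entry for entry in base_lexicon if entry.lower().startswith(prefix)]


def acoustic_score(waveform, entry):
    """Placeholder for step (d).

    A real recognizer would score audio against acoustic models. Here the
    'waveform' is a rough phonetic string and the score is a similarity ratio.
    """
    return SequenceMatcher(None, waveform, entry.lower()).ratio()


def recognize(waveform, base_lexicon, letter_source, is_correct):
    """Iterate steps (b) to (e) until the user accepts a candidate."""
    letters = ""
    for letter in letter_source:                       # step (b)
        letters += letter
        search_lexicon = reduce_lexicon(base_lexicon, letters)  # step (c)
        if not search_lexicon:
            return None                                # nothing matches the letters
        best = max(search_lexicon,
                   key=lambda entry: acoustic_score(waveform, entry))  # step (d)
        if is_correct(best):                           # step (e)
            return best
        # step (f): otherwise wait for the next letter and search again
    return None


if __name__ == "__main__":
    # The user says something like "meting" and starts typing "m", "e", ...
    print(recognize("meting", LEXICON, iter("meet"), lambda text: text == "meeting"))
```

The `is_correct` callback stands in for the user-interface confirmation of step (e); in an interactive system it would be driven by the user accepting or continuing to type.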

11. A system for multi-modal text recognition, comprising:
- an unambiguous data source for receiving input corresponding to text to be recognized;
  
  a base lexicon including data that describes waveforms associated with a plurality of textual entries, each entry having a proper spelling, each proper spelling having an initial letter;
  
  a lexicon reducing module configured to create a search lexicon from the base lexicon, the search lexicon including entries from the base lexicon having initial letters that correspond to input received from the unambiguous data source; and
  
  a speech recognition module configured to receiving a speech waveform corresponding to text to be recognized, and to compare that speech waveform to the search lexicon to identify one or more entries in the search lexicon that correspond to the speech waveform.
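
The components recited in claim 11 can be pictured as two cooperating modules over a shared base lexicon. The sketch below is illustrative only and makes several assumptions: the class names (`LexiconEntry`, `LexiconReducer`, `SpeechRecognizer`) are hypothetical, the "data that describes waveforms" is reduced to a short tuple of made-up feature values, and the comparison is a simple squared-distance ranking rather than any real acoustic model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    spelling: str     # the entry's proper spelling
    template: tuple   # toy stand-in for stored waveform data (assumption)


class LexiconReducer:
    """Creates a search lexicon from entries whose initial letters match the input."""

    def __init__(self, base_lexicon):
        self.base_lexicon = list(base_lexicon)

    def search_lexicon(self, typed):
        return [e for e in self.base_lexicon if e.spelling.startswith(typed)]


class SpeechRecognizer:
    """Compares an incoming waveform to the search lexicon and ranks entries."""

    @staticmethod
    def _distance(features, template):
        # Squared Euclidean distance between equal-length toy feature vectors.
        return sum((a - b) ** 2 for a, b in zip(features, template))

    def match(self, features, search_lexicon, n_best=3):
        ranked = sorted(search_lexicon,
                        key=lambda e: self._distance(features, e.template))
        return [e.spelling for e in ranked[:n_best]]


# Made-up data: "yellow" sounds close to "hello" but has a different initial letter.
BASE = [
    LexiconEntry("hello", (1.0, 0.2, 0.1)),
    LexiconEntry("help", (0.9, 0.3, 0.4)),
    LexiconEntry("hotel", (0.2, 0.8, 0.5)),
    LexiconEntry("yellow", (1.0, 0.25, 0.1)),
]

reducer = LexiconReducer(BASE)
recognizer = SpeechRecognizer()
candidates = recognizer.match((1.0, 0.22, 0.12), reducer.search_lexicon("he"), n_best=2)
```

Because the reducer runs before the recognizer, an acoustically similar entry such as "yellow" never reaches the comparison once the user has typed "he", which is the practical benefit of pairing the unambiguous input with the speech search.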

Current Assignee: Ashwin P. Rao
Original Assignee: Ashwin P. Rao
Inventors: Rao, Ashwin P.

Granted Patent: US 7,904,298 B2
US Class Current: 704/235
CPC Class Codes:
G10L 15/22 Procedures used during a sp...
H04M 2250/74 with voice recognition means
