PREDICTIVE SPEECH-TO-TEXT INPUT
First Claim
1. A method of multi-modal text recognition, comprising:
(a) receiving a speech waveform corresponding to text to be recognized;
(b) receiving at least one letter corresponding to a portion of the text, the at least one letter being received from an unambiguous data source;
(c) dynamically reducing a base lexicon into a subset search lexicon network based on the at least one letter;
(d) searching the search lexicon network for a best-matching text using speech recognition techniques;
(e) returning the best-matching text to a user interface for a determination whether the best-matching text is the text to be recognized; and
(f) reiterating steps (b) to (e) until the best-matching text is the text to be recognized.
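The loop in steps (b) to (f) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the sample `BASE_LEXICON`, the function names, and above all `acoustic_score`, which stands in for real speech recognition by comparing a noisy transcription string with each candidate spelling instead of scoring a waveform against acoustic models.

```python
from difflib import SequenceMatcher

# Hypothetical base lexicon; a real one would hold many thousands of entries.
BASE_LEXICON = ["meet", "meeting", "message", "method", "metric", "text", "telephone"]


def reduce_lexicon(base_lexicon, letters):
    """Step (c): keep only entries whose spelling starts with the letters received so far."""
    prefix = letters.lower()
    return [word for word in base_lexicon if word.lower().startswith(prefix)]


def acoustic_score(utterance, candidate):
    """Placeholder for step (d): string similarity stands in for an acoustic match score."""
    return SequenceMatcher(None, utterance, candidate).ratio()


def best_match(utterance, search_lexicon):
    """Step (d): return the best-scoring entry of the reduced lexicon, or None if it is empty."""
    if not search_lexicon:
        return None
    return max(search_lexicon, key=lambda word: acoustic_score(utterance, word))


def recognize(utterance, intended, base_lexicon):
    """Steps (b) to (f): supply one more letter per round until the guess is accepted."""
    for count in range(1, len(intended) + 1):
        letters = intended[:count]                 # step (b): simulated key presses
        search_lexicon = reduce_lexicon(base_lexicon, letters)
        guess = best_match(utterance, search_lexicon)
        if guess == intended:                      # step (e): simulated user confirmation
            return guess, count
    return None, len(intended)


if __name__ == "__main__":
    print(reduce_lexicon(BASE_LEXICON, "met"))
    print(recognize("mesage", "message", BASE_LEXICON))
```

In this sketch each additional letter shrinks the search lexicon, so the placeholder scorer has fewer competing entries to choose from on every pass.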
Abstract
This disclosure describes a practical system and method for predicting spoken text (a spoken word or a spoken sentence or phrase) given that text's partial spelling (for example, the initial characters of a word or sentence). The partial spelling may be given using speech, entered using a keyboard or keypad, or obtained using other input methods. The disclosed system is an alternative method for inputting text into devices; it is faster than existing predictive-text-input and/or word-completion methods, especially for long words or phrases.
11 Claims
11. A system for multi-modal text recognition, comprising:
an unambiguous data source for receiving input corresponding to text to be recognized;
a base lexicon including data that describes waveforms associated with a plurality of textual entries, each entry having a proper spelling, each proper spelling having an initial letter;
a lexicon reducing module configured to create a search lexicon from the base lexicon, the search lexicon including entries from the base lexicon having initial letters that correspond to input received from the unambiguous data source; and
a speech recognition module configured to receive a speech waveform corresponding to text to be recognized, and to compare that speech waveform to the search lexicon to identify one or more entries in the search lexicon that correspond to the speech waveform.
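One way to picture how the components of claim 11 fit together is the Python sketch below. It is an illustration under stated assumptions, not the claimed implementation: the class names `LexiconEntry`, `LexiconReducer` and `SpeechRecognizer`, the two-element `template` vectors standing in for waveform data, and the Euclidean-distance comparison are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    spelling: str
    template: tuple  # hypothetical feature vector standing in for stored waveform data


class LexiconReducer:
    """Creates a search lexicon from the base lexicon using the typed initial letters."""

    def __init__(self, base_lexicon):
        self.base_lexicon = list(base_lexicon)

    def reduce(self, typed_letters):
        prefix = typed_letters.lower()
        return [e for e in self.base_lexicon if e.spelling.lower().startswith(prefix)]


class SpeechRecognizer:
    """Compares input features with the search lexicon and returns the closest entries."""

    def rank(self, features, search_lexicon, top_n=3):
        # Assumption: a smaller Euclidean distance means a better acoustic match.
        ordered = sorted(search_lexicon, key=lambda e: math.dist(features, e.template))
        return ordered[:top_n]


# Hypothetical sample data.
BASE = [
    LexiconEntry("call", (0.1, 0.9)),
    LexiconEntry("cat", (0.2, 0.8)),
    LexiconEntry("car", (0.8, 0.2)),
    LexiconEntry("dog", (0.2, 0.8)),
]

if __name__ == "__main__":
    search_lexicon = LexiconReducer(BASE).reduce("c")  # input from the unambiguous source
    matches = SpeechRecognizer().rank((0.75, 0.25), search_lexicon, top_n=2)
    print([entry.spelling for entry in matches])
```

Returning a short ranked list rather than a single entry mirrors the claim's wording of identifying "one or more entries" in the search lexicon.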
Specification