MULTIMODAL INTERFACE FOR INPUT OF TEXT
Abstract
The disclosure describes a system and method for text input using a multimodal interface that combines speech recognition with text prediction. Specifically, an “always listening” mode for entering words is combined with a push-to-speak mode for entering symbols and phrases. These two modes are further combined with keypad-based text prediction. Finally, the user interface of the proposed system is designed to enhance existing standard text-input methods, thereby minimizing the behavior change required of mobile users.
336 Citations
15 Claims
1. A multimodal system, the system comprising:
- a speech detection module configured to detect audio and convert the audio to at least one potential word;
- a speech recognition module configured to accept the at least one potential word and at least one letter input, to create a reduced dictionary based on the at least one letter input, and to search the reduced dictionary for a speech recognized word based on the potential word;
- a text prediction module configured to accept the at least one letter input and predict a predicted word associated with the at least one letter input; and
- a combine module configured to accept the predicted word and the speech recognized word and to create a set of choice words based on the predicted word and the speech recognized word.
Dependent claims: 2, 3, 4

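The module chain recited in claim 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the patented implementation: the toy `DICTIONARY` and its frequencies, the function names, the use of `difflib.SequenceMatcher` as a stand-in for acoustic matching, and the choice to list the speech recognized word ahead of the predicted word. Audio detection is skipped, so the potential words are passed in as plain strings.

```python
from difflib import SequenceMatcher

# Hypothetical toy vocabulary with usage frequencies (not taken from the patent).
DICTIONARY = {"hello": 50, "help": 40, "held": 10, "yellow": 30, "world": 60}


def reduce_dictionary(letters, dictionary=DICTIONARY):
    """Create a reduced dictionary: entries that start with the typed letters."""
    prefix = letters.lower()
    return {w: f for w, f in dictionary.items() if w.startswith(prefix)}


def recognize(potential_words, letters):
    """Search the reduced dictionary for the entry closest to any potential word.

    String similarity stands in for acoustic scoring (an assumption).
    """
    reduced = reduce_dictionary(letters)
    best_word, best_score = None, 0.0
    for candidate in potential_words:
        for word in reduced:
            score = SequenceMatcher(None, candidate, word).ratio()
            if score > best_score:
                best_word, best_score = word, score
    return best_word


def predict(letters):
    """Predict the most frequent dictionary word for the typed letters."""
    reduced = reduce_dictionary(letters)
    return max(reduced, key=reduced.get) if reduced else None


def combine(predicted, recognized):
    """Merge both results into an ordered, duplicate-free set of choice words."""
    choices = []
    for word in (recognized, predicted):
        if word is not None and word not in choices:
            choices.append(word)
    return choices


def choice_words(potential_words, letters):
    """Run the recognition and prediction paths, then combine them."""
    return combine(predict(letters), recognize(potential_words, letters))
```

For example, `choice_words(["held"], "he")` yields both the recognized word and the more frequent predicted word, while a letter input with no dictionary entries yields an empty list.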
5. A computer-implemented method for multimodal text recognition, comprising:
a) detecting an utterance that is spoken based on knowledge of a first letter typed;
b) extracting a prosodic feature associated with the utterance;
c) searching a dictionary for at least one entry associated with the extracted prosodic feature, the at least one entry starting with the first letter typed;
d) determining a set of potential matches from the at least one entry;
e) combining the set of potential matches with a set of predicted matches, the set of predicted matches being determined using a text prediction technique based on the first letter typed; and
f) determining a set of choice words based on a confidence value of the set of potential matches and the set of predicted matches.
Dependent claims: 6, 7, 8, 9

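Steps (c) through (f) of claim 5 can be sketched as follows. This is only an illustration under stated assumptions: syllable count stands in for the extracted prosodic feature, the `LEXICON` contents are invented, confidence values are normalized word frequencies, and the two sets are combined by summing confidences. The claim does not specify any of these choices.

```python
# Hypothetical lexicon: word -> (syllable count, relative frequency).
# Syllable count is an assumed stand-in for the prosodic feature.
LEXICON = {
    "meet": (1, 0.30),
    "meeting": (2, 0.25),
    "message": (2, 0.20),
    "monday": (2, 0.15),
    "tomorrow": (3, 0.10),
}


def normalize(scores):
    """Scale scores so they sum to 1; return an empty dict if there are none."""
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()} if total else {}


def potential_matches(first_letter, syllables, lexicon=LEXICON):
    """Steps (c)-(d): entries with the typed first letter and matching feature."""
    return normalize({w: f for w, (n, f) in lexicon.items()
                      if w.startswith(first_letter) and n == syllables})


def predicted_matches(first_letter, lexicon=LEXICON):
    """Text prediction from the first letter alone, scored by frequency."""
    return normalize({w: f for w, (_, f) in lexicon.items()
                      if w.startswith(first_letter)})


def choice_words(first_letter, syllables, limit=3):
    """Steps (e)-(f): combine both sets and rank by summed confidence."""
    combined = dict(predicted_matches(first_letter))
    for word, conf in potential_matches(first_letter, syllables).items():
        combined[word] = combined.get(word, 0.0) + conf
    return sorted(combined, key=combined.get, reverse=True)[:limit]
```

With the typed letter "m" and a two-syllable utterance, the two-syllable entries outrank "meet" even though "meet" is the most frequent word on text prediction alone, which is the effect the combination step is meant to show.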
10. A computing device configured to provide a multimodal interface for entering text, the computing device comprising:
- a computer storage medium including computer-readable instructions;
- a processor configured by the computer-readable instructions to:
  - provide a user-interface for entering and displaying text;
  - detect audio when the user-interface is in a listening mode and convert the audio to at least one potential word;
  - recognize input of a first letter into a message window of the user-interface;
  - reduce a search dictionary based on the first letter;
  - search the search dictionary based on the potential word to determine a speech recognized word;
  - predict a predicted word based on the first letter;
  - create a set of choice words based on the predicted word and the speech recognized word; and
  - display the set of choice words in a choice window of the user-interface.
Dependent claims: 11, 12, 13, 14, 15

Specification