Systems and methods for extracting meaning from multimodal inputs using finite-state devices
Abstract
Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more of the other modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.
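The data flow in the exemplary embodiment above (gesture recognizer, then multimodal parser, then speech recognizer) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the toy grammar `MULTIMODAL_GRAMMAR`, the function names, the scores, and the simplification of a recognition lattice to a flat list of scored hypotheses. The patent title refers to finite-state devices; this sketch substitutes a plain dictionary lookup for any finite-state machinery.

```python
from typing import Dict, List, Tuple

# Hypothetical toy grammar: spoken phrases that may accompany each gesture symbol.
MULTIMODAL_GRAMMAR: Dict[str, List[str]] = {
    "point_restaurant": ["phone for this restaurant", "review for this restaurant"],
    "area_selection": ["show restaurants in this area", "zoom in here"],
}

Hypotheses = List[Tuple[str, float]]


def recognize_gesture(raw_gesture: str) -> Hypotheses:
    """Stand-in gesture recognizer; returns scored gesture hypotheses.

    A real lattice would be a graph; a flat list is used here for brevity.
    """
    if raw_gesture == "circle":
        return [("area_selection", 0.7), ("point_restaurant", 0.3)]
    return [("point_restaurant", 0.9), ("area_selection", 0.1)]


def build_language_model(gesture_lattice: Hypotheses) -> Dict[str, float]:
    """Multimodal parser step: derive phrase weights from the gesture hypotheses."""
    model: Dict[str, float] = {}
    for symbol, score in gesture_lattice:
        phrases = MULTIMODAL_GRAMMAR.get(symbol, [])
        for phrase in phrases:
            # Spread each gesture score evenly over its compatible phrases.
            model[phrase] = model.get(phrase, 0.0) + score / len(phrases)
    return model


def recognize_speech(acoustic_hyps: Hypotheses,
                     language_model: Dict[str, float],
                     floor: float = 1e-3) -> str:
    """Choose the phrase with the highest acoustic score times model weight."""
    best = max(acoustic_hyps,
               key=lambda hyp: hyp[1] * language_model.get(hyp[0], floor))
    return best[0]


# Acoustically ambiguous speech input (scores are made up).
acoustic = [("zoom in here", 0.40), ("phone for this restaurant", 0.45)]

circle_result = recognize_speech(
    acoustic, build_language_model(recognize_gesture("circle")))
tap_result = recognize_speech(
    acoustic, build_language_model(recognize_gesture("tap")))
print(circle_result)
print(tap_result)
```

With the same acoustic hypotheses, the circling gesture favors the area-related phrase and the tap favors the restaurant-related phrase, which is the sense in which the first mode's recognition result compensates the recognition of the second mode.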
23 Claims
1. Apparatus for recognizing an utterance comprising a plurality of modes, said apparatus comprising:
means for recognizing a first mode in said plurality of modes;
means for outputting a recognition result for said first mode in said plurality of modes;
means for generating a recognition model for use in recognizing a second mode in said plurality of modes, said recognition model a function of said recognition result associated with said first mode in said plurality of modes; and
means for recognizing said second mode in said plurality of modes using said recognition model.
Dependent Claims: 2, 3, 4, 5, 6
7. A recognition system for receiving and recognizing an utterance comprising a plurality of modes, the recognition system comprising:
a first mode recognition subsystem adapted to generate a first recognition result associated with a first mode in said plurality of modes;
a multimodal recognition subsystem adapted to generate a recognition model based on said first recognition result; and
a second mode recognition subsystem adapted to generate a second recognition result associated with a second mode in said plurality of modes as a function of said first recognition model.
Dependent Claims: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23
Specification